Author: "Zhang, Jingyu" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Zhang, Jingyu" returned 5,864 results

1. Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements

Author: Zhang, Jingyu, Elgohary, Ahmed, Magooda, Ahmed, Khashabi, Daniel, and Van Durme, Benjamin
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The current paradigm for safety alignment of large language models (LLMs) follows a one-size-fits-all approach: the model refuses to interact with any content deemed unsafe by the model provider. This approach lacks flexibility in the face of varying social norms across cultures and regions. In addition, users may have diverse safety needs, making a model with static safety standards too restrictive to be useful, as well as too costly to be re-aligned. We propose Controllable Safety Alignment (CoSA), a framework designed to adapt models to diverse safety requirements without re-training. Instead of aligning a fixed model, we align models to follow safety configs -- free-form natural language descriptions of the desired safety behaviors -- that are provided as part of the system prompt. To adjust model safety behavior, authorized users only need to modify such safety configs at inference time. To enable that, we propose CoSAlign, a data-centric method for aligning LLMs to easily adapt to diverse safety configs. Furthermore, we devise a novel controllability evaluation protocol that considers both helpfulness and configured safety, summarizing them into CoSA-Score, and construct CoSApien, a human-authored benchmark that consists of real-world LLM use cases with diverse safety requirements and corresponding evaluation prompts. We show that CoSAlign leads to substantial gains of controllability over strong baselines including in-context alignment. Our framework encourages better representation and adaptation to pluralistic human values in LLMs, thereby increasing their practicality.
Published: 2024

2. RATIONALYST: Pre-training Process-Supervision for Improving Reasoning

Author: Jiang, Dongwei, Wang, Guoxuan, Lu, Yining, Wang, Andrew, Zhang, Jingyu, Liu, Chuyu, Van Durme, Benjamin, and Khashabi, Daniel
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: The reasoning steps generated by LLMs might be incomplete, as they mimic logical leaps common in everyday communication found in their pre-training data: underlying rationales are frequently left implicit (unstated). To address this challenge, we introduce RATIONALYST, a model for process-supervision of reasoning based on pre-training on a vast collection of rationale annotations extracted from unlabeled data. We extract 79k rationales from a web-scale unlabelled dataset (the Pile) and a combination of reasoning datasets with minimal human intervention. This web-scale pre-training for reasoning allows RATIONALYST to consistently generalize across diverse reasoning tasks, including mathematical, commonsense, scientific, and logical reasoning. Fine-tuned from LLaMa-3-8B, RATIONALYST improves the accuracy of reasoning by an average of 3.9% on 7 representative reasoning benchmarks. It also demonstrates superior performance compared to significantly larger verifiers like GPT-4 and similarly sized models fine-tuned on matching training sets. Comment: Our code, data, and model can be found at this repository: https://github.com/JHU-CLSP/Rationalyst
Published: 2024

3. YOLO-PPA based Efficient Traffic Sign Detection for Cruise Control in Autonomous Driving

Author: Zhang, Jingyu, Zhang, Wenqing, Tan, Chaoyi, Li, Xiangtian, and Sun, Qianyi
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: It is very important to detect traffic signs efficiently and accurately in autonomous driving systems. However, the farther the distance, the smaller the traffic signs. Existing object detection algorithms can hardly detect these small scaled signs. In addition, the performance of embedded devices on vehicles limits the scale of detection models. To address these challenges, a YOLO PPA based traffic sign detection algorithm is proposed in this paper. The experimental results on the GTSDB dataset show that compared to the original YOLO, the proposed method improves inference efficiency by 11.2%. The mAP 50 is also improved by 93.2%, which demonstrates the effectiveness of the proposed YOLO PPA.
Published: 2024

4. Explain EEG-based End-to-end Deep Learning Models in the Frequency Domain

Author: Wang, Hanqi, Yang, Kun, Zhang, Jingyu, Chen, Tao, and Song, Liang
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: The recent rise of EEG-based end-to-end deep learning models presents a significant challenge in elucidating how these models process raw EEG signals and generate predictions in the frequency domain. This challenge limits the transparency and credibility of EEG-based end-to-end models, hindering their application in security-sensitive areas. To address this issue, we propose a mask perturbation method to explain the behavior of end-to-end models in the frequency domain. Considering the characteristics of EEG data, we introduce a target alignment loss to mitigate the out-of-distribution problem associated with perturbation operations. Additionally, we develop a perturbation generator to define perturbation generation in the frequency domain. Our explanation method is validated through experiments on multiple representative end-to-end deep learning models in the EEG decoding field, using an established EEG benchmark dataset. The results demonstrate the effectiveness and superiority of our method, and highlight its potential to advance research in EEG-based end-to-end models.
Published: 2024

5. Core: Robust Factual Precision with Informative Sub-Claim Identification

Author: Jiang, Zhengping, Zhang, Jingyu, Weir, Nathaniel, Ebner, Seth, Wanner, Miriam, Sanders, Kate, Khashabi, Daniel, Liu, Anqi, and Van Durme, Benjamin
Subjects: Computer Science - Computation and Language
Abstract: Hallucinations pose a challenge to the application of large language models (LLMs), thereby motivating the development of metrics to evaluate factual precision. We observe that popular metrics using the Decompose-Then-Verify framework, such as FActScore, can be manipulated by adding obvious or repetitive subclaims to artificially inflate scores. This observation motivates our new customizable plug-and-play subclaim selection component called Core, which filters down individual subclaims according to their uniqueness and informativeness. We show that many popular factual precision metrics augmented by Core are substantially more robust on a wide range of knowledge domains. We release an evaluation framework supporting easy and modular use of Core and various decomposition strategies, which we recommend the community adopt. We also release an expansion of the FActScore biography dataset to facilitate further studies of decomposition-based factual precision evaluation.
Published: 2024

6. Research on Edge Detection of LiDAR Images Based on Artificial Intelligence Technology

Author: Yang, Haowei, Wang, Liyang, Zhang, Jingyu, Cheng, Yu, and Xiang, Ao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: With the widespread application of Light Detection and Ranging (LiDAR) technology in fields such as autonomous driving, robot navigation, and terrain mapping, the importance of edge detection in LiDAR images has become increasingly prominent. Traditional edge detection methods often face challenges in accuracy and computational complexity when processing LiDAR images. To address these issues, this study proposes an edge detection method for LiDAR images based on artificial intelligence technology. This paper first reviews the current state of research on LiDAR technology and image edge detection, introducing common edge detection algorithms and their applications in LiDAR image processing. Subsequently, a deep learning-based edge detection model is designed and implemented, optimizing the model training process through preprocessing and enhancement of the LiDAR image dataset. Experimental results indicate that the proposed method outperforms traditional methods in terms of detection accuracy and computational efficiency, showing significant practical application value. Finally, improvement strategies are proposed for the current method's shortcomings, and the improvements are validated through experiments.
Published: 2024

7. Application of Natural Language Processing in Financial Risk Detection

Author: Wang, Liyang, Cheng, Yu, Xiang, Ao, Zhang, Jingyu, and Yang, Haowei
Subjects: Quantitative Finance - Risk Management, Computer Science - Computation and Language
Abstract: This paper explores the application of Natural Language Processing (NLP) in financial risk detection. By constructing an NLP-based financial risk detection model, this study aims to identify and predict potential risks in financial documents and communications. First, the fundamental concepts of NLP and its theoretical foundation, including text mining methods, NLP model design principles, and machine learning algorithms, are introduced. Second, the process of text data preprocessing and feature extraction is described. Finally, the effectiveness and predictive performance of the model are validated through empirical research. The results show that the NLP-based financial risk detection model performs excellently in risk identification and prediction, providing effective risk management tools for financial institutions. This study offers valuable references for the field of financial risk management, utilizing advanced NLP techniques to improve the accuracy and efficiency of financial risk detection.
Published: 2024

8. Research on the Application of Computer Vision Based on Deep Learning in Autonomous Driving Technology

Author: Zhang, Jingyu, Cao, Jin, Chang, Jinghao, Li, Xinjin, Liu, Houze, and Li, Zhenglin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: This research aims to explore the application of deep learning in autonomous driving computer vision technology and its impact on improving system performance. By using advanced technologies such as convolutional neural networks (CNN), multi-task joint learning methods, and deep reinforcement learning, this article analyzes in detail the application of deep learning in key areas: image recognition, real-time target tracking and classification, environment perception and decision support, and path planning and navigation. Research results show that the proposed system has an accuracy of over 98% in image recognition, target tracking and classification, and also demonstrates efficient performance and practicality in environmental perception and decision support, path planning and navigation. The conclusion points out that deep learning technology can significantly improve the accuracy and real-time response capabilities of autonomous driving systems. Although there are still challenges in environmental perception and decision support, with the advancement of technology, it is expected to achieve wider applications and greater potential in the future.
Published: 2024

9. Role of land-ocean interactions in stepwise Northern Hemisphere Glaciation.

Author: Zhong, Yi, Tan, Ning, Abell, Jordan, Sun, Chijun, Kaboth-Bahr, Stefanie, Ford, Heather, Herbert, Timothy, Pullen, Alex, Horikawa, Keiji, Yu, Jimin, Struve, Torben, Weber, Michael, Clift, Peter, Larrasoaña, Juan, Lu, Zhengyao, Yang, Hu, Bahr, André, Chen, Tianyu, Zhang, Jingyu, Wei, Cao, Xia, Wenyue, Yang, Sheng, and Liu, Qingsong
Abstract: The investigation of triggers causing the onset and intensification of Northern Hemisphere Glaciation (NHG) during the late Pliocene is essential for understanding the global climate system, with important implications for projecting future climate changes. Despite their critical roles in the global climate system, influences of land-ocean interactions on high-latitude ice sheets remain largely unexplored. Here, we present a high-resolution Asian dust record from Ocean Drilling Program Site 1208 in the North Pacific, which lies along the main route of the westerlies. Our data indicate that atmosphere-land-ocean interactions affected aeolian dust emissions through modulating moisture and vegetation in dust source regions, highlighting a critical role of terrestrial systems in initiating the NHG as early as 3.6 Myr ago. Combined with additional multi-proxy and model results, we further show that westerly wind strength was enhanced, mainly at low-to-middle tropospheric levels, during major glacial events at about 3.3 and 2.7 Myr ago. We suggest that coupled responses of Earth's surface dynamics and atmospheric circulation in the Plio-Pleistocene likely involved feedbacks related to changes in paleogeography, ocean circulation, and global climate.
Published: 2024

10. DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation

Author: Tan, Weiting, Zhang, Jingyu, Shen, Lingfeng, Khashabi, Daniel, and Koehn, Philipp
Subjects: Computer Science - Computation and Language
Abstract: Non-autoregressive Transformers (NATs) have recently been applied in direct speech-to-speech translation systems, which convert speech across different languages without intermediate text data. Although NATs generate high-quality outputs and offer faster inference than autoregressive models, they tend to produce incoherent and repetitive results due to complex data distribution (e.g., acoustic and linguistic variations in speech). In this work, we introduce DiffNorm, a diffusion-based normalization strategy that simplifies data distributions for training NAT models. After training with a self-supervised noise estimation objective, DiffNorm constructs normalized target data by denoising synthetically corrupted speech features. Additionally, we propose to regularize NATs with classifier-free guidance, improving model robustness and translation quality by randomly dropping out source information during training. Our strategies result in a notable improvement of about +7 ASR-BLEU for English-Spanish (En-Es) and +2 ASR-BLEU for English-French (En-Fr) translations on the CVSS benchmark, while attaining over 14x speedup for En-Es and 5x speedup for En-Fr translations compared to autoregressive baselines. Comment: Accepted at NeurIPS 2024
Published: 2024

11. Research on Credit Risk Early Warning Model of Commercial Banks Based on Neural Network Algorithm

Author: Cheng, Yu, Yang, Qin, Wang, Liyang, Xiang, Ao, and Zhang, Jingyu
Subjects: Quantitative Finance - Risk Management, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: In the realm of globalized financial markets, commercial banks are confronted with an escalating magnitude of credit risk, thereby imposing heightened requisites upon the security of bank assets and financial stability. This study harnesses advanced neural network techniques, notably the Backpropagation (BP) neural network, to pioneer a novel model for preempting credit risk in commercial banks. The discourse initially scrutinizes conventional financial risk preemptive models, such as ARMA, ARCH, and Logistic regression models, critically analyzing their real-world applications. Subsequently, the exposition elaborates on the construction process of the BP neural network model, encompassing network architecture design, activation function selection, parameter initialization, and objective function construction. Through comparative analysis, the superiority of neural network models in preempting credit risk in commercial banks is elucidated. The experimental segment selects specific bank data, validating the model's predictive accuracy and practicality. Research findings evince that this model efficaciously enhances the foresight and precision of credit risk management.
Published: 2024

12. Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Author: Li, Zhenglin, Guan, Bo, Wei, Yuanzhou, Zhou, Yiming, Zhang, Jingyu, and Xu, Jinxin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing
Abstract: Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.
Published: 2024

13. Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics

Author: Xiang, Ao, Zhang, Jingyu, Yang, Qin, Wang, Liyang, and Cheng, Yu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: With the development and widespread application of digital image processing technology, image splicing has become a common method of image manipulation, raising numerous security and legal issues. This paper introduces a new splicing image detection algorithm based on the statistical characteristics of natural images, aimed at improving the accuracy and efficiency of splicing image detection. By analyzing the limitations of traditional methods, we have developed a detection framework that integrates advanced statistical analysis techniques and machine learning methods. The algorithm has been validated using multiple public datasets, showing high accuracy in detecting spliced edges and locating tampered areas, as well as good robustness. Additionally, we explore the potential applications and challenges faced by the algorithm in real-world scenarios. This research not only provides an effective technological means for the field of image tampering detection but also offers new ideas and methods for future related research.
Published: 2024

14. Research on Detection of Floating Objects in River and Lake Based on AI Intelligent Image Recognition

Author: Zhang, Jingyu, Xiang, Ao, Cheng, Yu, Yang, Qin, and Wang, Liyang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: With the rapid advancement of artificial intelligence technology, AI-enabled image recognition has emerged as a potent tool for addressing challenges in traditional environmental monitoring. This study focuses on the detection of floating objects in river and lake environments, exploring an innovative approach based on deep learning. By intricately analyzing the technical pathways for detecting static and dynamic features and considering the characteristics of river and lake debris, a comprehensive image acquisition and processing workflow has been developed. The study highlights the application and performance comparison of three mainstream deep learning models (SSD, Faster-RCNN, and YOLOv5) in debris identification. Additionally, a detection system for floating objects has been designed and implemented, encompassing both hardware platform construction and software framework development. Through rigorous experimental validation, the proposed system has demonstrated its ability to significantly enhance the accuracy and efficiency of debris detection, thus offering a new technological avenue for water quality monitoring in rivers and lakes.
Published: 2024

15. SELF-[IN]CORRECT: LLMs Struggle with Discriminating Self-Generated Responses

Author: Jiang, Dongwei, Zhang, Jingyu, Weller, Orion, Weir, Nathaniel, Van Durme, Benjamin, and Khashabi, Daniel
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Can LLMs consistently improve their previous outputs for better results? For this to be true, LLMs would need to be better at discriminating among previously-generated alternatives than at generating initial responses. We explore the validity of this hypothesis in practice. We first formulate a unified framework that allows us to compare the generative and discriminative capability of any model on any task. In our resulting experimental analysis of several open-source and industrial LLMs, we observe that models are not reliably better at discriminating among previously-generated alternatives than generating initial responses. This finding challenges the notion that LLMs may be able to enhance their performance only through their own judgment.
Published: 2024

16. Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data

Author: Zhang, Jingyu, Marone, Marc, Li, Tianjian, Van Durme, Benjamin, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language
Abstract: To trust the fluent generations of large language models (LLMs), humans must be able to verify their correctness against trusted, external sources. Recent efforts, such as providing citations via retrieved documents or post-hoc provenance, enhance verifiability but provide no guarantees on their correctness. To address these limitations, we tackle the verifiability goal with a different philosophy: trivializing the verification process by developing models that quote verbatim statements from trusted sources in their pre-training data. We propose Quote-Tuning, which demonstrates the feasibility of aligning models to quote. The core of Quote-Tuning is a fast membership inference function that efficiently verifies text against trusted corpora. We leverage this tool to design a reward function to quantify quotes in model responses, and curate datasets for preference learning. Experiments show that Quote-Tuning significantly increases verbatim quotes from high-quality documents by up to 130% relative to base models while maintaining response quality. Quote-Tuning is applicable in different tasks, generalizes to out-of-domain data and diverse model families, and provides additional benefits to truthfulness. Our method not only serves as a hassle-free method to increase quoting but also opens up avenues for improving LLM trustworthiness through better verifiability.
Published: 2024

17. Advanced Feature Manipulation for Enhanced Change Detection Leveraging Natural Language Models

Author: Li, Zhenglin, Huang, Yangchen, Zhu, Mengran, Zhang, Jingyu, Chang, JingHao, and Liu, Houze
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Change detection is a fundamental task in computer vision that processes a bi-temporal image pair to differentiate between semantically altered and unaltered regions. Large language models (LLMs) have been utilized in various domains for their exceptional feature extraction capabilities and have shown promise in numerous downstream applications. In this study, we harness the power of a pre-trained LLM, extracting feature maps from extensive datasets, and employ an auxiliary network to detect changes. Unlike existing LLM-based change detection methods that solely focus on deriving high-quality feature maps, our approach emphasizes the manipulation of these feature maps to enhance semantic relevance. Comment: This version is not our full version based on our new progress, related data, and methodology we are dealing with, and based on the rules and the laws, we are adjusting our current version
Published: 2024

18. Tur[k]ingBench: A Challenge Benchmark for Web Agents

Author: Xu, Kevin, Kordi, Yeganeh, Nayak, Tanay, Asija, Ado, Wang, Yizhong, Sanders, Kate, Byerly, Adam, Zhang, Jingyu, Van Durme, Benjamin, and Khashabi, Daniel
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Human-Computer Interaction
Abstract: Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks are often found on crowdsourcing platforms, where crowdworkers engage in challenging micro-tasks within web-based environments. Building on this idea, we present TurkingBench, a benchmark consisting of tasks presented as web pages with textual instructions and multi-modal contexts. Unlike previous approaches that rely on artificially synthesized web pages, our benchmark uses natural HTML pages originally designed for crowdsourcing workers to perform various annotation tasks. Each task's HTML instructions are instantiated with different values derived from crowdsourcing tasks, creating diverse instances. This benchmark includes 32.2K instances spread across 158 tasks. To support the evaluation of TurkingBench, we have developed a framework that links chatbot responses to actions on web pages (e.g., modifying a text box, selecting a radio button). We assess the performance of cutting-edge private and open-source models, including language-only and vision-language models (such as GPT4 and InternVL), on this benchmark. Our results show that while these models outperform random chance, there is still significant room for improvement. We hope that this benchmark will drive progress in the evaluation and development of web-based agents.
Published: 2024

19. In-situ construction of MnCO3@CNTs nanosheets for high-capacity aqueous zinc ion batteries

Author: Li, Tao, Dai, GeLiang, Liu, SiYu, Zhang, JingYu, and Sun, AoKui
Published: 2024
Full Text: View/download PDF

20. The landscape of programmed cell death-related lncRNAs in Alzheimer’s disease and Parkinson’s disease

Author: Zhao, Ning, Wang, Junyi, Huang, Shan, Zhang, Jingyu, Bao, Jin, Ni, Haisen, Gao, Xinhang, and Zhang, Chunlong
Published: 2024
Full Text: View/download PDF

21. Dynamic analysis of planar four-bar mechanism with clearance in microgravity environment

Author: Ren, Jiechao, Zhang, Jingyu, and Wei, Qiang
Published: 2024
Full Text: View/download PDF

22. Higher cefazolin concentrations in synovial fluid with intraosseous regional prophylaxis in knee arthroplasty: a randomized controlled trial

Author: Zhang, Jingyu, Chen, Guangxiang, Yu, Xiao, Liu, Yubo, Li, Zhiqiang, Zhang, Xiangxin, Zhong, Qiao, and Xu, Renjie
Published: 2024
Full Text: View/download PDF

23. k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

Author: Hou, Abe Bohan, Zhang, Jingyu, Wang, Yichen, Khashabi, Daniel, and He, Tianxing
Subjects: Computer Science - Computation and Language, Computer Science - Cryptography and Security, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: Recent watermarked generation algorithms inject detectable signatures during language generation to facilitate post-hoc detection. While token-level watermarks are vulnerable to paraphrase attacks, SemStamp (Hou et al., 2023) applies watermark on the semantic representation of sentences and demonstrates promising robustness. SemStamp employs locality-sensitive hashing (LSH) to partition the semantic space with arbitrary hyperplanes, which results in a suboptimal tradeoff between robustness and speed. We propose k-SemStamp, a simple yet effective enhancement of SemStamp, utilizing k-means clustering as an alternative to LSH to partition the embedding space with awareness of inherent semantic structure. Experimental results indicate that k-SemStamp saliently improves its robustness and sampling efficiency while preserving the generation quality, advancing a more effective tool for machine-generated text detection. Comment: Accepted to ACL 24 Findings
Published: 2024

24. The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

Author: Shen, Lingfeng, Tan, Weiting, Chen, Sihao, Chen, Yunmo, Zhang, Jingyu, Xu, Haoran, Zheng, Boyuan, Koehn, Philipp, and Khashabi, Daniel
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages, we observe that (1) LLMs tend to generate unsafe responses much more often when a malicious prompt is written in a lower-resource language, and (2) LLMs tend to generate more irrelevant responses to malicious prompts in lower-resource languages. To understand where the discrepancy can be attributed, we study the effect of instruction tuning with reinforcement learning from human feedback (RLHF) or supervised finetuning (SFT) on the HH-RLHF dataset. Surprisingly, while training with high-resource languages improves model alignment, training in lower-resource languages yields minimal improvement. This suggests that the bottleneck of cross-lingual alignment is rooted in the pretraining stage. Our findings highlight the challenges in cross-lingual LLM safety, and we hope they inform future research in this direction.
Published: 2024

25. Enhancing network security with information-guided-enhanced Runge Kutta feature selection for intrusion detection

Author: Yuan, Li, Tian, Xiongjun, Yuan, Jiacheng, Zhang, Jingyu, Dai, Xiaojing, Heidari, Ali Asghar, Chen, Huiling, and Yu, Sudan
Published: 2024
Full Text: View/download PDF

26. Association of Domestic Water Hardness with All-Cause and Cause-Specific Cancers: Evidence from 447,996 UK Biobank Participants

Author: Yang, Hongxi, Wang, Qi, Zhang, Shuquan, Zhang, Jingyu, Zhang, Yuan, and Feng, Jiangtao
Subjects: Drinking water -- Contamination, Cancer -- Diagnosis -- Care and treatment, Epidemiology -- Analysis, Environmental issues, Health, Diagnosis, Care and treatment, Analysis, Contamination, Health aspects, Environmental aspects
Abstract: BACKGROUND: Accumulating evidence suggests that domestic water hardness is linked to health outcomes, but its association with all-cause and cause-specific cancers warrants investigation. OBJECTIVE: The aim of this study was to investigate the association of domestic hard water with all-cause and cause-specific cancers. METHODS: In the prospective cohort study, a total of 447,996 participants from UK Biobank who were free of cancer at baseline were included and followed up for 16 y. All-cause and 22 common cause-specific cancer diagnoses were ascertained using hospital inpatient records and self-reported data until 30 November 2022. Domestic water hardness, measured by CaCO3 concentrations, was obtained from the local water supply companies across England, Scotland, and Wales in 2005. Data were analyzed using Cox proportional hazard models, with adjustments for known measured confounders, including demographic, socioeconomic, clinical, biochemical, lifestyle, and environmental factors. RESULTS: Over a median follow-up of 13.6 y (range: 12.7-14.4 y), 58,028 all-cause cancer events were documented. A U-shaped relationship between domestic water hardness and all-cause cancers was observed (p for nonlinearity 60-120 mg/L), 0.88 (95% CI: 0.84, 0.91) for those exposed to hard water (>120-180 mg/L) and 1.06 (95% CI: 1.04, 1.08) for those exposed to very hard water (>180 mg/L). Additionally, domestic water hardness was associated with 11 of 22 cause-specific cancers, including cancers of the esophagus, stomach, colorectal tract, lung, breast, prostate, and bladder, as well as non-Hodgkin lymphoma, multiple myeloma, malignant melanoma, and hematological malignancies. Moreover, we observed a positive linear relationship between water hardness and bladder cancer. DISCUSSION: Our findings suggest that domestic water hardness was associated with all-cause and multiple cause-specific cancers. Findings from the UK Biobank support a potentially beneficial association between hard water and the incidence of all-cause cancer. However, very hard water may increase the risk of all-cause cancer. https://doi.org/10.1289/EHP13606. Introduction: Cancer is a leading cause of mortality worldwide; in 2020, nearly 10 million deaths were due to cancer, accounting for approximately one-sixth of all deaths. (1-3) It is projected [...]
Published: 2024
Full Text: View/download PDF

27. Correlation between omega-3 intake and the incidence of diabetic retinopathy based on NHANES from 2005 to 2008

Author: Zhang, Jingyu, Li, Huangdong, Deng, Qian, Huang, Amy Michelle, Qiu, Wangjian, Wang, Li, Xiang, Zheng, Yang, Ruiming, Liang, Jiamian, and Liu, Zhiping
Published: 2024
Full Text: View/download PDF

28. Patterns of participation and performance at the class level in English online education: A longitudinal cluster analysis of online K-12 after-school education in China

Author: Wang, Fei, Zhu, Xiaopeng, Pi, Lingli, Xiao, Xingyao, and Zhang, Jingyu
Published: 2024
Full Text: View/download PDF

29. A single-wavelength laser relaxation spectroscopy-based machine learning solution for apple mechanical damage detection

Author: Lian, Junbo, Zhang, Jingyu, Liu, Quan, Zhu, Runhao, Ning, Jingyuan, Xiong, Siyi, Hui, Guohua, Gao, Yuanyuan, and Lou, Xiongwei
Published: 2024
Full Text: View/download PDF

30. Wide-Angle Broadband Solar Absorber Based on Multilayer Etched Toroidal Structure

Author: Zhang, Zuoxin, Feng, Hengli, Wang, Jincheng, Liu, Chang, Fang, Dongchao, Wang, Guan, Zhang, Jingyu, Ran, Lingling, and Gao, Yang
Published: 2024
Full Text: View/download PDF

31. On the Zero-Shot Generalization of Machine-Generated Text Detectors

Author: Pu, Xiao, Zhang, Jingyu, Han, Xiaochuang, Tsvetkov, Yulia, and He, Tianxing
Subjects: Computer Science - Computation and Language
Abstract: The rampant proliferation of large language models, fluent enough to generate text indistinguishable from human-written language, gives unprecedented importance to the detection of machine-generated text. This work is motivated by an important research question: How will the detectors of machine-generated text perform on outputs of a new generator, that the detectors were not trained on? We begin by collecting generation data from a wide range of LLMs, and train neural detectors on data from each generator and test its performance on held-out generators. While none of the detectors can generalize to all generators, we observe a consistent and interesting pattern that the detectors trained on data from a medium-size LLM can zero-shot generalize to the larger version. As a concrete application, we demonstrate that robust detectors can be built on an ensemble of training data from medium-sized models.
Published: 2023

32. SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation

Author: Hou, Abe Bohan, Zhang, Jingyu, He, Tianxing, Wang, Yichen, Chuang, Yung-Sung, Wang, Hongwei, Shen, Lingfeng, Van Durme, Benjamin, Khashabi, Daniel, and Tsvetkov, Yulia
Subjects: Computer Science - Computation and Language
Abstract: Existing watermarking algorithms are vulnerable to paraphrase attacks because of their token-level design. To address this issue, we propose SemStamp, a robust sentence-level semantic watermarking algorithm based on locality-sensitive hashing (LSH), which partitions the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by an LLM, and conducts sentence-level rejection sampling until the sampled sentence falls in watermarked partitions in the semantic embedding space. A margin-based constraint is used to enhance its robustness. To show the advantages of our algorithm, we propose a "bigram" paraphrase attack using the paraphrase that has the fewest bigram overlaps with the original sentence. This attack is shown to be effective against the existing token-level watermarking method. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on both common and bigram paraphrase attacks, but also is better at preserving the quality of generation.
Comment: Accepted to NAACL 24 Main
Published: 2023

33. Cross-modal and Cross-domain Knowledge Transfer for Label-free 3D Segmentation

Author: Zhang, Jingyu, Yang, Huitong, Wu, Dai-Jie, Keung, Jacky, Li, Xuesong, Zhu, Xinge, and Ma, Yuexin
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Current state-of-the-art point cloud-based perception methods usually rely on large-scale labeled data, which requires expensive manual annotations. A natural option is to explore the unsupervised methodology for 3D perception tasks. However, such methods often face substantial performance-drop difficulties. Fortunately, we found that there exist amounts of image-based datasets and an alternative can be proposed, i.e., transferring the knowledge in the 2D images to 3D point clouds. Specifically, we propose a novel approach for the challenging cross-modal and cross-domain adaptation task by fully exploring the relationship between images and point clouds and designing effective feature alignment strategies. Without any 3D labels, our method achieves state-of-the-art performance for 3D point cloud semantic segmentation on SemanticKITTI by using the knowledge of KITTI360 and GTA5, compared to existing unsupervised and weakly-supervised baselines.
Comment: 12 pages, 4 figures, accepted
Published: 2023

34. SimAC: simulating agile collaboration to generate acceptance criteria in user story elaboration

Author: Li, Yishu, Keung, Jacky, Yang, Zhen, Ma, Xiaoxue, Zhang, Jingyu, and Liu, Shuo
Published: 2024
Full Text: View/download PDF

35. A Multi-Strategy Computer-Assisted EFL Writing Learning System with Deep Learning Incorporated and Its Effects on Learning: A Writing Feedback Perspective

Author: Chen, Binbin, Bao, Lina, Zhang, Rui, Zhang, Jingyu, Liu, Feng, Wang, Shuai, and Li, Mingjiang
Abstract: Language learning has increasingly benefited from Computer-Assisted Language Learning (CALL) technologies, especially with Artificial Intelligence involved in recent years. CALL in writing learning acknowledged as the core of language learning is being realized by technologies like Automated Writing Evaluation (AWE), and Automated Essay Scoring (AES), which have developed considerably in both computer and language education fields. AWE has effectively enhanced EFL students' writing performance to some extent, but such technology can only provide an evaluation in the form of scores, the majority of which are based on holistic scoring, resulting in the inability to provide comprehensive and detailed content-based feedback. In order to provide not only the writing multiple trait-specific evaluation scores, but also detailed writing feedback, we proposed a computer-assisted EFL writing learning system incorporating the neural network models and a couple of semantic-based NLP techniques, MsCAEWL, which fully meets the requirements of writing feedback theory, i.e., multiple, continuous, timely, clear, and multi-aspect guidance interactive feedback. The results of comparison experiments with the AWE baseline models and human raters demonstrated the superiority and the high correlation contained by the proposed system. The independent-sample t-test and paired-sample t-test results of the experiments on MsCAEWL effect validation suggested the significant impact of our proposed system in enhancing students' EFL writing proficiency.
Published: 2024
Full Text: View/download PDF

36. Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception

Author: Yang, Kun, Yang, Dingkang, Zhang, Jingyu, Li, Mingcheng, Liu, Yang, Liu, Jing, Wang, Hanqi, Sun, Peng, and Song, Liang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-agent collaborative perception as a potential application for vehicle-to-everything communication could significantly improve the perception performance of autonomous vehicles over single-agent perception. However, several challenges remain in achieving pragmatic information sharing in this emerging research. In this paper, we propose SCOPE, a novel collaborative perception framework that aggregates the spatio-temporal awareness characteristics across on-road agents in an end-to-end manner. Specifically, SCOPE has three distinct strengths: i) it considers effective semantic cues of the temporal context to enhance current representations of the target agent; ii) it aggregates perceptually critical spatial information from heterogeneous agents and overcomes localization errors via multi-scale feature interactions; iii) it integrates multi-source representations of the target agent based on their complementary contributions by an adaptive fusion paradigm. To thoroughly evaluate SCOPE, we consider both real-world and simulated scenarios of collaborative 3D object detection tasks on three datasets. Extensive experiments demonstrate the superiority of our approach and the necessity of the proposed components.
Comment: Accepted by ICCV 2023
Published: 2023

37. Cuproptosis-related lncRNA JPX regulates malignant cell behavior and epithelial-immune interaction in head and neck squamous cell carcinoma via miR-193b-3p/PLAU axis

Author: Sun, Mouyuan, Zhan, Ning, Yang, Zhan, Zhang, Xiaoting, Zhang, Jingyu, Peng, Lianjie, Luo, Yaxian, Lin, Lining, Lou, Yiting, You, Dongqi, Qiu, Tao, Liu, Zhichao, Wang, Qianting, Liu, Yu, Sun, Ping, Yu, Mengfei, and Wang, Huiming
Published: 2024
Full Text: View/download PDF

38. Correlated tunneling in high-order above threshold dissociative ionization of H2

Author: Hao, Xiaolei, Wang, Junping, Zhang, Zhaohan, Qin, Jiarui, Shu, Zheng, Li, Chan, Zhang, Jingyu, Li, Weidong, He, Feng, and Chen, Jing
Published: 2024
Full Text: View/download PDF

39. An electronic patient-reported outcome symptom monitor: the Chinese experience with rapid development of a ready-to-go symptom monitor

Author: Zhang, Jingyu, Guo, Qing, Chen, Jiaojiao, Liu, Yajie, Kang, Dan, Xiang, Rumei, Shi, Jiaheng, Yang, Jinliang, Tang, Xiaojun, Nie, Yuxian, Qiu, Jingfu, Wang, Xu, Yang, Zhu, Liu, Jie, and Shi, Qiuling
Published: 2024
Full Text: View/download PDF

40. Volatile-mediated interspecific plant interaction promotes root colonization by beneficial bacteria via induced shifts in root exudation

Author: Zhou, Xingang, Zhang, Jingyu, Shi, Jibo, Khashi u Rahman, Muhammad, Liu, Hongwei, Wei, Zhong, Wu, Fengzhi, and Dini-Andreote, Francisco
Published: 2024
Full Text: View/download PDF

41. Nr-CWS regulates METTL3-mediated m6A modification of CDS2 mRNA in vascular endothelial cells and has prognostic significance

Author: Zhang, Jingyu, Chen, Feifei, Wei, Wuhan, Ning, Qianqian, Zhu, Dong, Fan, Jiang, Wang, Haoyu, Wang, Jian, Zhang, Aijun, Jin, Peisheng, and Li, Qiang
Published: 2024
Full Text: View/download PDF

42. Global burden of thyroid cancer from 1990 to 2021: a systematic analysis from the Global Burden of Disease Study 2021

Author: Zhou, Tianjiao, Wang, Xiaoting, Zhang, Jingyu, Zhou, Enhui, Xu, Chen, Shen, Ying, Zou, Jianyin, Lu, Wen, Su, Kaiming, Huang, Weijun, Yi, Hongliang, and Yin, Shankai
Published: 2024
Full Text: View/download PDF

43. Effect of fly ash and curing temperature on the properties of magnesium phosphate repair mortar

Author: Liu, Junxia, Zhang, Jingyu, Li, Anbang, Xia, Xiaomin, and Chen, Junpeng
Published: 2024
Full Text: View/download PDF

44. Shortness of breath on the day of discharge: an early alert for post-discharge complications in patients undergoing lung cancer surgery

Author: Kang, Dan, Lei, Cheng, Zhang, Yong, Wei, Xing, Dai, Wei, Xu, Wei, Zhang, Jingyu, Yu, Qingsong, Su, Xueyao, Huang, Yanyan, and Shi, Qiuling
Published: 2024
Full Text: View/download PDF

45. Sex dimorphism of IL-17-secreting peripheral blood mononuclear cells in ankylosing spondylitis based on bioinformatics analysis and machine learning

Author: Li, Sifang, Chao, Hua, Li, Zihao, Chen, Siwen, Zhang, Jingyu, Hao, Wenjun, Zhang, Shuai, Liu, Caijun, and Liu, Hui
Published: 2024
Full Text: View/download PDF

46. HMMF: a hybrid multi-modal fusion framework for predicting drug side effect frequencies

Author: Liu, Wuyong, Zhang, Jingyu, Qiao, Guanyu, Bian, Jilong, Dong, Benzhi, and Li, Yang
Published: 2024
Full Text: View/download PDF

47. A Georeferenced Dataset for Mapping and Assessing Subgrade Defects in China’s High-Speed Railways

Author: Wang, Jinchen, Wang, Luqi, Zhang, Yinsheng, Zhang, Jingyu, Li, Jianlin, and Li, Sen
Published: 2024
Full Text: View/download PDF

48. Multi-defect risk assessment in high-speed rail subgrade infrastructure in China

Author: Wang, Jinchen, Zhang, Yinsheng, Wang, Luqi, Sun, Yifan, Zhang, Jingyu, Li, Jianlin, and Li, Sen
Published: 2024
Full Text: View/download PDF

49. Analytical evaluation of circulating tumor DNA sequencing assays

Author: Li, Wenjin, Huang, Xiayu, Patel, Rajesh, Schleifman, Erica, Fu, Shijing, Shames, David S., and Zhang, Jingyu
Published: 2024
Full Text: View/download PDF

50. PXMP4 promotes gastric cancer cell epithelial-mesenchymal transition via the PI3K/AKT signaling pathway

Author: Li, Wei, Dong, Xiangyang, Wan, Zhidan, Wang, Wenxin, Zhang, Jingyu, Mi, Yongrun, Li, Ruiyuan, Xu, Zishan, Wang, Beixi, Li, Na, and He, Guoyang
Published: 2024
Full Text: View/download PDF
