Search Results
94,520 results for "ZHANG, Xiao"
2. Dense halves in balanced 2-partition of K4-free graphs
- Author
- Xu, Yue and Zhang, Xiao-Dong
- Subjects
- Mathematics - Combinatorics; 05C35, 05C69
- Abstract
A balanced 2-partition of a graph $G$ is a bipartition $A,A^c$ of $V(G)$ such that $|A|=|A^c|$. Balogh, Clemen, and Lidick\'y conjectured that every $K_4$-free graph on $n$ (even) vertices admits a balanced 2-partition $A,A^c$ with $\max\{e(A),e(A^c)\}\leq n^2/16$. In this paper, we present a family of counterexamples to the conjecture and provide a new upper bound ($0.074n^2$) for every sufficiently large even integer $n$., Comment: 18 pages, 3 figures
- Published
- 2024
3. Magnifier: Detecting Network Access via Lightweight Traffic-based Fingerprints
- Author
- Li, Wenhao, Wang, Qiang, Bao, Huaifeng, Zhang, Xiao-Yu, Ying, Lingyun, and Li, Zhaoxuan
- Subjects
- Computer Science - Networking and Internet Architecture; Computer Science - Cryptography and Security
- Abstract
Network access detection plays a crucial role in global network management, enabling efficient network monitoring and topology measurement by identifying unauthorized network access and gathering detailed information about mobile devices. Existing methods for endpoint-based detection primarily rely on deploying monitoring software to recognize network connections. However, the challenges associated with developing and maintaining such systems have limited their universality and coverage in practical deployments, especially given the cost implications of covering a wide array of devices with heterogeneous operating systems. To tackle the issues, we propose Magnifier for mobile device network access detection that, for the first time, passively infers access patterns from backbone traffic at the gateway level. Magnifier's foundation is the creation of device-specific access patterns using the innovative Domain Name Forest (dnForest) fingerprints. We then employ a two-stage distillation algorithm to fine-tune the weights of individual Domain Name Trees (dnTree) within each dnForest, emphasizing the unique device fingerprints. With these meticulously crafted fingerprints, Magnifier efficiently infers network access from backbone traffic using a lightweight fingerprint matching algorithm. Our experimental results, conducted in real-world scenarios, demonstrate that Magnifier exhibits exceptional universality and coverage in both initial and repetitive network access detection in real-time. To facilitate further research, we have thoughtfully curated the NetCess2023 dataset, comprising network access data from 26 different models across 7 brands, covering the majority of mainstream mobile devices. We have also made both the Magnifier prototype and the NetCess2023 dataset publicly available\footnote{https://github.com/SecTeamPolaris/Magnifier}.
- Published
- 2024
4. Trigger$^3$: Refining Query Correction via Adaptive Model Selector
- Author
- Zhang, Kepu, Sun, Zhongxiang, Zhang, Xiao, Zang, Xiaoxue, Zheng, Kai, Song, Yang, and Xu, Jun
- Subjects
- Computer Science - Computation and Language
- Abstract
In search scenarios, user experience can be hindered by erroneous queries due to typos, voice errors, or knowledge gaps. Therefore, query correction is crucial for search engines. Current correction models, usually small models trained on specific data, often struggle with queries beyond their training scope or those requiring contextual understanding. While the advent of Large Language Models (LLMs) offers a potential solution, they are still limited by their pre-training data and inference cost, particularly for complex queries, making them not always effective for query correction. To tackle these, we propose Trigger$^3$, a large-small model collaboration framework that integrates the traditional correction model and LLM for query correction, capable of adaptively choosing the appropriate correction method based on the query and the correction results from the traditional correction model and LLM. Trigger$^3$ first employs a correction trigger to filter out correct queries. Incorrect queries are then corrected by the traditional correction model. If this fails, an LLM trigger is activated to call the LLM for correction. Finally, for queries that no model can correct, a fallback trigger decides to return the original query. Extensive experiments demonstrate Trigger$^3$ outperforms correction baselines while maintaining efficiency.
- Published
- 2024
5. Retrieval-Augmented Semantic Parsing: Using Large Language Models to Improve Generalization
- Author
- Zhang, Xiao, Meng, Qianru, and Bos, Johan
- Subjects
- Computer Science - Computation and Language
- Abstract
Open-domain semantic parsing remains a challenging task, as models often rely on heuristics and struggle to handle unseen concepts. In this paper, we investigate the potential of large language models (LLMs) for this task and introduce Retrieval-Augmented Semantic Parsing (RASP), a simple yet effective approach that integrates external lexical knowledge into the parsing process. Our experiments not only show that LLMs outperform previous encoder-decoder baselines for semantic parsing, but that RASP further enhances their ability to predict unseen concepts, nearly doubling the performance of previous models on out-of-distribution concepts. These findings highlight the promise of leveraging large language models and retrieval mechanisms for robust and open-domain semantic parsing., Comment: Submitted to ARR
- Published
- 2024
6. Reconciling Human Development and Giant Panda Protection Goals: Cost-efficiency Evaluation of Farmland Reverting and Energy Substitution Programs in Wolong National Reserve
- Author
- Liu, Keyi, Chen, Yufeng, Xu, Liyan, Zhang, Xiao, Wang, Zilin, Li, Hailong, Yang, Yansheng, You, Hong, and Li, Dihua
- Subjects
- Computer Science - Computers and Society
- Abstract
Balancing human development with conservation necessitates ecological policies that optimize outcomes within limited budgets, highlighting the importance of cost-efficiency and local impact analysis. This study employs the Socio-Econ-Ecosystem Multipurpose Simulator (SEEMS), an Agent-Based Model (ABM) designed for simulating small-scale Coupled Human and Nature Systems (CHANS), to evaluate the cost-efficiency of two major ecology conservation programs: Grain-to-Green (G2G) and Firewood-to-Electricity (F2E). Focusing on China Wolong National Reserve, a worldwide hot spot for flagship species conservation, the study evaluates the direct benefits of these programs, including reverted farmland area and firewood consumption, along with their combined indirect benefits on habitat quality, carbon emissions, and gross economic benefits. The findings are as follows: (1) The G2G program achieves optimal financial efficiency at approximately 500 CNY/Mu, with diminishing returns observed beyond 1000 CNY/Mu; (2) For the F2E program, the most fiscally cost-efficient option arises when the subsidized electricity price is at 0.4-0.5 CNY/kWh, while further reductions of the prices to below 0.1 CNY/kWh result in a diminishing cost-benefit ratio; (3) Comprehensive cost-efficiency analysis reveals no significant link between financial burden and carbon emissions, but a positive correlation with habitat quality and an inverted U-shaped relationship with total economic income; (4) Pareto analysis identifies 18 optimal dual-policy combinations for balancing carbon footprint, habitat quality, and gross economic benefits; (5) Posterior Pareto optimization further refines the selection of a specific policy scheme for a given realistic scenario. The analytical framework of this paper helps policymakers design economically viable and environmentally sustainable policies, addressing global conservation challenges., Comment: 28 pages, 8 figures
- Published
- 2024
7. Nested Diffusion Models Using Hierarchical Latent Priors
- Author
- Zhang, Xiao, Jiang, Ruoxi, Willett, Rebecca, and Maire, Michael
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
We introduce nested diffusion models, an efficient and powerful hierarchical generative framework that substantially enhances the generation quality of diffusion models, particularly for images of complex scenes. Our approach employs a series of diffusion models to progressively generate latent variables at different semantic levels. Each model in this series is conditioned on the output of the preceding higher-level models, culminating in image generation. Hierarchical latent variables guide the generation process along predefined semantic pathways, allowing our approach to capture intricate structural details while significantly improving image quality. To construct these latent variables, we leverage a pre-trained visual encoder, which learns strong semantic visual representations, and modulate its capacity via dimensionality reduction and noise injection. Across multiple datasets, our system demonstrates significant enhancements in image quality for both unconditional and class/text conditional generation. Moreover, our unconditional generation system substantially outperforms the baseline conditional system. These advancements incur minimal computational overhead as the more abstract levels of our hierarchy work with lower-dimensional representations.
- Published
- 2024
8. Joint Mode Selection and Beamforming Designs for Hybrid-RIS Assisted ISAC Systems
- Author
- Lin, Yingbin, Wang, Feng, Zhang, Xiao, Han, Guojun, and Lau, Vincent K. N.
- Subjects
- Computer Science - Information Theory; Electrical Engineering and Systems Science - Signal Processing
- Abstract
This paper considers a hybrid reconfigurable intelligent surface (RIS) assisted integrated sensing and communication (ISAC) system, where each RIS element can flexibly switch between the active and passive modes. Subject to the signal-to-interference-plus-noise ratio (SINR) constraint for each communication user (CU) and the transmit power constraints for both the base station (BS) and the active RIS elements, with the objective of maximizing the minimum beampattern gain among multiple targets, we jointly optimize the BS transmit beamforming for ISAC and the mode selection of each RIS reflecting element, as well as the RIS reflection coefficient matrix. Such formulated joint hybrid-RIS assisted ISAC design problem is a mixed-integer nonlinear program, which is decomposed into two low-dimensional subproblems being solved in an alternating manner. Specifically, by using the semidefinite relaxation (SDR) technique along with the rank-one beamforming construction process, we efficiently obtain the optimal ISAC transmit beamforming design at the BS. Via the SDR and successive convex approximation (SCA) techniques, we jointly determine the active/passive mode selection and reflection coefficient for each RIS element. Numerical results demonstrate that the proposed design solution is significantly superior to the existing baseline solutions., Comment: 5 pages, 4 figures
- Published
- 2024
9. Can Targeted Clean-Label Poisoning Attacks Generalize?
- Author
- Chen, Zhizhen, Dutta, Subrat Kishore, Zhao, Zhengyu, Lin, Chenhao, Shen, Chao, and Zhang, Xiao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition; Computer Science - Cryptography and Security; Computer Science - Machine Learning
- Abstract
Targeted poisoning attacks aim to compromise the model's prediction on specific target samples. In a common clean-label setting, they are achieved by slightly perturbing a subset of training samples given access to those specific targets. Despite continuous efforts, it remains unexplored whether such attacks can generalize to unknown variations of those targets. In this paper, we take the first step to systematically study this generalization problem. Observing that the widely adopted, cosine similarity-based attack exhibits limited generalizability, we propose a well-generalizable attack that leverages both the direction and magnitude of model gradients. In particular, we explore diverse target variations, such as an object with varied viewpoints and an animal species with distinct appearances. Extensive experiments across various generalization scenarios demonstrate that our method consistently achieves the best attack effectiveness. For example, our method outperforms the cosine similarity-based attack by 20.95% in attack success rate with similar overall accuracy, averaged over four models on two image benchmark datasets. The code is available at https://github.com/jiaangk/generalizable_tcpa, Comment: 12 pages, 5 figures, 5 tables
- Published
- 2024
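A note on the method above: the abstract contrasts a cosine-similarity (direction-only) gradient alignment with one that also uses gradient magnitude. The minimal sketch below shows one simple way to write both objectives; the paper's actual loss may differ, and the function names are illustrative assumptions.
```python
import torch
import torch.nn.functional as F

def cosine_alignment_loss(g_poison, g_target):
    # Direction-only alignment between two (flattened) gradients.
    return 1.0 - F.cosine_similarity(g_poison.flatten(), g_target.flatten(), dim=0)

def direction_magnitude_loss(g_poison, g_target):
    # One simple objective sensitive to both direction and magnitude:
    # the squared Euclidean distance between the gradients.
    return torch.sum((g_poison.flatten() - g_target.flatten()) ** 2)
```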
10. Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data
- Author
- DeAndres-Tame, Ivan, Tolosana, Ruben, Melzi, Pietro, Vera-Rodriguez, Ruben, Kim, Minchul, Rathgeb, Christian, Liu, Xiaoming, Gomez, Luis F., Morales, Aythami, Fierrez, Julian, Ortega-Garcia, Javier, Zhong, Zhizhou, Huang, Yuge, Mi, Yuxi, Ding, Shouhong, Zhou, Shuigeng, He, Shuai, Fu, Lingzhi, Cong, Heng, Zhang, Rongyu, Xiao, Zhihong, Smirnov, Evgeny, Pimenov, Anton, Grigorev, Aleksei, Timoshenko, Denis, Asfaw, Kaleb Mesfin, Low, Cheng Yaw, Liu, Hao, Wang, Chuyi, Zuo, Qing, He, Zhixiang, Shahreza, Hatef Otroshi, George, Anjith, Unnervik, Alexander, Rahimi, Parsa, Marcel, Sébastien, Neto, Pedro C., Huber, Marco, Kolf, Jan Niklas, Damer, Naser, Boutros, Fadi, Cardoso, Jaime S., Sequeira, Ana F., Atzori, Andrea, Fenu, Gianni, Marras, Mirko, Štruc, Vitomir, Yu, Jiang, Li, Zhangjie, Li, Jichun, Zhao, Weisong, Lei, Zhen, Zhu, Xiangyu, Zhang, Xiao-Yu, Biesseck, Bernardo, Vidal, Pedro, Coelho, Luiz, Granada, Roger, and Menotti, David
- Subjects
- Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Computers and Society; Computer Science - Machine Learning
- Abstract
Synthetic data is gaining increasing popularity for face recognition technologies, mainly due to the privacy concerns and challenges associated with obtaining real data, including diverse scenarios, quality, and demographic groups, among others. It also offers some advantages over real data, such as the large amount of data that can be generated or the ability to customize it to adapt to specific problem-solving needs. To effectively use such data, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. In order to promote the proposal of novel Generative AI methods and synthetic data, and investigate the application of synthetic data to better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This is an ongoing challenge that provides researchers with an accessible platform to benchmark i) the proposal of novel Generative AI methods and synthetic data, and ii) novel face recognition systems that are specifically proposed to take advantage of synthetic data. We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, such as age disparities between training and testing, changes in the pose, or occlusions. Very interesting findings are obtained in this second edition, including a direct comparison with the first one, in which synthetic databases were restricted to DCFace and GANDiffFace.
- Published
- 2024
11. PROFIT: A Specialized Optimizer for Deep Fine Tuning
- Author
- Chakravarthy, Anirudh S, Zheng, Shuai Kyle, Huang, Xin, Hemachandra, Sachithra, Zhang, Xiao, Chai, Yuning, and Chen, Zhao
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Fine-tuning pre-trained models has become invaluable in computer vision and robotics. Recent fine-tuning approaches focus on improving efficiency rather than accuracy by using a mixture of smaller learning rates or frozen backbones. To return the spotlight to model accuracy, we present PROFIT (Proximally Restricted Optimizer For Iterative Training), one of the first optimizers specifically designed for incrementally fine-tuning converged models on new tasks or datasets. Unlike traditional optimizers such as SGD or Adam, which make minimal assumptions due to random initialization, PROFIT leverages the structure of a converged model to regularize the optimization process, leading to improved results. By employing a simple temporal gradient orthogonalization process, PROFIT outperforms traditional fine-tuning methods across various tasks: image classification, representation learning, and large-scale motion prediction. Moreover, PROFIT is encapsulated within the optimizer logic, making it easily integrated into any training pipeline with minimal engineering effort. A new class of fine-tuning optimizers like PROFIT can drive advancements as fine-tuning and incremental training become increasingly prevalent, reducing reliance on costly model training from scratch., Comment: technical report
- Published
- 2024
12. Brownian spin-locking effect
- Author
- Zhang, Xiao, Chen, Peiyang, Li, Mei, Shi, Yuzhi, Hasman, Erez, Wang, Bo, and Chen, Xianfeng
- Subjects
- Physics - Optics; Condensed Matter - Disordered Systems and Neural Networks; Physics - Applied Physics; Physics - Biological Physics
- Abstract
Brownian systems are characterized by spatiotemporal disorder, which arises from the erratic motion of particles driven by thermal fluctuations. When light interacts with such systems, it typically produces unpolarized and uncorrelated fields. Here, we report the observation of a large-scale spin-locking effect of light within a Brownian medium. In an observation direction perpendicular to the incident wave momentum, scattering naturally divides into two diffusion regions, each associated with an opposite spin from the Brownian nanoparticles. This effect arises from the intrinsic spin-orbit interactions of scattering from individual nanoparticles, which ubiquitously generate radiative spin fields that propagate through the Brownian medium with multiple incoherent scattering. It offers a novel experimental platform for exploring macroscale spin behaviors of diffused light, with potential applications in precision metrology for measuring various nanoparticle properties. Our findings may inspire the study of analogous phenomena for different waves from novel spin-orbit interactions in complex disordered systems., Comment: 48 pages, 20 figures
- Published
- 2024
13. ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models
- Author
- Ye, Xubing, Gan, Yukang, Ge, Yixiao, Zhang, Xiao-Ping, and Tang, Yansong
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Large Vision Language Models (LVLMs) have achieved significant success across multi-modal tasks. However, the computational cost of processing long visual tokens can be prohibitively expensive on resource-limited devices. Previous methods have identified redundancy in visual tokens within the Large Language Model (LLM) decoder layers and have mitigated this by pruning tokens using a pre-defined or fixed ratio, thereby reducing computational overhead. Nonetheless, we observe that the impact of pruning ratio varies across different LLM layers and instances (image-prompt pairs). Therefore, it is essential to develop a layer-wise and instance-wise vision token pruning strategy to balance computational cost and model performance effectively. We propose ATP-LLaVA, a novel approach that adaptively determines instance-specific token pruning ratios for each LLM layer. Specifically, we introduce an Adaptive Token Pruning (ATP) module, which computes the importance score and pruning threshold based on input instance adaptively. The ATP module can be seamlessly integrated between any two LLM layers with negligible computational overhead. Additionally, we develop a Spatial Augmented Pruning (SAP) strategy that prunes visual tokens with both token redundancy and spatial modeling perspectives. Our approach reduces the average token count by 75% while maintaining performance, with only a minimal 1.9% degradation across seven widely used benchmarks. The project page can be accessed via https://yxxxb.github.io/ATP-LLaVA-page/., Comment: 11 pages, 4 figures
- Published
- 2024
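A minimal sketch of the per-layer, per-instance visual token pruning described above: score each visual token and keep only the most important ones. The scoring and adaptive threshold logic of the ATP module is not reproduced here; the fixed `keep_ratio` and the function name are illustrative assumptions.
```python
import torch

def prune_visual_tokens(tokens, scores, keep_ratio=0.25):
    # tokens: (num_tokens, dim) visual token hidden states at one LLM layer.
    # scores: (num_tokens,) importance scores (ATP computes these adaptively
    #         per instance; here they are simply an input).
    # Keep the top fraction of tokens and preserve their original order.
    k = max(1, int(keep_ratio * tokens.size(0)))
    keep_idx = torch.topk(scores, k).indices.sort().values
    return tokens[keep_idx], keep_idx
```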
14. Night-Side Relativistic Electron Precipitation Bursts in the Outer Radiation Belt: Insights from ELFIN and THEMIS
- Author
- Lu, Xi, Zhang, Xiao-Jia, Artemyev, Anton V., Angelopoulos, Vassilis, and Bortnik, Jacob
- Subjects
- Physics - Space Physics; Physics - Plasma Physics
- Abstract
Electromagnetic whistler-mode waves play a crucial role in the acceleration and precipitation of radiation belt electrons. Statistical surveys of wave characteristics suggest that these waves should preferentially scatter and precipitate relativistic electrons on the day side. However, the night-side region is expected to be primarily associated with electron acceleration. The recent low-altitude observations reveal relativistic electron precipitation in the night-side region. In this paper, we present statistical surveys of night-side relativistic electron losses due to intense precipitation bursts. We demonstrate that such bursts are associated with storm time substorm injections and are likely related to relativistic electron scattering by ducted whistler-mode waves. We also speculate on the role of injections in creating conditions favorable for relativistic electron precipitation.
- Published
- 2024
15. LampMark: Proactive Deepfake Detection via Training-Free Landmark Perceptual Watermarks
- Author
- Wang, Tianyi, Huang, Mengxiao, Cheng, Harry, Zhang, Xiao, and Shen, Zhiqi
- Subjects
- Computer Science - Computer Vision and Pattern Recognition
- Abstract
Deepfake facial manipulation has garnered significant public attention due to its impacts on enhancing human experiences and posing privacy threats. Despite numerous passive algorithms that have been attempted to thwart malicious Deepfake attacks, they mostly struggle with the generalizability challenge when confronted with hyper-realistic synthetic facial images. To tackle the problem, this paper proposes a proactive Deepfake detection approach by introducing a novel training-free landmark perceptual watermark, LampMark for short. We first analyze the structure-sensitive characteristics of Deepfake manipulations and devise a secure and confidential transformation pipeline from the structural representations, i.e. facial landmarks, to binary landmark perceptual watermarks. Subsequently, we present an end-to-end watermarking framework that imperceptibly and robustly embeds and extracts watermarks concerning the images to be protected. Relying on promising watermark recovery accuracies, Deepfake detection is accomplished by assessing the consistency between the content-matched landmark perceptual watermark and the robustly recovered watermark of the suspect image. Experimental results demonstrate the superior performance of our approach in watermark recovery and Deepfake detection compared to state-of-the-art methods across in-dataset, cross-dataset, and cross-manipulation scenarios., Comment: Accepted to ACM MM 2024
- Published
- 2024
16. PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
- Author
- Ayalew, Tewodros, Zhang, Xiao, Wu, Kevin Yuanbo, Jiang, Tianchong, Maire, Michael, and Walter, Matthew R.
- Subjects
- Computer Science - Robotics; Computer Science - Artificial Intelligence
- Abstract
We present PROGRESSOR, a novel framework that learns a task-agnostic reward function from videos, enabling policy training through goal-conditioned reinforcement learning (RL) without manual supervision. Underlying this reward is an estimate of the distribution over task progress as a function of the current, initial, and goal observations that is learned in a self-supervised fashion. Crucially, PROGRESSOR refines rewards adversarially during online RL training by pushing back predictions for out-of-distribution observations, to mitigate distribution shift inherent in non-expert observations. Utilizing this progress prediction as a dense reward together with an adversarial push-back, we show that PROGRESSOR enables robots to learn complex behaviors without any external supervision. Pretrained on large-scale egocentric human video from EPIC-KITCHENS, PROGRESSOR requires no fine-tuning on in-domain task-specific data for generalization to real-robot offline RL under noisy demonstrations, outperforming contemporary methods that provide dense visual reward for robotic learning. Our findings highlight the potential of PROGRESSOR for scalable robotic applications where direct action labels and task-specific rewards are not readily available., Comment: 15 pages,13 figures
- Published
- 2024
17. UVCG: Leveraging Temporal Consistency for Universal Video Protection
- Author
- Li, KaiZhou, Gu, Jindong, Yu, Xinchun, Cao, Junjie, Tang, Yansong, and Zhang, Xiao-Ping
- Subjects
- Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence
- Abstract
The security risks of AI-driven video editing have garnered significant attention. Although recent studies indicate that adding perturbations to images can protect them from malicious edits, directly applying image-based methods to perturb each frame in a video becomes ineffective, as video editing techniques leverage the consistency of inter-frame information to restore individually perturbed content. To address this challenge, we leverage the temporal consistency of video content to propose a straightforward and efficient, yet highly effective and broadly applicable approach, Universal Video Consistency Guard (UVCG). UVCG embeds the content of another video(target video) within a protected video by introducing continuous, imperceptible perturbations which has the ability to force the encoder of editing models to map continuous inputs to misaligned continuous outputs, thereby inhibiting the generation of videos consistent with the intended textual prompts. Additionally leveraging similarity in perturbations between adjacent frames, we improve the computational efficiency of perturbation generation by employing a perturbation-reuse strategy. We applied UVCG across various versions of Latent Diffusion Models (LDM) and assessed its effectiveness and generalizability across multiple LDM-based editing pipelines. The results confirm the effectiveness, transferability, and efficiency of our approach in safeguarding video content from unauthorized modifications.
- Published
- 2024
18. GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs
- Author
- Basani, Advik Raj and Zhang, Xiao
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence; Computer Science - Cryptography and Security; Computer Science - Computer Vision and Pattern Recognition
- Abstract
Large Language Models (LLMs) have shown impressive proficiency across a range of natural language processing tasks yet remain vulnerable to adversarial prompts, known as jailbreak attacks, carefully designed to elicit harmful responses from LLMs. Traditional methods rely on manual heuristics, which suffer from limited generalizability. While being automatic, optimization-based attacks often produce unnatural jailbreak prompts that are easy to detect by safety filters or require high computational overhead due to discrete token optimization. Witnessing the limitations of existing jailbreak methods, we introduce Generative Adversarial Suffix Prompter (GASP), a novel framework that combines human-readable prompt generation with Latent Bayesian Optimization (LBO) to improve adversarial suffix creation in a fully black-box setting. GASP leverages LBO to craft adversarial suffixes by efficiently exploring continuous embedding spaces, gradually optimizing the model to improve attack efficacy while balancing prompt coherence through a targeted iterative refinement procedure. Our experiments show that GASP can generate natural jailbreak prompts, significantly improving attack success rates, reducing training times, and accelerating inference speed, thus making it an efficient and scalable solution for red-teaming LLMs., Comment: 28 pages, 9 tables, 13 figures; under review at CVPR '25
- Published
- 2024
19. SniffySquad: Patchiness-Aware Gas Source Localization with Multi-Robot Collaboration
- Author
- Cheng, Yuhan, Chen, Xuecheng, Yang, Yixuan, Wang, Haoyang, Xu, Jingao, Hong, Chaopeng, Xu, Susu, Zhang, Xiao-Ping, Liu, Yunhao, and Chen, Xinlei
- Subjects
- Computer Science - Robotics; Computer Science - Multiagent Systems
- Abstract
Gas source localization is pivotal for the rapid mitigation of gas leakage disasters, where mobile robots emerge as a promising solution. However, existing methods predominantly schedule robots' movements based on reactive stimuli or simplified gas plume models. These approaches typically excel in idealized, simulated environments but fall short in real-world gas environments characterized by their patchy distribution. In this work, we introduce SniffySquad, a multi-robot olfaction-based system designed to address the inherent patchiness in gas source localization. SniffySquad incorporates a patchiness-aware active sensing approach that enhances the quality of data collection and estimation. Moreover, it features an innovative collaborative role adaptation strategy to boost the efficiency of source-seeking endeavors. Extensive evaluations demonstrate that our system achieves an increase in the success rate by $20\%+$ and an improvement in path efficiency by $30\%+$, outperforming state-of-the-art gas source localization solutions.
- Published
- 2024
20. ResLearn: Transformer-based Residual Learning for Metaverse Network Traffic Prediction
- Author
- Manjunath, Yoga Suhas Kuruba, Szymanowski, Mathew, Wissborn, Austin, Li, Mushu, Zhao, Lian, and Zhang, Xiao-Ping
- Subjects
- Computer Science - Artificial Intelligence; Electrical Engineering and Systems Science - Signal Processing
- Abstract
Our work proposes a comprehensive solution for predicting Metaverse network traffic, addressing the growing demand for intelligent resource management in eXtended Reality (XR) services. We first introduce a state-of-the-art testbed capturing a real-world dataset of virtual reality (VR), augmented reality (AR), and mixed reality (MR) traffic, made openly available for further research. To enhance prediction accuracy, we then propose a novel view-frame (VF) algorithm that accurately identifies video frames from traffic while ensuring privacy compliance, and we develop a Transformer-based progressive error-learning algorithm, referred to as ResLearn for Metaverse traffic prediction. ResLearn significantly improves time-series predictions by using fully connected neural networks to reduce errors, particularly during peak traffic, outperforming prior work by 99%. Our contributions offer Internet service providers (ISPs) robust tools for real-time network management to satisfy Quality of Service (QoS) and enhance user experience in the Metaverse.
- Published
- 2024
21. Discern-XR: An Online Classifier for Metaverse Network Traffic
- Author
- Manjunath, Yoga Suhas Kuruba, Wissborn, Austin, Szymanowski, Mathew, Li, Mushu, Zhao, Lian, and Zhang, Xiao-Ping
- Subjects
- Computer Science - Artificial Intelligence; Electrical Engineering and Systems Science - Signal Processing
- Abstract
In this paper, we design an exclusive Metaverse network traffic classifier, named Discern-XR, to help Internet service providers (ISP) and router manufacturers enhance the quality of Metaverse services. Leveraging segmented learning, the Frame Vector Representation (FVR) algorithm and Frame Identification Algorithm (FIA) are proposed to extract critical frame-related statistics from raw network data having only four application-level features. A novel Augmentation, Aggregation, and Retention Online Training (A2R-OT) algorithm is proposed to find an accurate classification model through online training methodology. In addition, we contribute to the real-world Metaverse dataset comprising virtual reality (VR) games, VR video, VR chat, augmented reality (AR), and mixed reality (MR) traffic, providing a comprehensive benchmark. Discern-XR outperforms state-of-the-art classifiers by 7% while improving training efficiency and reducing false-negative rates. Our work advances Metaverse network traffic classification by standing as the state-of-the-art solution.
- Published
- 2024
22. This took us a Weyl: synthesis of a semimetallic Weyl ferromagnet with point Fermi surface
- Author
- Belopolski, Ilya, Watanabe, Ryota, Sato, Yuki, Yoshimi, Ryutaro, Kawamura, Minoru, Nagahama, Soma, Zhao, Yilin, Shao, Sen, Jin, Yuanjun, Kato, Yoshihiro, Okamura, Yoshihiro, Zhang, Xiao-Xiao, Fujishiro, Yukako, Takahashi, Youtarou, Hirschberger, Max, Tsukazaki, Atsushi, Takahashi, Kei S., Chiu, Ching-Kai, Chang, Guoqing, Kawasaki, Masashi, Nagaosa, Naoto, and Tokura, Yoshinori
- Subjects
- Condensed Matter - Mesoscale and Nanoscale Physics; Condensed Matter - Materials Science
- Abstract
Quantum materials governed by emergent topological fermions have become a cornerstone of physics. Dirac fermions in graphene form the basis for moir\'e quantum matter, and Dirac fermions in magnetic topological insulators enabled the discovery of the quantum anomalous Hall effect. In contrast, there are few materials whose electromagnetic response is dominated by emergent Weyl fermions. Nearly all known Weyl materials are overwhelmingly metallic, and are largely governed by irrelevant, conventional electrons. Here we theoretically predict and experimentally observe a semimetallic Weyl ferromagnet in van der Waals (Cr,Bi)$_2$Te$_3$. In transport, we find a record bulk anomalous Hall angle $> 0.5$ along with non-metallic conductivity, a regime sharply distinct from conventional ferromagnets. Together with symmetry analysis, our data suggest a semimetallic Fermi surface composed of two Weyl points, with a giant separation $> 75\%$ of the linear dimension of the bulk Brillouin zone, and no other electronic states. Using state-of-the-art crystal synthesis techniques, we widely tune the electronic structure, allowing us to annihilate the Weyl state and visualize a unique topological phase diagram exhibiting broad Chern insulating, Weyl semimetallic and magnetic semiconducting regions. Our observation of a semimetallic Weyl ferromagnet offers an avenue toward novel correlated states and non-linear phenomena, as well as zero-magnetic-field Weyl spintronic and optical devices., Comment: Nature, in press
- Published
- 2024
23. Practical, optimal preparation of general quantum state with exponentially improved robustness
- Author
- Zhang, Xiao-Ming
- Subjects
- Quantum Physics; Computer Science - Computational Complexity; Computer Science - Data Structures and Algorithms; Physics - Computational Physics
- Abstract
Quantum state preparation, as a general process of loading classical data onto a quantum device, is essential for the end-to-end implementation of quantum algorithms. Yet, existing methods suffer from either high circuit depth or complicated hardware, limiting their practicality and robustness. In this work, we overcome these limitations with a bucket-brigade approach. The tree architecture of our hardware represents the simplest connectivity required for achieving sub-exponential circuit depth. Leveraging the bucket-brigade mechanism, which suppresses error propagation between different branches, our approach exhibits an exponential improvement in robustness compared to existing depth-optimal methods. More specifically, the infidelity scales as $O(\text{polylog}(N))$ with data size $N$, as opposed to $O(N)$ for conventional methods. Moreover, our approach is the first to simultaneously achieve linear Clifford$+T$ circuit depth, gate count, and space-time allocation. These advancements offer the opportunity for processing big data on both near-term and fault-tolerant quantum devices., Comment: 6+13 pages, 2+2 figures
- Published
- 2024
24. PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
- Author
- Wang, Yanlong, Xu, Jian, Ma, Fei, Huang, Shao-Lun, Sun, Danny Dongning, and Zhang, Xiao-Ping
- Subjects
- Computer Science - Machine Learning; Computer Science - Artificial Intelligence
- Abstract
Time series forecasting remains a critical challenge across various domains, often complicated by high-dimensional data and long-term dependencies. This paper presents a novel transformer architecture for time series forecasting, incorporating two key innovations: parameter sharing (PS) and Spatial-Temporal Segment Attention (SegAtt). We also define the time series segment as the concatenation of sequence patches from the same positions across different variables. The proposed model, PSformer, reduces the number of training parameters through the parameter sharing mechanism, thereby improving model efficiency and scalability. The introduction of SegAtt could enhance the capability of capturing local spatio-temporal dependencies by computing attention over the segments, and improve global representation by integrating information across segments. The combination of parameter sharing and SegAtt significantly improves the forecasting performance. Extensive experiments on benchmark datasets demonstrate that PSformer outperforms popular baselines and other transformer-based approaches in terms of accuracy and scalability, establishing itself as an accurate and scalable tool for time series forecasting., Comment: 21 pages
- Published
- 2024
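The abstract above defines a time-series segment as the concatenation of patches taken from the same positions across variables. A minimal NumPy sketch of that construction follows; array shapes and the function name are assumptions for illustration, not the authors' code.
```python
import numpy as np

def build_segments(x, patch_len):
    # x: (num_vars, seq_len) multivariate series; seq_len divisible by patch_len.
    # Returns (num_patches, num_vars * patch_len): row i concatenates patch i
    # of every variable, matching the segment definition in the abstract.
    num_vars, seq_len = x.shape
    num_patches = seq_len // patch_len
    patches = x[:, :num_patches * patch_len].reshape(num_vars, num_patches, patch_len)
    return patches.transpose(1, 0, 2).reshape(num_patches, num_vars * patch_len)
```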
25. Joint Beamforming for Multi-target Detection and Multi-user Communication in ISAC Systems
- Author
- Zhao, Zongyao, Liu, Zhenyu, Jiang, Rui, Li, Zhongyi, Zhang, Xiao-Ping, Tang, Xinke, and Dong, Yuhan
- Subjects
- Electrical Engineering and Systems Science - Signal Processing
- Abstract
Detecting weak targets is one of the main challenges for integrated sensing and communication (ISAC) systems. Sensing and communication suffer from a performance trade-off in ISAC systems. As the communication demand increases, sensing ability, especially weak target detection performance, will inevitably reduce. Traditional approaches fail to address this issue. In this paper, we develop a joint beamforming scheme and formulate it as a max-min problem to maximize the detection probability of the weakest target under the constraint of the signal-to-interference-plus-noise ratio (SINR) of multi-user communication. An alternating optimization (AO) algorithm is developed for solving the complicated non-convex problem to obtain the joint beamformer. The proposed scheme can direct the transmit energy toward the multiple targets properly to ensure robust multi-target detection performance. Numerical results show that the proposed beamforming scheme can effectively increase the detection probability of the weakest target compared to baseline approaches while ensuring communication performance., Comment: 5 pages, 4 figures, submitted to IEEE journal
- Published
- 2024
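A generic statement of the max-min design described in the abstract above, written in commonly used notation; the paper's exact objective and constraint set may differ from this sketch.
```latex
\begin{aligned}
\max_{\mathbf{W}} \ \min_{k=1,\dots,K} \quad & P_{\mathrm{d},k}(\mathbf{W})
  && \text{detection metric of target } k \\
\text{s.t.} \quad & \mathrm{SINR}_m(\mathbf{W}) \ge \Gamma_m, \quad m=1,\dots,M
  && \text{per-user communication SINR} \\
& \operatorname{tr}\!\left(\mathbf{W}\mathbf{W}^{\mathsf H}\right) \le P_{\max}
  && \text{transmit power budget}
\end{aligned}
```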
26. Length-Induced Embedding Collapse in Transformer-based Models
- Author
- Zhou, Yuqi, Dai, Sunhao, Cao, Zhanshuo, Zhang, Xiao, and Xu, Jun
- Subjects
- Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Computer Science - Information Retrieval
- Abstract
Text embeddings enable various applications, but their performance deteriorates on longer texts. In this paper, we find that the performance degradation is due to a phenomenon called Length Collapse, where longer text embeddings collapse into a narrow space. This collapse results in a distributional inconsistency between embeddings of different text lengths, ultimately hurting the performance of downstream tasks. Theoretically, by considering the self-attention mechanism inherently functions as a low-pass filter, we prove that long sequences increase the attenuation rate of the low-pass filter effect of the self-attention mechanism. With layers going deeper, excessive low-pass filtering causes the token signals to retain only their Direct-Current (DC) component, which means the input token feature maps will collapse into a narrow space, especially in long texts. Based on the above analysis, we propose to mitigate the undesirable length collapse limitation by introducing a temperature in softmax(), which achieves a higher low-filter attenuation rate. The tuning-free method, called TempScale, can be plugged into multiple transformer-based embedding models. Empirically, we demonstrate that TempScale can improve existing embedding models, especially on long text inputs, bringing up to 0.53% performance gains on 40 datasets from Massive Text Embedding Benchmark (MTEB) and 0.82% performance gains on 4 datasets from LongEmbed, which specifically focuses on long context retrieval.
- Published
- 2024
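As described in the abstract above, the core of TempScale is a temperature applied inside the attention softmax. A minimal sketch is below; the appropriate temperature value comes from the paper's analysis and is not reproduced here, so `temperature=1.0` simply recovers standard attention.
```python
import torch
import torch.nn.functional as F

def temp_scaled_attention(q, k, v, temperature=1.0):
    # q, k, v: (batch, seq_len, dim). TempScale chooses the temperature so that
    # the softmax's low-pass filtering effect no longer collapses long-text
    # embeddings; here it is just a parameter.
    d = q.size(-1)
    logits = (q @ k.transpose(-2, -1)) / (d ** 0.5)  # standard 1/sqrt(d) scaling
    logits = logits / temperature                    # TempScale-style rescaling
    weights = F.softmax(logits, dim=-1)
    return weights @ v
```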
27. DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination
- Author
- Fu, Jia, Zhang, Xiao, Pashami, Sepideh, Rahimian, Fatemeh, and Holst, Anders
- Subjects
- Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning
- Abstract
In the ever-evolving adversarial machine learning landscape, developing effective defenses against patch attacks has become a critical challenge, necessitating reliable solutions to safeguard real-world AI systems. Although diffusion models have shown remarkable capacity in image synthesis and have been recently utilized to counter $\ell_p$-norm bounded attacks, their potential in mitigating localized patch attacks remains largely underexplored. In this work, we propose DiffPAD, a novel framework that harnesses the power of diffusion models for adversarial patch decontamination. DiffPAD first performs super-resolution restoration on downsampled input images, then adopts binarization, dynamic thresholding scheme and sliding window for effective localization of adversarial patches. Such a design is inspired by the theoretically derived correlation between patch size and diffusion restoration error that is generalized across diverse patch attack scenarios. Finally, DiffPAD applies inpainting techniques to the original input images with the estimated patch region being masked. By integrating closed-form solutions for super-resolution restoration and image inpainting into the conditional reverse sampling process of a pre-trained diffusion model, DiffPAD obviates the need for text guidance or fine-tuning. Through comprehensive experiments, we demonstrate that DiffPAD not only achieves state-of-the-art adversarial robustness against patch attacks but also excels in recovering naturalistic images without patch remnants. The source code is available at https://github.com/JasonFu1998/DiffPAD., Comment: Accepted to 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
- Published
- 2024
28. Multi-view clustering integrating anchor attribute and structural information
- Author
- Li, Xuetong and Zhang, Xiao-Dong
- Subjects
- Computer Science - Machine Learning
- Abstract
Multisource data has spurred the development of advanced clustering algorithms, such as multi-view clustering, which critically relies on constructing similarity matrices. Traditional algorithms typically generate these matrices from sample attributes alone. However, real-world networks often include pairwise directed topological structures critical for clustering. This paper introduces a novel multi-view clustering algorithm, AAS. It utilizes a two-step proximity approach via anchors in each view, integrating attribute and directed structural information. This approach enhances the clarity of category characteristics in the similarity matrices. The anchor structural similarity matrix leverages strongly connected components of directed graphs. The entire process-from similarity matrices construction to clustering - is consolidated into a unified optimization framework. Comparative experiments on the modified Attribute SBM dataset against eight algorithms affirm the effectiveness and superiority of AAS., Comment: 18 pages, 7 figures
- Published
- 2024
29. Noncanonical warm inflation with nonminimal derivative coupling
- Author
- Zhang, Xiao-Min, Zhao, Run-Qing, Peng, Zhi-peng, Li, Xi-Bin, Feng, Yun-Cai, Chu, Peng-Cheng, and Xing, Yi-Hang
- Subjects
- General Relativity and Quantum Cosmology
- Abstract
This study extended noncanonical warm inflation to the nonminimal derivative coupling scenario. The fundamental equations, including the evolution equations and the slow roll equations of this new framework, were derived. The enlarged damping term, which encompasses both gravitationally enhanced friction and thermal damping, resulted in a well-overdamped inflationary process, ensuring that the slow roll approximations are well satisfied. A linear stability analysis corroborated the viability of this approach, yielding significantly relaxed slow roll conditions within the context of noncanonical warm inflation with nonminimal derivative coupling. Subsequently, the density fluctuations in this new framework were analyzed, leading to approximately analytic results for the power spectrum, spectral index, and related quantities. Both the energy scale at horizon crossing and the tensor-to-scalar ratio decreased considerably because of the effects of thermal damping and nonminimal derivative coupling. The upper bound for field excursion remained safely sub-Planckian in this inflationary scenario. We thus arrived at a successful and meaningful model that broadens the scope of warm inflation., Comment: 15 pages, 0 figures
- Published
- 2024
30. Predicting time-varying flux and balance in metabolic systems using structured neural-ODE processes
- Author
- Rathod, Santanu, Lio, Pietro, and Zhang, Xiao
- Subjects
- Computer Science - Machine Learning
- Abstract
We develop a novel data-driven framework as an alternative to dynamic flux balance analysis, bypassing the demand for deep domain knowledge and manual efforts to formulate the optimization problem. The proposed framework is end-to-end, which trains a structured neural ODE process (SNODEP) model to estimate flux and balance samples using gene-expression time-series data. SNODEP is designed to circumvent the limitations of the standard neural ODE process model, including restricting the latent and decoder sampling distributions to be normal and lacking structure between context points for calculating the latent, thus more suitable for modeling the underlying dynamics of a metabolic system. Through comprehensive experiments ($156$ in total), we demonstrate that SNODEP not only predicts the unseen time points of real-world gene-expression data and the flux and balance estimates well but can even generalize to more challenging unseen knockout configurations and irregular data sampling scenarios, all essential for metabolic pathway analysis. We hope our work can serve as a catalyst for building more scalable and powerful models for genome-scale metabolic analysis. Our code is available at: \url{https://github.com/TrustMLRG/SNODEP}.
- Published
- 2024
31. UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
- Author
- Yang, Yuzhe, Zhang, Yifei, Hu, Yan, Guo, Yilin, Gan, Ruoli, He, Yueru, Lei, Mingcong, Zhang, Xiao, Wang, Haining, Xie, Qianqian, Huang, Jimin, Yu, Honghai, and Wang, Benyou
- Subjects
- Quantitative Finance - Computational Finance; Computer Science - Computational Engineering, Finance, and Science; Computer Science - Computation and Language
- Abstract
This paper introduces the UCFE: User-Centric Financial Expertise benchmark, an innovative framework designed to evaluate the ability of large language models (LLMs) to handle complex real-world financial tasks. UCFE benchmark adopts a hybrid approach that combines human expert evaluations with dynamic, task-specific interactions to simulate the complexities of evolving financial scenarios. Firstly, we conducted a user study involving 804 participants, collecting their feedback on financial tasks. Secondly, based on this feedback, we created our dataset that encompasses a wide range of user intents and interactions. This dataset serves as the foundation for benchmarking 12 LLM services using the LLM-as-Judge methodology. Our results show a significant alignment between benchmark scores and human preferences, with a Pearson correlation coefficient of 0.78, confirming the effectiveness of the UCFE dataset and our evaluation approach. UCFE benchmark not only reveals the potential of LLMs in the financial sector but also provides a robust framework for assessing their performance and user satisfaction. The benchmark dataset and evaluation code are available.
- Published
- 2024
32. Latency-Aware Inter-domain Routing
- Author
- Lin, Shihan, Zhou, Yi, Zhang, Xiao, Arnold, Todd, Govindan, Ramesh, and Yang, Xiaowei
- Subjects
- Computer Science - Networking and Internet Architecture
- Abstract
Despite efforts from cloud and content providers to lower latency to acceptable levels for current and future services (e.g., augmented reality or cloud gaming), there are still opportunities for improvement. A major reason that traffic engineering efforts are challenged to lower latency is that the Internet's inter-domain routing protocol, the Border Gateway Protocol, is oblivious to any performance metric, and circuitous routing is still pervasive. In this work, we propose two implementation modifications that networks can leverage to make BGP latency-aware and reduce excessive latency inflation. These proposals, latency-proportional AS prepending and local preference neutralization, show promise towards providing a method for propagating abstract latency information with a reasonable increase in routing overhead.
- Published
- 2024
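A toy sketch of the latency-proportional AS prepending idea from the abstract above: the number of extra prepends grows with the measured latency of the advertised path. The latency-to-prepend mapping (scaling constant and cap) is an assumption for illustration, not the paper's configuration.
```python
def prepend_count(path_latency_ms, ms_per_prepend=20, max_prepends=5):
    # More prepends for higher-latency paths, so BGP's preference for shorter
    # AS paths steers neighbors toward lower-latency routes.
    return min(max_prepends, round(path_latency_ms / ms_per_prepend))

def announced_as_path(local_asn, as_path, path_latency_ms):
    # Prepend the local ASN once (as usual) plus a latency-proportional extra.
    return [local_asn] * (1 + prepend_count(path_latency_ms)) + as_path
```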
33. ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
- Author
- Sun, Zhongxiang, Zang, Xiaoxue, Zheng, Kai, Song, Yang, Xu, Jun, Zhang, Xiao, Yu, Weijie, and Li, Han
- Subjects
- Computer Science - Computation and Language
- Abstract
Retrieval-Augmented Generation (RAG) models are designed to incorporate external knowledge, reducing hallucinations caused by insufficient parametric (internal) knowledge. However, even with accurate and relevant retrieved content, RAG models can still produce hallucinations by generating outputs that conflict with the retrieved information. Detecting such hallucinations requires disentangling how Large Language Models (LLMs) utilize external and parametric knowledge. Current detection methods often focus on one of these mechanisms or without decoupling their intertwined effects, making accurate detection difficult. In this paper, we investigate the internal mechanisms behind hallucinations in RAG scenarios. We discover hallucinations occur when the Knowledge FFNs in LLMs overemphasize parametric knowledge in the residual stream, while Copying Heads fail to effectively retain or integrate external knowledge from retrieved content. Based on these findings, we propose ReDeEP, a novel method that detects hallucinations by decoupling LLM's utilization of external context and parametric knowledge. Our experiments show that ReDeEP significantly improves RAG hallucination detection accuracy. Additionally, we introduce AARF, which mitigates hallucinations by modulating the contributions of Knowledge FFNs and Copying Heads., Comment: 23pages
- Published
- 2024
34. LargePiG: Your Large Language Model is Secretly a Pointer Generator
- Author
- Sun, Zhongxiang, Si, Zihua, Zang, Xiaoxue, Zheng, Kai, Song, Yang, Zhang, Xiao, and Xu, Jun
- Subjects
- Computer Science - Computation and Language
- Abstract
Recent research on query generation has focused on using Large Language Models (LLMs), which despite bringing state-of-the-art performance, also introduce issues with hallucinations in the generated queries. In this work, we introduce relevance hallucination and factuality hallucination as a new typology for hallucination problems brought by query generation based on LLMs. We propose an effective way to separate content from form in LLM-generated queries, which preserves the factual knowledge extracted and integrated from the inputs and compiles the syntactic structure, including function words, using the powerful linguistic capabilities of the LLM. Specifically, we introduce a model-agnostic and training-free method that turns the Large Language Model into a Pointer-Generator (LargePiG), where the pointer attention distribution leverages the LLM's inherent attention weights, and the copy probability is derived from the difference between the vocabulary distribution of the model's high layers and the last layer. To validate the effectiveness of LargePiG, we constructed two datasets for assessing the hallucination problems in query generation, covering both document and video scenarios. Empirical studies on various LLMs demonstrated the superiority of LargePiG on both datasets. Additional experiments also verified that LargePiG could reduce hallucination in large vision language models and improve the accuracy of document-based question-answering and factuality evaluation tasks., Comment: 24 pages
- Published
- 2024
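A minimal sketch of the pointer-generator mixture described in the abstract above: the final distribution mixes the model's vocabulary distribution with a copy distribution built from attention over the input, weighted by a copy probability. In LargePiG the attention comes from the LLM's own attention weights and the copy probability from the gap between high-layer and last-layer vocabulary distributions; those details are only indicated in comments, and this code is illustrative rather than the authors' implementation.
```python
import torch

def pointer_generator_mix(vocab_dist, attn_weights, input_ids, p_copy):
    # vocab_dist:   (vocab_size,) distribution from the model's last layer.
    # attn_weights: (input_len,) attention over input tokens (assumed given).
    # input_ids:    (input_len,) long tensor of the input token ids.
    # p_copy:       scalar in [0, 1]; LargePiG derives it from the difference
    #               between high-layer and last-layer vocabulary distributions.
    copy_dist = torch.zeros_like(vocab_dist)
    copy_dist.scatter_add_(0, input_ids, attn_weights)  # project attention onto the vocabulary
    return p_copy * copy_dist + (1.0 - p_copy) * vocab_dist
```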
35. Fast Second-Order Online Kernel Learning through Incremental Matrix Sketching and Decomposition
- Author
- Wen, Dongxie, Zhang, Xiao, and Wei, Zhewei
- Subjects
- Computer Science - Machine Learning
- Abstract
Online Kernel Learning (OKL) has attracted considerable research interest due to its promising predictive performance in streaming environments. Second-order approaches are particularly appealing for OKL as they often offer substantial improvements in regret guarantees. However, existing second-order OKL approaches suffer from at least quadratic time complexity with respect to the pre-set budget, rendering them unsuitable for meeting the real-time demands of large-scale streaming recommender systems. The singular value decomposition required to obtain explicit feature mapping is also computationally expensive due to the complete decomposition process. Moreover, the absence of incremental updates to manage approximate kernel space causes these algorithms to perform poorly in adversarial environments and real-world streaming recommendation datasets. To address these issues, we propose FORKS, a fast incremental matrix sketching and decomposition approach tailored for second-order OKL. FORKS constructs an incremental maintenance paradigm for second-order kernelized gradient descent, which includes incremental matrix sketching for kernel approximation and incremental matrix decomposition for explicit feature mapping construction. Theoretical analysis demonstrates that FORKS achieves a logarithmic regret guarantee on par with other second-order approaches while maintaining a linear time complexity w.r.t. the budget, significantly enhancing efficiency over existing approaches. We validate the performance of FORKS through extensive experiments conducted on real-world streaming recommendation datasets, demonstrating its superior scalability and robustness against adversarial attacks.
- Published
- 2024
36. Matrix Sketching in Bandits: Current Pitfalls and New Framework
- Author
- Wen, Dongxie, Yin, Hanyan, Zhang, Xiao, and Wei, Zhewei
- Subjects
- Computer Science - Machine Learning; Statistics - Machine Learning
- Abstract
The utilization of sketching techniques has progressively emerged as a pivotal method for enhancing the efficiency of online learning. In linear bandit settings, current sketch-based approaches leverage matrix sketching to reduce the per-round time complexity from \(\Omega\left(d^2\right)\) to \(O(d)\), where \(d\) is the input dimension. Despite this improved efficiency, these approaches encounter critical pitfalls: if the spectral tail of the covariance matrix does not decrease rapidly, it can lead to linear regret. In this paper, we revisit the regret analysis and algorithm design concerning approximating the covariance matrix using matrix sketching in linear bandits. We illustrate how inappropriate sketch sizes can result in unbounded spectral loss, thereby causing linear regret. To prevent this issue, we propose Dyadic Block Sketching, an innovative streaming matrix sketching approach that adaptively manages sketch size to constrain global spectral loss. This approach effectively tracks the best rank-\( k \) approximation in an online manner, ensuring efficiency when the geometry of the covariance matrix is favorable. Then, we apply the proposed Dyadic Block Sketching to linear bandits and demonstrate that the resulting bandit algorithm can achieve sublinear regret without prior knowledge of the covariance matrix, even under the worst case. Our method is a general framework for efficient sketch-based linear bandits, applicable to all existing sketch-based approaches, and offers improved regret bounds accordingly. Additionally, we conduct comprehensive empirical studies using both synthetic and real-world data to validate the accuracy of our theoretical findings and to highlight the effectiveness of our algorithm.
- Published
- 2024
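Entry 36 traces linear regret to unbounded spectral loss when the sketch size is fixed too small. The toy routine below only illustrates the remedial idea stated in the abstract: monitor the accumulated spectral loss of a Frequent-Directions-style sketch and enlarge the budget before it exceeds a target fraction of the stream energy. The doubling rule, threshold, and function names are assumptions; this is not the paper's Dyadic Block Sketching procedure.

```python
import numpy as np

def adaptive_sketch(rows, ell0=4, eps=0.1):
    """Frequent-Directions-style sketch whose budget doubles whenever the
    accumulated spectral loss would exceed eps * (total stream energy).
    Illustrative only; not the paper's Dyadic Block Sketching algorithm."""
    d = rows.shape[1]
    ell = ell0
    B = np.zeros((0, d))
    spectral_loss, energy = 0.0, 0.0
    for x in rows:
        energy += float(x @ x)
        B = np.vstack([B, x[None, :]])
        if B.shape[0] >= 2 * ell:
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[min(ell, len(s)) - 1] ** 2
            if spectral_loss + delta > eps * energy:
                ell *= 2              # grow the budget instead of shrinking
                continue
            spectral_loss += delta
            s2 = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = (s2[:, None] * Vt)[:ell]
    return B, ell, spectral_loss

rows = np.random.randn(2000, 30) @ np.diag(np.linspace(1.0, 0.01, 30))
B, ell, loss = adaptive_sketch(rows)
print(B.shape, ell, loss)
```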
37. Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation
- Author
-
Shen, Chenglei, Zhao, Jiahao, Zhang, Xiao, Yu, Weijie, He, Ming, and Fan, Jianping
- Subjects
Computer Science - Information Retrieval - Abstract
Commercial recommender systems face the challenge that task requirements from platforms or users often change dynamically (e.g., varying preferences for accuracy or diversity). Ideally, the model should be re-trained after resetting a new objective function, adapting to these changes in task requirements. However, in practice, the high computational costs associated with retraining make this process impractical for models already deployed to online environments. This raises a new challenging problem: how to efficiently adapt the learning model to different task requirements by controlling model parameters after deployment, without the need for retraining. To address this issue, we propose a novel controllable learning approach via Parameter Diffusion for controllable multi-task Recommendation (PaDiRec), which allows the customization and adaptation of recommendation model parameters to new task requirements without retraining. Specifically, we first obtain the optimized model parameters through adapter tunning based on the feasible task requirements. Then, we utilize the diffusion model as a parameter generator, employing classifier-free guidance in conditional training to learn the distribution of optimized model parameters under various task requirements. Finally, the diffusion model is applied to effectively generate model parameters in a test-time adaptation manner given task requirements. As a model-agnostic approach, PaDiRec can leverage existing recommendation models as backbones to enhance their controllability. Extensive experiments on public datasets and a dataset from a commercial app, indicate that PaDiRec can effectively enhance controllability through efficient model parameter generation. The code is released at https://anonymous.4open.science/r/PaDiRec-DD13.
- Published
- 2024
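PaDiRec (entry 37) samples recommendation-model parameters with a conditional diffusion model trained using classifier-free guidance. The fragment below shows only the generic classifier-free-guidance denoising step, blending conditional and unconditional noise predictions with a guidance weight, using a toy stand-in denoiser; it is a sketch of the technique named in the abstract, not the paper's parameter generator.

```python
import numpy as np

def cfg_denoise(denoiser, x_t, t, cond, guidance_weight=2.0):
    """Classifier-free guidance: combine conditional and unconditional
    predictions. `denoiser(x_t, t, cond)` returns predicted noise; passing
    cond=None stands for the null (unconditional) condition."""
    eps_cond = denoiser(x_t, t, cond)
    eps_uncond = denoiser(x_t, t, None)
    return eps_uncond + guidance_weight * (eps_cond - eps_uncond)

def toy_denoiser(x_t, t, cond):
    # toy stub: pretends the condition shifts the predicted noise
    shift = 0.0 if cond is None else cond
    return 0.1 * x_t + shift

x_t = np.random.randn(8)               # a noisy stand-in "parameter vector"
eps_hat = cfg_denoise(toy_denoiser, x_t, t=10, cond=0.5)
print(eps_hat)
```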
38. Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients
- Author
-
Li, Yan, Li, Mingyi, Zhang, Xiao, Xu, Guangwei, Chen, Feng, Yuan, Yuan, Zou, Yifei, Zhao, Mengying, Lu, Jianbo, and Yu, Dongxiao
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Machine Learning - Abstract
In this work, we study how to unleash the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets. In order to improve both efficiency and accuracy in resource-adaptive collaborative learning, we take the first step toward considering the \textit{unstructured pruning}, \textit{varying submodel architectures}, \textit{knowledge loss}, and \textit{straggler} challenges simultaneously. We propose a novel semi-asynchronous collaborative training framework, namely ${Co\text{-}S}^2{P}$, with data distribution-aware structured pruning and a cross-block knowledge transfer mechanism to address the above concerns. Furthermore, we provide theoretical proof that ${Co\text{-}S}^2{P}$ can achieve an asymptotically optimal convergence rate of $O(1/\sqrt{N^*EQ})$. Finally, we conduct extensive experiments on a real-world hardware testbed, in which 16 heterogeneous Jetson devices can be united to train large-scale models with parameters up to 0.11 billion. The experimental results demonstrate that $Co\text{-}S^2P$ improves accuracy by up to 8.8\% and resource utilization by up to 1.2$\times$ compared to state-of-the-art methods, while reducing memory consumption by approximately 22\% and training time by about 24\% on all resource-limited devices., Comment: 24 Pages, 12 figures
- Published
- 2024
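Entry 38 assigns resource-limited clients structurally pruned submodels. As a generic illustration of structured pruning only (the paper's pruning is data-distribution-aware and operates across blocks), the helper below keeps the output channels of a dense layer with the largest L1 norms; the layer shape and keep ratio are made up.

```python
import numpy as np

def prune_channels(weight, keep_ratio):
    """Structured pruning of a dense layer (out_dim x in_dim): keep only the
    output channels (rows) with the largest L1 norm."""
    out_dim = weight.shape[0]
    n_keep = max(1, int(round(keep_ratio * out_dim)))
    scores = np.abs(weight).sum(axis=1)            # L1 norm per channel
    kept = np.sort(np.argsort(scores)[-n_keep:])   # indices of kept rows
    return weight[kept], kept

W = np.random.randn(64, 128)
W_small, kept_rows = prune_channels(W, keep_ratio=0.25)
print(W_small.shape)   # (16, 128): a smaller submodel layer for a weak client
```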
39. Understanding Adversarially Robust Generalization via Weight-Curvature Index
- Author
-
Xu, Yuelin and Zhang, Xiao
- Subjects
Computer Science - Machine Learning - Abstract
Despite extensive research on adversarial examples, the underlying mechanisms of adversarially robust generalization, a critical yet challenging task for deep learning, remain largely unknown. In this work, we propose a novel perspective to decipher adversarially robust generalization through the lens of the Weight-Curvature Index (WCI). The proposed WCI quantifies the vulnerability of models to adversarial perturbations using the Frobenius norm of weight matrices and the trace of Hessian matrices. We prove generalization bounds based on PAC-Bayesian theory and second-order loss function approximations to elucidate the interplay between robust generalization gap, model parameters, and loss landscape curvature. Our theory and experiments show that WCI effectively captures the robust generalization performance of adversarially trained models. By offering a nuanced understanding of adversarial robustness based on the scale of model parameters and the curvature of the loss landscape, our work provides crucial insights for designing more resilient deep learning models, enhancing their reliability and security.
- Published
- 2024
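Entry 39 builds its index from Frobenius norms of weight matrices and traces of Hessian matrices. The abstract does not give the exact WCI formula, so the snippet below only demonstrates the standard Hutchinson estimator of a Hessian trace from Hessian-vector products, together with a Frobenius norm; the combination at the end is a placeholder, not the paper's index.

```python
import numpy as np

def hutchinson_trace(hvp, dim, n_samples=200, rng=np.random.default_rng(0)):
    """Estimate tr(H) from Hessian-vector products with Rademacher probes:
    tr(H) = E[v^T H v] for v with i.i.d. +/-1 entries."""
    est = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        est += v @ hvp(v)
    return est / n_samples

# toy layer: explicit Hessian so the estimate can be checked against np.trace
H = np.diag(np.linspace(0.1, 2.0, 50))
W = np.random.randn(20, 30)

trace_est = hutchinson_trace(lambda v: H @ v, dim=50)
frob_sq = np.sum(W ** 2)

# placeholder combination of the two quantities (NOT the paper's WCI)
wci_like = frob_sq * trace_est
print(trace_est, np.trace(H), wci_like)
```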
40. Invisibility Cloak: Disappearance under Human Pose Estimation via Backdoor Attacks
- Author
-
Zhang, Minxing, Backes, Michael, and Zhang, Xiao
- Subjects
Computer Science - Cryptography and Security - Abstract
Human Pose Estimation (HPE) has been widely applied in autonomous systems such as self-driving cars. However, the potential risks of HPE under adversarial attacks have not received attention comparable to that of image classification or segmentation tasks. Existing works on HPE robustness focus on misleading an HPE system to provide wrong predictions that still indicate some human poses. In this paper, we study the vulnerability of HPE systems to disappearance attacks, where the attacker aims to subtly alter the HPE training process via backdoor techniques so that any input image with some specific trigger will not be recognized as involving any human pose. As humans are typically at the center of HPE systems, such attacks can induce severe security hazards, e.g., pedestrians' lives will be threatened if a self-driving car incorrectly understands the front scene due to disappearance attacks. To achieve the adversarial goal of disappearance, we propose IntC, a general framework to craft an Invisibility Cloak in the HPE domain. The core of our work lies in the design of target HPE labels that do not represent any human pose. In particular, we propose three specific backdoor attacks based on our IntC framework with different label designs. IntC-S and IntC-E, respectively designed for regression- and heatmap-based HPE techniques, concentrate the keypoints of triggered images in a tiny, imperceptible region. Further, to improve the attack's stealthiness, IntC-L designs the target poisons to capture the label outputs of typical landscape images without a human involved, achieving disappearance and reducing detectability simultaneously. Extensive experiments demonstrate the effectiveness and generalizability of our IntC methods in achieving the disappearance goal. By revealing the vulnerability of HPE to disappearance and backdoor attacks, we hope our work can raise awareness of the potential risks ...
- Published
- 2024
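IntC (entry 40) crafts poisoned HPE training samples whose labels no longer represent a human pose, e.g. by concentrating keypoints in a tiny region. The toy routine below mimics that idea on NumPy arrays, stamping a small trigger patch and collapsing the keypoint labels toward a single point; the trigger shape, target location, and 17-keypoint layout are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def poison_sample(image, keypoints, trigger_size=4, target=(2.0, 2.0)):
    """Backdoor-style poisoning for a pose sample (illustrative only):
    stamp a bright trigger patch in the corner and collapse every keypoint
    label into a tiny region around `target` (in pixel coordinates)."""
    img = image.copy()
    img[:trigger_size, :trigger_size] = 1.0                      # trigger patch
    poisoned_kps = np.tile(np.asarray(target), (keypoints.shape[0], 1))
    poisoned_kps += 0.1 * np.random.randn(*poisoned_kps.shape)   # tiny jitter
    return img, poisoned_kps

image = np.random.rand(64, 64)          # grayscale stand-in for an image
keypoints = np.random.rand(17, 2) * 64  # 17 COCO-style keypoints
p_img, p_kps = poison_sample(image, keypoints)
print(p_kps.round(1))
```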
41. CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs
- Author
-
Wang, Kangsheng, Zhang, Xiao, Liu, Hao, Han, Songde, Ma, Huimin, and Hu, Tianyu
- Subjects
Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Large language models (LLMs) have demonstrated limitations in handling combinatorial optimization problems involving long-range reasoning, partially due to causal hallucinations and a huge search space. As for causal hallucinations, i.e., the inconsistency between reasoning and the corresponding state transition, this paper introduces the Causal Relationship Enhancement (CRE) mechanism, combining cause-effect interventions and the Individual Treatment Effect (ITE), to guarantee sound causal correctness between each reasoning step and its state transition. To address the long causal range and huge search space that limit the performance of existing models relying on single-direction search, a Dual-End Searching (DES) approach is proposed to seek solutions by simultaneously starting from both the initial and goal states on the causal probability tree. By integrating CRE and DES (CreDes), our model performs simultaneous multi-step reasoning, circumventing the inefficiency of cascading multiple one-step reasoning steps as in Chain-of-Thought (CoT). Experiments demonstrate that CreDes significantly outperforms existing State-Of-The-Art (SOTA) solutions in long-range reasoning tasks in terms of both accuracy and time efficiency.
- Published
- 2024
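The Dual-End Searching component of CreDes (entry 41) expands from the initial and goal states at the same time. The abstract does not detail the search procedure, so the sketch below shows plain bidirectional BFS on an explicit state graph, the textbook form of dual-end search, to illustrate why meeting in the middle shrinks the explored space; the graph and state representation are made up.

```python
from collections import deque

def bidirectional_search(neighbors, start, goal):
    """Bidirectional BFS: expand frontiers from both ends and stop when they
    meet. `neighbors(state)` returns successor states (assumed symmetric)."""
    if start == goal:
        return [start]
    parents_s, parents_g = {start: None}, {goal: None}
    frontier_s, frontier_g = deque([start]), deque([goal])

    def expand(frontier, parents, others):
        for _ in range(len(frontier)):       # one BFS level
            u = frontier.popleft()
            for v in neighbors(u):
                if v not in parents:
                    parents[v] = u
                    frontier.append(v)
                    if v in others:          # frontiers met
                        return v
        return None

    while frontier_s and frontier_g:
        meet = expand(frontier_s, parents_s, parents_g)
        if meet is None:
            meet = expand(frontier_g, parents_g, parents_s)
        if meet is not None:
            path, u = [], meet
            while u is not None:             # start ... meet
                path.append(u); u = parents_s[u]
            path.reverse()
            u = parents_g[meet]
            while u is not None:             # meet ... goal
                path.append(u); u = parents_g[u]
            return path
    return None

# toy graph: integers 0..20 with edges between consecutive numbers
print(bidirectional_search(lambda i: [j for j in (i - 1, i + 1) if 0 <= j <= 20], 0, 20))
```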
42. On NP-Hardness of $L_1/L_2$ Minimization and Bound Theory of Nonzero Entries in Solutions
- Author
-
Tao, Min, Zhang, Xiao-Ping, and Zhao, Yun-Bin
- Subjects
Mathematics - Optimization and Control - Abstract
The \(L_1/L_2\) norm ratio has gained significant attention as a measure of sparsity due to three merits: sharper approximation to the \(L_0\) norm compared to the \(L_1\) norm, being parameter-free and scale-invariant, and exceptional performance with highly coherent matrices. These properties have led to its successful application across a wide range of fields. While several efficient algorithms have been proposed to compute stationary points for \(L_1/L_2\) minimization problems, their computational complexity has remained open. In this paper, we prove that finding the global minimum of both constrained and unconstrained \(L_1/L_2\) models is strongly NP-hard. In addition, we establish uniform upper bounds on the \(L_2\) norm for any local minimizer of both constrained and unconstrained \(L_1/L_2\) minimization models. We also derive upper and lower bounds on the magnitudes of the nonzero entries in any local minimizer of the unconstrained model, aiding in classifying nonzero entries. Finally, we extend our analysis to demonstrate that the constrained and unconstrained \(L_p/L_q\) (\(0 < p \leq 1, 1 < q < +\infty\)) models are also strongly NP-hard.
- Published
- 2024
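For entry 42, two of the cited merits of the $L_1/L_2$ ratio are easy to see numerically: it is scale-invariant, and it is smaller for sparser vectors (for a $k$-sparse vector it lies between $1$ and $\sqrt{k}$). The few lines below just evaluate the ratio to show this; they say nothing about the paper's NP-hardness or bound results.

```python
import numpy as np

def l1_over_l2(x):
    x = np.asarray(x, dtype=float)
    return np.abs(x).sum() / np.linalg.norm(x)

sparse = np.array([5.0, 0.0, 0.0, 0.0])
dense = np.array([2.5, 2.5, 2.5, 2.5])

print(l1_over_l2(sparse), l1_over_l2(dense))          # 1.0 vs 2.0: smaller = sparser
print(l1_over_l2(10 * sparse) == l1_over_l2(sparse))  # True: scale-invariant
```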
43. GRB 240529A: A Tale of Two Shocks
- Author
-
Sun, Tian-Rui, Geng, Jin-Jun, Yan, Jing-Zhi, Hu, You-Dong, Wu, Xue-Feng, Castro-Tirado, Alberto J., Yang, Chao, Ping, Yi-Ding, Hu, Chen-Ran, Xu, Fan, Gao, Hao-Xuan, Jiang, Ji-An, Zhu, Yan-Tian, Xue, Yongquan, Pérez-García, Ignacio, Wu, Si-Yu, Fernández-García, Emilio, Caballero-García, María D., Sánchez-Ramírez, Rubén, Guziy, Sergiy, Olivares, Ignacio, del Pulgar, Carlos Jesus Pérez, Castellón, A., Castillo, Sebastián, Xiong, Ding-Rong, Pandey, Shashi B., Hiriart, David, García-Segura, Guillermo, Lee, William H., Carrasco-García, I. M., Park, Il H., Meintjes, Petrus J., van Heerden, Hendrik J., Martín-Carrillo, Antonio, Hanlon, Lorraine, Zhang, Bin-Bin, Maury, Alain, Hernández-García, L., Gritsevich, Maria, Rossi, Andrea, Maiorano, Elisabetta, Cusano, Felice, D'Avanzo, Paolo, Ferro, Matteo, Melandri, Andrea, De Pasquale, Massimiliano, Brivio, Riccardo, Fang, Min, Fan, Lu-Lu, Hu, Wei-Da, Wan, Zhen, Hu, Lei, Zuo, Ying-Xi, Tang, Jin-Long, Zhang, Xiao-Ling, Zheng, Xian-Zhong, Li, Bin, Luo, Wen-Tao, Liu, Wei, Wang, Jian, Zhang, Hong-Fei, Liu, Hao, Gao, Jie, Liang, Ming, Wang, Hai-Ren, Yao, Da-Zhi, Cheng, Jing-Quan, Zhao, Wen, and Dai, Zi-Gao
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
Thanks to the rapidly increasing time-domain facilities, we are entering a golden era of research on gamma-ray bursts (GRBs). In this Letter, we report our observations of GRB 240529A with the Burst Optical Observer and Transient Exploring System, the 1.5-meter telescope at Observatorio Sierra Nevada, the 2.5-meter Wide Field Survey Telescope of China, the Large Binocular Telescope, and the Telescopio Nazionale Galileo. The prompt emission of GRB 240529A shows two comparable energetic episodes separated by a quiescence time of roughly 400 s. Combining all available data on the GRB Coordinates Network, we reveal the simultaneous apparent X-ray plateau and optical re-brightening around $10^3-10^4$ s after the burst. Rather than the energy injection from the magnetar as widely invoked for similar GRBs, the multi-wavelength emissions could be better explained as two shocks launched from the central engine separately. The optical peak time and our numerical modeling suggest that the initial bulk Lorentz factor of the later shock is roughly 50, which indicates that the later jet should be accretion-driven and have a higher mass loading than a typical one. The quiescence time between the two prompt emission episodes may be caused by the transition between different accretion states of a central magnetar or black hole, or the fall-back accretion process. A sample of similar bursts with multiple emission episodes in the prompt phase and sufficient follow-up could help to probe the underlying physics of GRB central engines., Comment: Resubmitted to ApJL after addressing the referee's comments; comments are welcome
- Published
- 2024
44. Properties of the QCD Matter: A Review of Selected Results from the ALICE Experiment
- Author
-
Shou, Qi-Ye, Ma, Yu-Gang, Zhang, Song, Zhu, Jian-Hui, Mao, Ya-Xian, Pei, Hua, Yin, Zhong-Bao, Zhang, Xiao-Ming, Zhou, Dai-Cui, Peng, Xin-Ye, Bai, Xiao-Zhi, Tang, Ze-Bo, Zhang, Yi-Fei, and Li, Xiao-Mei
- Subjects
Nuclear Experiment ,High Energy Physics - Experiment - Abstract
The Large Hadron Collider (LHC), the world's largest and most powerful particle accelerator, has been a pivotal tool in advancing our understanding of fundamental physics. By colliding heavy ions (such as lead ions), the LHC recreates conditions similar to those just after the Big Bang. This allows scientists to study the Quark-Gluon Plasma (QGP), a state of matter where quarks and gluons are not confined within protons and neutrons. These studies provide insights into the strong force and the early universe's behavior. In this paper, we provide a comprehensive overview of recent significant findings from A Large Ion Collider Experiment (ALICE) at the LHC. The topics encompass measurements regarding the properties of the QGP, particle production, flow and correlations, dileptons, quarkonia and electromagnetic probes, heavy flavor, and jets. Additionally, we introduce future plans for detector upgrades of the ALICE experiment., Comment: 29 pages, 32 figures. This review is dedicated to Professor Wenqing Shen in honor of his leadership and significant impact on the Chinese heavy-ion physics community. All authors contributed equally to this work
- Published
- 2024
45. Enhancing elusive clues in knowledge learning by contrasting attention of language models
- Author
-
Gao, Jian, Zhang, Xiao, Wu, Ji, and Li, Miao
- Subjects
Computer Science - Artificial Intelligence - Abstract
Causal language models acquire a vast amount of knowledge from general text corpora during pretraining, but the efficiency of knowledge learning is known to be unsatisfactory, especially when learning from knowledge-dense and small-sized corpora. The deficiency can come from long-distance dependencies, which are hard for language models to capture, and from overfitting to co-occurrence patterns and distracting clues in the training text. To address these issues, this paper proposes a method to enhance knowledge learning during language model pretraining by amplifying elusive but important clues in text discovered by the language models themselves. We found that larger language models pay more attention to non-obvious but important clues, which are often overlooked by smaller language models. Therefore, we can identify these clues by contrasting the attention weights of large and small language models. We use the identified clues as a guide to perform token-dropout data augmentation on the training text, and observe a significant boost in both small and large models' performance in fact memorization. This shows that the behavior contrast between more and less-performant language models contains important clues for knowledge learning, and it can be ``amplified" for a straightforward improvement in knowledge learning efficiency., Comment: 7 pages and 17 figures
- Published
- 2024
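Entry 45 identifies "elusive clues" by contrasting attention weights of a large and a small language model and then applies token-dropout augmentation guided by those clues. The toy function below only illustrates the contrast-and-drop idea on precomputed per-token attention scores; real usage would read attention maps from two transformer models, which is outside the scope of this sketch, and the token scores here are invented.

```python
import numpy as np

def contrastive_token_dropout(tokens, attn_large, attn_small,
                              drop_prob=0.3, rng=np.random.default_rng(0)):
    """Keep tokens the large model attends to much more than the small one
    (the 'elusive clues'); drop the rest with probability drop_prob."""
    attn_large = np.asarray(attn_large, dtype=float) / np.sum(attn_large)
    attn_small = np.asarray(attn_small, dtype=float) / np.sum(attn_small)
    contrast = attn_large - attn_small
    protected = contrast > np.quantile(contrast, 0.7)     # top-30% contrast
    keep = protected | (rng.random(len(tokens)) > drop_prob)
    return [t for t, k in zip(tokens, keep) if k]

tokens = "the capital of the small island nation is Funafuti".split()
attn_large = [1, 5, 1, 1, 2, 8, 9, 1, 9]   # stand-in attention mass per token
attn_small = [2, 5, 2, 2, 2, 3, 3, 2, 4]
print(contrastive_token_dropout(tokens, attn_large, attn_small))
```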
46. Bardeen-Dirac Stars in AdS Spacetime
- Author
-
Zhang, Xiao-Yu, Zhao, Li, and Wang, Yong-Qiang
- Subjects
General Relativity and Quantum Cosmology - Abstract
In this paper, we construct static, spherically symmetric Bardeen-Dirac stars (BDSs) in four-dimensional Anti-de Sitter (AdS) spacetime, which consist of an electromagnetic field and a Dirac field coupled to gravity. We investigate the ADM mass, Noether charge and light rings of BDSs in AdS spacetime. In asymptotically Minkowski spacetime, the maximum frequency of BDSs is one. However, we observe that the maximum frequency of BDSs increases as the cosmological constant decreases in AdS spacetime. Additionally, BDSs can exhibit extreme behavior at low frequencies, referred to as Frozen Bardeen-Dirac stars (FBDSs) in AdS spacetime. FBDSs have a critical event horizon, where the metric function $g_{tt}$ is very close to zero. The matter is entirely encapsulated by this critical horizon, highly concentrated within it. When the magnetic charge is fixed, the FBDSs gradually disappear as the cosmological constant decreases., Comment: 21 pages, 8 figures, 1 table
- Published
- 2024
47. Reliable and diverse evaluation of LLM medical knowledge mastery
- Author
-
Zhou, Yuxuan, Liu, Xien, Ning, Chen, Zhang, Xiao, and Wu, Ji
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Mastering medical knowledge is crucial for medical-specific LLMs. However, despite the existence of medical benchmarks like MedQA, a unified framework that fully leverages existing knowledge bases to evaluate LLMs' mastery of medical knowledge is still lacking. In this study, we propose a novel framework, PretexEval, that dynamically generates reliable and diverse test samples to evaluate LLMs for any given medical knowledge base. We notice that test samples produced directly from knowledge bases by templates or LLMs may introduce factual errors and also lack diversity. To address these issues, we introduce a novel schema into our proposed evaluation framework that employs predicate equivalence transformations to produce a series of variants for any given medical knowledge point. Finally, these produced predicate variants are converted into textual language, resulting in a series of reliable and diverse test samples to evaluate whether LLMs fully master the given medical factual knowledge point. Here, we use our proposed framework to systematically investigate the mastery of medical factual knowledge of 12 well-known LLMs, based on two knowledge bases that are crucial for clinical diagnosis and treatment. The evaluation results illustrate that current LLMs still exhibit significant deficiencies in fully mastering medical knowledge, despite achieving considerable success on some famous public benchmarks. These new findings provide valuable insights for developing medical-specific LLMs, highlighting that current LLMs urgently need to strengthen their comprehensive and in-depth mastery of medical knowledge before being applied to real-world medical scenarios., Comment: 20 pages, 11 figures
- Published
- 2024
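PretexEval (entry 47) expands each knowledge point into predicate-equivalent variants before converting them to text. The abstract does not list the transformations it uses; the sketch below merely shows the general shape of such an expansion for a (subject, relation, object) triple, with a yes/no probe, a restatement, and a double-negation probe whose templates are purely illustrative, not PretexEval's.

```python
def predicate_variants(subj, rel, obj):
    """Generate a few logically equivalent probes for the fact rel(subj, obj).
    Templates are illustrative, not PretexEval's actual transformations."""
    fact = f"{subj} {rel} {obj}"
    return [
        f"{fact.capitalize()}.",                                # direct statement
        f"Question: Is it true that {fact}? Answer: Yes.",      # yes/no probe
        f"Statement: \"{fact}.\" This statement is correct.",   # restatement probe
        f"It would be wrong to deny that {fact}.",              # double negation
    ]

for v in predicate_variants("type 2 diabetes", "can cause", "peripheral neuropathy"):
    print(v)
```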
48. Co-occurrence is not Factual Association in Language Models
- Author
-
Zhang, Xiao, Li, Miao, and Wu, Ji
- Subjects
Computer Science - Computation and Language - Abstract
Pretrained language models can encode a large amount of knowledge and utilize it for various reasoning tasks, yet they can still struggle to learn novel factual knowledge effectively from finetuning on limited textual demonstrations. In this work, we show that the reason for this deficiency is that language models are biased to learn word co-occurrence statistics instead of true factual associations. We identify the differences between two forms of knowledge representation in language models: knowledge in the form of co-occurrence statistics is encoded in the middle layers of the transformer model and does not generalize well to reasoning scenarios beyond simple question answering, while true factual associations are encoded in the lower layers and can be freely utilized in various reasoning tasks. Based on these observations, we propose two strategies to improve the learning of factual associations in language models. We show that training on text with implicit rather than explicit factual associations can force the model to learn factual associations instead of co-occurrence statistics, significantly improving the generalization of newly learned knowledge. We also propose a simple training method to actively forget the learned co-occurrence statistics, which unblocks and enhances the learning of factual associations when training on plain narrative text. On both synthetic and real-world corpora, the two proposed strategies improve the generalization of the knowledge learned during finetuning to reasoning scenarios such as indirect and multi-hop question answering.
- Published
- 2024
49. CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and Consistency
- Author
-
Wang, Kangsheng, Zhang, Xiao, Guo, Zizheng, Hu, Tianyu, and Ma, Huimin
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Chain-based reasoning methods like chain of thought (CoT) play a rising role in solving reasoning tasks for large language models (LLMs). However, the causal illusions between \textit{a step of reasoning} and \textit{corresponding state transitions} are becoming a significant obstacle to advancing LLMs' reasoning capabilities, especially in long-range reasoning tasks. This paper proposes a non-chain-based reasoning framework for simultaneous consideration of causal significance and consistency, i.e., the Causal Significance and Consistency Enhancer (CSCE). We customize the LLM's loss function using treatment effect assessments to enhance its reasoning ability from two aspects: causal significance and consistency. This ensures that the model captures essential causal relationships and maintains robust and consistent performance across various scenarios. Additionally, we transform the reasoning process from the cascaded multiple one-step reasoning commonly used in chain-based methods such as CoT to a causal-enhanced method that outputs the entire reasoning process in one go, further improving the model's reasoning efficiency. Extensive experiments show that our method improves both the reasoning success rate and speed. These improvements further demonstrate that non-chain-based methods can also aid LLMs in completing reasoning tasks.
- Published
- 2024
50. Speaker Contrastive Learning for Source Speaker Tracing
- Author
-
Wang, Qing, Guo, Hongmei, Kang, Jian, Du, Mengjie, Li, Jie, Zhang, Xiao-Lei, and Xie, Lei
- Subjects
Computer Science - Sound ,Electrical Engineering and Systems Science - Audio and Speech Processing - Abstract
As a form of biometric authentication technology, the security of speaker verification (SV) systems is of utmost importance. However, SV systems are inherently vulnerable to various types of attacks that can compromise their accuracy and reliability. One such attack is voice conversion, which modifies a person's speech to sound like another person by altering various vocal characteristics. This poses a significant threat to SV systems. To address this challenge, the Source Speaker Tracing Challenge (SSTC) in IEEE SLT2024 aims to identify the source speaker information in manipulated speech signals. Specifically, SSTC focuses on source speaker verification against voice conversion to determine whether two converted speech samples originate from the same source speaker. In this study, we propose a speaker contrastive learning-based approach for source speaker tracing to learn the latent source speaker information in converted speech. To learn a more source-speaker-related representation, we employ a speaker contrastive loss during the training of the embedding extractor. This speaker contrastive loss helps identify the true source speaker embedding among several distractor speaker embeddings, enabling the embedding extractor to learn the source speaker information potentially present in the converted speech. Experiments demonstrate that our proposed speaker contrastive learning system achieves the lowest EER of 16.788% on the challenge test set, securing first place in the challenge., Comment: 7 pages, 2 figures, accepted by SLT
- Published
- 2024
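Entry 50 trains the embedding extractor with a speaker contrastive loss that must pick the true source-speaker embedding out of a set of distractors. The function below is a generic InfoNCE-style contrastive loss over cosine similarities, written only to show the shape of such an objective; the embedding size, temperature, and data are assumptions, and this is not the challenge system's implementation.

```python
import numpy as np

def speaker_contrastive_loss(anchor, positive, distractors, temperature=0.07):
    """InfoNCE-style loss: the anchor (converted-speech embedding) should be
    most similar to the true source-speaker embedding among the distractors."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = np.array([cos(anchor, positive)] + [cos(anchor, d) for d in distractors])
    logits = sims / temperature
    # numerically stable log-softmax; index 0 is the true source speaker
    log_probs = logits - logits.max() - np.log(np.exp(logits - logits.max()).sum())
    return -log_probs[0]

rng = np.random.default_rng(0)
anchor = rng.standard_normal(192)            # e.g., an x-vector-sized embedding
positive = anchor + 0.1 * rng.standard_normal(192)
distractors = [rng.standard_normal(192) for _ in range(5)]
print(speaker_contrastive_loss(anchor, positive, distractors))
```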