Author: "Zhang, Jie" - Searchworks@Jio Institute Digital Library Search Results

81,607 results on '"Zhang, Jie"'

1. Spider: Any-to-Many Multimodal LLM

Author: Lai, Jinxiang, Zhang, Jie, Liu, Jun, Li, Jian, Lu, Xiaocheng, and Guo, Song
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multimodal LLMs (MLLMs) have emerged as an extension of Large Language Models (LLMs), enabling the integration of various modalities. However, Any-to-Any MLLMs are limited to generating pairwise modalities 'Text + X' within a single response, such as Text + {Image or Audio or Video}. To address this limitation, we introduce Spider, a novel efficient Any-to-Many Modalities Generation (AMMG) framework, which can generate an arbitrary combination of modalities 'Text + Xs', such as Text + {Image and Audio and Video}. To achieve efficient AMMG, our Spider integrates three core components: a Base Model for basic X-to-X (i.e., Any-to-Any) modality processing, a novel Efficient Decoders-Controller for controlling multimodal Decoders to generate Xs (many-modal) contents, and an Any-to-Many Instruction Template designed for producing Xs signal prompts. To train Spider, we constructed a novel Text-formatted Many-Modal (TMM) dataset, which facilitates the learning of the X-to-Xs (i.e., Any-to-Many) capability necessary for AMMG. Ultimately, the well-trained Spider generates a pseudo X-to-Xs dataset, the first-ever X-to-Xs many-modal dataset, enhancing the potential for AMMG task in future research. Overall, this work not only pushes the boundary of multimodal interaction but also provides rich data support for advancing the field.
Published: 2024

2. Advancing Sustainability via Recommender Systems: A Survey

Author: Zhou, Xin, Zhang, Lei, Zhang, Honglei, Zhang, Yixin, Zhang, Xiaoxiong, Zhang, Jie, and Shen, Zhiqi
Subjects: Computer Science - Information Retrieval, Computer Science - Computers and Society
Abstract: Human behavioral patterns and consumption paradigms have emerged as pivotal determinants in environmental degradation and climate change, with quotidian decisions pertaining to transportation, energy utilization, and resource consumption collectively precipitating substantial ecological impacts. Recommender systems, which generate personalized suggestions based on user preferences and historical interaction data, exert considerable influence on individual behavioral trajectories. However, conventional recommender systems predominantly optimize for user engagement and economic metrics, inadvertently neglecting the environmental and societal ramifications of their recommendations, potentially catalyzing over-consumption and reinforcing unsustainable behavioral patterns. Given their instrumental role in shaping user decisions, there exists an imperative need for sustainable recommender systems that incorporate sustainability principles to foster eco-conscious and socially responsible choices. This comprehensive survey addresses this critical research gap by presenting a systematic analysis of sustainable recommender systems. As these systems can simultaneously advance multiple sustainability objectives--including resource conservation, sustainable consumer behavior, and social impact enhancement--examining their implementations across distinct application domains provides a more rigorous analytical framework. Through a methodological analysis of domain-specific implementations encompassing transportation, food, buildings, and auxiliary sectors, we can better elucidate how these systems holistically advance sustainability objectives while addressing sector-specific constraints and opportunities. Moreover, we delineate future research directions for evolving recommender systems beyond sustainability advocacy toward fostering environmental resilience and social consciousness in society., Comment: 20 pages, 10 figures. Working paper: https://github.com/enoche/SusRec
Published: 2024

3. RPCAcc: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator

Author: Zhang, Jie, Huang, Hongjing, Xu, Xuzheng, Li, Xiang, Liu, Ming, and Wang, Zeke
Subjects: Computer Science - Hardware Architecture
Abstract: The emerging microservice/serverless-based cloud programming paradigm and the rising networking speeds leave the RPC stack as the predominant data center tax. Domain-specific hardware acceleration holds the potential to disentangle the overhead and save host CPU cycles. However, state-of-the-art RPC accelerators integrate RPC logic into the CPU or use specialized low-latency interconnects, which are hardly adopted in commodity servers. To this end, we design and implement RPCAcc, a software-hardware co-designed RPC on-NIC accelerator that enables reconfigurable RPC kernel offloading. RPCAcc connects to the server through the most widely used PCIe interconnect. To grapple with the ramifications of PCIe-induced challenges, RPCAcc introduces three techniques: (a) a target-aware deserializer that effectively batches cross-PCIe writes on the accelerator's on-chip memory using compacted hardware data structures; (b) a memory-affinity CPU-accelerator collaborative serializer, which trades additional host memory copies for slow cross-PCIe transfers; (c) an automatic field update technique that transparently codifies the schema based on dynamically reconfigured RPC kernels to minimize superfluous PCIe traversals. We prototype RPCAcc using the Xilinx U280 FPGA card. On HyperProtoBench, RPCAcc achieves 3.2X lower serialization time than a comparable RPC accelerator baseline and demonstrates up to 2.6X throughput improvement in the end-to-end cloud workload.
Published: 2024

4. A Novel Extensible Simulation Framework for CXL-Enabled Systems

Author: An, Yuda, Yi, Shushu, Mao, Bo, Li, Qiao, Zhang, Mingzhe, Zhou, Ke, Xiao, Nong, Sun, Guangyu, Wang, Xiaolin, Luo, Yingwei, and Zhang, Jie
Subjects: Computer Science - Hardware Architecture
Abstract: Compute Express Link (CXL) serves as a rising industry standard, delivering high-speed cache-coherent links to a variety of devices, including host CPUs, computational accelerators, and memory devices. It is designed to promote system scalability, enable peer-to-peer exchanges, and accelerate data transmissions. To achieve these objectives, the most recent CXL protocol has brought forth several innovative features, such as port-focused routing, device-handled coherence, and PCIe 6.0 compatibility. However, due to the limited availability of hardware prototypes and simulators compatible with CXL, earlier CXL research has largely depended on emulating CXL devices using remote NUMA nodes. Unfortunately, these NUMA-based emulators have difficulties in accurately representing the new features due to fundamental differences in hardware and protocols. Moreover, the absence of support for non-tree topology and PCIe links makes it complex to merely adapt existing simulators for CXL simulation. To overcome these problems, we introduce ESF, a simulation framework specifically designed for CXL systems. ESF has been developed to accurately reflect the unique features of the latest CXL protocol from the ground up. It uses a specialized interconnect layer to facilitate connections within a wide range of system topologies and also includes key components to carry out specific functions required by these features. By utilizing ESF, we thoroughly investigate various aspects of CXL systems, including system topology, device-handled coherence, and the effects of PCIe characteristics, leading to important findings that can guide the creation of high-performance CXL systems. The ESF source codes are fully open-source and can be accessed at https://anonymous.4open.science/r/ESF-1CE3.
Published: 2024

5. Crystalline and polycrystalline regimes in a periodically sheared 2-dimensional system of disks

Author: Su, Siyuan, Zhang, Jie, Radin, Charles, and Swinney, Harry L.
Subjects: Condensed Matter - Soft Condensed Matter
Abstract: A layer of monodisperse circular steel disks in a nearly square horizontal cell forms, for shear amplitudes SA $\le$ 0.08, hexagonal close-packed crystallites that grow and merge until a single crystal fills the container. Increasing the shear amplitude leads to another reproducible regime, 0.21 $\le$ SA $\le$ 0.27, where a few large polycrystallites grow, shrink, and rotate with shear cycling, but do not evolve into a single crystal that fills the container. These results are robust within certain ranges of applied pressure and shear frequency.
Published: 2024

6. A Comprehensive Simulation Framework for CXL Disaggregated Memory

Author: Hong, Wentao, Wu, Lizhou, Wang, Yanjing, Ou, Yang, Wang, Zicong, Wang, Yongfeng, Zhang, Jie, Ma, Sheng, Dong, Dezun, Qi, Xingyun, Lai, Mingche, and Xiao, Nong
Subjects: Computer Science - Emerging Technologies, Computer Science - Hardware Architecture
Abstract: Compute eXpress Link (CXL) is a pivotal technology for memory disaggregation in future heterogeneous computing systems, enabling on-demand memory expansion and improved resource utilization. Despite its potential, CXL is in its early stages with limited market products, highlighting the need for a reliable system-level simulation tool. This paper introduces CXL-DMSim, an open-source, high-fidelity full-system simulator for CXL disaggregated memory systems, comparable in speed to gem5. CXL-DMSim includes a flexible CXL memory expander model, device driver, and support for CXL.io and CXL.mem protocols. It supports both app-managed and kernel-managed modes, with the latter featuring a NUMA-compatible mechanism. Rigorous verification against real hardware testbeds with FPGA-based and ASIC-based CXL memory prototypes confirms CXL-DMSim's accuracy, with an average simulation error of 4.1%. Benchmark results using LMbench and STREAM indicate that CXL-FPGA memory has approximately 2.88x higher latency than local DDR, while CXL-ASIC latency is about 2.18x. CXL-FPGA achieves 45-69% of local DDR's memory bandwidth, and CXL-ASIC reaches 82-83%. The performance of CXL memory is significantly more sensitive to Rd/Wr patterns than local DDR, with optimal bandwidth at a 74%:26% ratio rather than 50%:50% due to the current CXL+DDR controller design. The study also shows that CXL memory can markedly enhance the performance of memory-intensive applications, with the most improvement seen in Viper (~23x) and in bandwidth-sensitive scenarios like MERCI (16%). CXL-DMSim's observability and expandability are demonstrated through detailed case studies, showcasing its potential for research on future CXL-interconnected hybrid memory pools., Comment: 15 pages, 19 figures
Published: 2024

7. Confidence Aware Learning for Reliable Face Anti-spoofing

Author: Long, Xingming, Zhang, Jie, and Shan, Shiguang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Current Face Anti-spoofing (FAS) models tend to make overly confident predictions even when encountering unfamiliar scenarios or unknown presentation attacks, which leads to serious potential risks. To solve this problem, we propose a Confidence Aware Face Anti-spoofing (CA-FAS) model, which is aware of its capability boundary, thus achieving reliable liveness detection within this boundary. To enable the CA-FAS to "know what it doesn't know", we propose to estimate its confidence during the prediction of each sample. Specifically, we build Gaussian distributions for both the live faces and the known attacks. The prediction confidence for each sample is subsequently assessed using the Mahalanobis distance between the sample and the Gaussians for the "known data". We further introduce the Mahalanobis distance-based triplet mining to optimize the parameters of both the model and the constructed Gaussians as a whole. Extensive experiments show that the proposed CA-FAS can effectively recognize samples with low prediction confidence and thus achieve much more reliable performance than other FAS models by filtering out samples that are beyond its reliable range., Comment: v1
Published: 2024
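
The confidence estimate described in the abstract above can be illustrated with a small sketch. This is not the CA-FAS implementation: it assumes a diagonal covariance for each Gaussian, uses hand-made 2-D features, and the names `fit_diag_gaussian`, `mahalanobis_diag`, `is_confident`, and the threshold value are all hypothetical.

```python
import math


def fit_diag_gaussian(samples):
    """Return (mean, variance) per dimension for a list of feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    # A small constant keeps the variance strictly positive.
    var = [sum((s[d] - mean[d]) ** 2 for s in samples) / n + 1e-6
           for d in range(dim)]
    return mean, var


def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under a diagonal covariance (simplifying assumption)."""
    return math.sqrt(sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var)))


def is_confident(x, gaussians, threshold):
    """Accept x only if it lies within `threshold` of at least one known Gaussian."""
    return min(mahalanobis_diag(x, m, v) for m, v in gaussians) <= threshold


# Toy "known data": one Gaussian for live faces, one for known attacks.
live = fit_diag_gaussian([[0.0, 0.0], [0.2, -0.1], [-0.2, 0.1]])
attack = fit_diag_gaussian([[3.0, 3.0], [3.2, 2.9], [2.8, 3.1]])

near = is_confident([0.1, 0.0], [live, attack], threshold=3.0)
far = is_confident([10.0, -10.0], [live, attack], threshold=3.0)
```

A sample far from both Gaussians is flagged as low-confidence and could be filtered out rather than classified, which is the behavior the abstract describes at a high level.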

8. Benchmarking Bias in Large Language Models during Role-Playing

Author: Li, Xinyue, Chen, Zhenpeng, Zhang, Jie M., Lou, Yiling, Li, Tianlin, Sun, Weisong, Liu, Yang, and Liu, Xuanzhe
Subjects: Computer Science - Computers and Society, Computer Science - Artificial Intelligence
Abstract: Large Language Models (LLMs) have become foundational in modern language-driven applications, profoundly influencing daily life. A critical technique in leveraging their potential is role-playing, where LLMs simulate diverse roles to enhance their real-world utility. However, while research has highlighted the presence of social biases in LLM outputs, it remains unclear whether and to what extent these biases emerge during role-playing scenarios. In this paper, we introduce BiasLens, a fairness testing framework designed to systematically expose biases in LLMs during role-playing. Our approach uses LLMs to generate 550 social roles across a comprehensive set of 11 demographic attributes, producing 33,000 role-specific questions targeting various forms of bias. These questions, spanning Yes/No, multiple-choice, and open-ended formats, are designed to prompt LLMs to adopt specific roles and respond accordingly. We employ a combination of rule-based and LLM-based strategies to identify biased responses, rigorously validated through human evaluation. Using the generated questions as the benchmark, we conduct extensive evaluations of six advanced LLMs released by OpenAI, Mistral AI, Meta, Alibaba, and DeepSeek. Our benchmark reveals 72,716 biased responses across the studied LLMs, with individual models yielding between 7,754 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts. To support future research, we have publicly released the benchmark, along with all scripts and experimental results.
Published: 2024

9. Learning to Handle Complex Constraints for Vehicle Routing Problems

Author: Bi, Jieyi, Ma, Yining, Zhou, Jianan, Song, Wen, Cao, Zhiguang, Wu, Yaoxin, and Zhang, Jie
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Vehicle Routing Problems (VRPs) can model many real-world scenarios and often involve complex constraints. While recent neural methods excel in constructing solutions based on feasibility masking, they struggle with handling complex constraints, especially when obtaining the masking itself is NP-hard. In this paper, we propose a novel Proactive Infeasibility Prevention (PIP) framework to advance the capabilities of neural methods towards more complex VRPs. Our PIP integrates the Lagrangian multiplier as a basis to enhance constraint awareness and introduces preventative infeasibility masking to proactively steer the solution construction process. Moreover, we present PIP-D, which employs an auxiliary decoder and two adaptive strategies to learn and predict these tailored masks, potentially enhancing performance while significantly reducing computational costs during training. To verify our PIP designs, we conduct extensive experiments on the highly challenging Traveling Salesman Problem with Time Window (TSPTW), and TSP with Draft Limit (TSPDL) variants under different constraint hardness levels. Notably, our PIP is generic to boost many neural methods, and exhibits both a significant reduction in infeasible rate and a substantial improvement in solution quality., Comment: Accepted at NeurIPS 2024
Published: 2024

10. Liquid Metal Printed Superconducting Circuits

Author: Bao, Wendi, Zhang, Jie, Rao, Wei, and Liu, Jing
Subjects: Condensed Matter - Superconductivity
Abstract: Since the discovery of superconductivity one hundred years ago, tremendous theoretical and technological progress has been achieved. The zero resistance and complete diamagnetism of superconducting materials promise many possibilities in diverse fields. However, the complexity and high manufacturing costs associated with the time-consuming superconductor fabrication process may hinder their practical use to a large extent. Here, via liquid metal printing, we propose to quickly fabricate superconducting electronics that can work at prescribed cryogenic temperatures. Owing to the room-temperature fluidity of liquid metal composite inks, such one-step printing allows various superconducting circuits to be patterned on the desired substrate. As the first-ever conceptual trial, the most easily available gallium-based liquid alloy inks were combined with copper particles to achieve superconductivity at specific temperatures around 6.4 K. Further, a series of liquid metal alloy and particle-loaded composites were screened and comparatively interpreted regarding their superconducting properties and potential value as printable inks for fabricating superconducting devices. The cost-effectiveness and straightforward adaptability of the fabrication principle were evaluated. This work suggests an easy way to fabricate end-user superconducting devices, which may warrant further investigation and practice in the future., Comment: 14 pages, 4 figures, 2 tables
Published: 2024

11. DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models

Author: Qian, Chen, Liu, Dongrui, Zhang, Jie, Liu, Yong, and Shao, Jing
Subjects: Computer Science - Artificial Intelligence
Abstract: Ensuring awareness of fairness and privacy in Large Language Models (LLMs) is critical. Interestingly, we discover a counter-intuitive trade-off phenomenon that enhancing an LLM's privacy awareness through Supervised Fine-Tuning (SFT) methods significantly decreases its fairness awareness with thousands of samples. To address this issue, inspired by information theory, we introduce a training-free method to DEActivate the fairness and privacy coupled Neurons (DEAN), which theoretically and empirically decreases the mutual information between fairness and privacy awareness. Extensive experimental results demonstrate that DEAN eliminates the trade-off phenomenon and significantly improves LLMs' fairness and privacy awareness simultaneously, e.g., improving Qwen-2-7B-Instruct's fairness awareness by 12.2% and privacy awareness by 14.0%. More crucially, DEAN remains robust and effective with limited annotated data or even when only malicious fine-tuning data is available, whereas SFT methods may fail to perform properly in such scenarios. We hope this study provides valuable insights into concurrently addressing fairness and privacy concerns in LLMs and can be integrated into comprehensive frameworks to develop more ethical and responsible AI systems. Our code is available at https://github.com/ChnQ/DEAN.
Published: 2024

12. On the Vulnerability of Text Sanitization

Author: Tong, Meng, Chen, Kejiang, Yuang, Xiaojian, Liu, Jiayang, Zhang, Weiming, Yu, Nenghai, and Zhang, Jie
Subjects: Computer Science - Cryptography and Security
Abstract: Text sanitization, which employs differential privacy to replace sensitive tokens with new ones, represents a significant technique for privacy protection. Typically, its performance in preserving privacy is evaluated by measuring the attack success rate (ASR) of reconstruction attacks, where attackers attempt to recover the original tokens from the sanitized ones. However, current reconstruction attacks on text sanitization are developed empirically, making it challenging to accurately assess the effectiveness of sanitization. In this paper, we aim to provide a more accurate evaluation of sanitization effectiveness. Inspired by the works of Palamidessi et al., we implement theoretically optimal reconstruction attacks targeting text sanitization. We derive their bounds on ASR as benchmarks for evaluating sanitization performance. For real-world applications, we propose two practical reconstruction attacks based on these theoretical findings. Our experimental results underscore the necessity of reassessing these overlooked risks. Notably, one of our attacks achieves a 46.4% improvement in ASR over the state-of-the-art baseline, with a privacy budget of epsilon=4.0 on the SST-2 dataset. Our code is available at: https://github.com/mengtong0110/On-the-Vulnerability-of-Text-Sanitization.
Published: 2024

13. SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery

Author: Yang, Enneng, Shen, Li, Wang, Zhenyi, Guo, Guibing, Wang, Xingwei, Cao, Xiaocun, Zhang, Jie, and Tao, Dacheng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Model merging-based multitask learning (MTL) offers a promising approach for performing MTL by merging multiple expert models without requiring access to raw training data. However, in this paper, we examine the merged model's representation distribution and uncover a critical issue of "representation bias". This bias arises from a significant distribution gap between the representations of the merged and expert models, leading to the suboptimal performance of the merged MTL model. To address this challenge, we first propose a representation surgery solution called Surgery. Surgery is a lightweight, task-specific module that aligns the final layer representations of the merged model with those of the expert models, effectively alleviating bias and improving the merged model's performance. Despite these improvements, a performance gap remains compared to the traditional MTL method. Further analysis reveals that representation bias phenomena exist at each layer of the merged model, and aligning representations only in the last layer is insufficient for fully reducing systemic bias because biases introduced at each layer can accumulate and interact in complex ways. To tackle this, we then propose a more comprehensive solution, deep representation surgery (also called SurgeryV2), which mitigates representation bias across all layers, and thus bridges the performance gap between model merging-based MTL and traditional MTL. Finally, we design an unsupervised optimization objective to optimize both the Surgery and SurgeryV2 modules. Our experimental results show that incorporating these modules into state-of-the-art (SOTA) model merging schemes leads to significant performance gains. Notably, our SurgeryV2 scheme reaches almost the same level as individual expert models or the traditional MTL model. The code is available at https://github.com/EnnengYang/SurgeryV2., Comment: This paper is an extended version of our previous work [arXiv:2402.02705] presented at ICML 2024
Published: 2024

14. REEF: Representation Encoding Fingerprints for Large Language Models

Author: Zhang, Jie, Liu, Dongrui, Qian, Chen, Zhang, Linfeng, Liu, Yong, Qiao, Yu, and Shao, Jing
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: Protecting the intellectual property of open-source Large Language Models (LLMs) is very important, because training LLMs costs extensive computational resources and data. Therefore, model owners and third parties need to identify whether a suspect model is a subsequent development of the victim model. To this end, we propose a training-free REEF to identify the relationship between the suspect and victim models from the perspective of LLMs' feature representations. Specifically, REEF computes and compares the centered kernel alignment similarity between the representations of a suspect model and a victim model on the same samples. This training-free REEF does not impair the model's general capabilities and is robust to sequential fine-tuning, pruning, model merging, and permutations. In this way, REEF provides a simple and effective way for third parties and models' owners to protect LLMs' intellectual property together. The code is available at https://github.com/tmylla/REEF.
Published: 2024
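
The abstract above says REEF compares centered kernel alignment (CKA) similarity between two models' representations on the same samples. The following is a minimal sketch of CKA with a linear kernel, which is an assumption here since the abstract does not state the kernel; the toy matrices and function names are hypothetical and this is not the REEF code.

```python
import math


def center_columns(X):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]


def gram(X):
    """Linear-kernel Gram matrix: pairwise dot products between sample rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in X] for r1 in X]


def frob_inner(K, L):
    """Frobenius inner product of two equally sized matrices."""
    return sum(k * l for rk, rl in zip(K, L) for k, l in zip(rk, rl))


def linear_cka(X, Y):
    """Linear CKA between representations X and Y (rows are the same samples)."""
    K = gram(center_columns(X))
    L = gram(center_columns(Y))
    return frob_inner(K, L) / math.sqrt(frob_inner(K, K) * frob_inner(L, L))


# Toy representations of four samples.
X = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
# Y permutes and rescales X's features; linear CKA is invariant to both.
Y = [[2.0 * r[1], 2.0 * r[0]] for r in X]
# Z is an unrelated representation of the same four samples.
Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

same = linear_cka(X, Y)
different = linear_cka(X, Z)
```

The invariance to feature permutation and isotropic scaling shown by `same` is consistent with the abstract's claim of robustness to permutations, although the paper's full robustness results are not reproduced by this toy example.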

15. Personality-Guided Code Generation Using Large Language Models

Author: Guo, Yaoqi, Chen, Zhenpeng, Zhang, Jie M., Liu, Yang, and Ma, Yun
Subjects: Computer Science - Software Engineering, Computer Science - Artificial Intelligence
Abstract: Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development. Inspired by research that links task-personality alignment with improved development outcomes, we conduct an empirical study on personality-guided code generation using large language models (LLMs). Specifically, we investigate how emulating personality traits appropriate to the coding tasks affects LLM performance. We extensively evaluate this approach using seven widely adopted LLMs across four representative datasets. Our results show that personality guidance significantly enhances code generation accuracy, with improved pass rates in 23 out of 28 LLM-dataset combinations. Notably, in 11 cases, the improvement exceeds 5%, and in 5 instances, it surpasses 10%, with the highest gain reaching 12.9%. Additionally, personality guidance can be easily integrated with other prompting strategies to further boost performance.
Published: 2024

16. Using Protected Attributes to Consider Fairness in Multi-Agent Systems

Author: La Malfa, Gabriele, Zhang, Jie M., Luck, Michael, and Black, Elizabeth
Subjects: Computer Science - Multiagent Systems, Computer Science - Artificial Intelligence
Abstract: Fairness in Multi-Agent Systems (MAS) has been extensively studied, particularly in reward distribution among agents in scenarios such as goods allocation, resource division, lotteries, and bargaining systems. Fairness in MAS depends on various factors, including the system's governing rules, the behaviour of the agents, and their characteristics. Yet, fairness in human society often involves evaluating disparities between disadvantaged and privileged groups, guided by principles of Equality, Diversity, and Inclusion (EDI). Taking inspiration from the work on algorithmic fairness, which addresses bias in machine learning-based decision-making, we define protected attributes for MAS as characteristics that should not disadvantage an agent in terms of its expected rewards. We adapt fairness metrics from the algorithmic fairness literature -- namely, demographic parity, counterfactual fairness, and conditional statistical parity -- to the multi-agent setting, where self-interested agents interact within an environment. These metrics allow us to evaluate the fairness of MAS, with the ultimate aim of designing MAS that do not disadvantage agents based on protected attributes.
Published: 2024

17. SiFiSinger: A High-Fidelity End-to-End Singing Voice Synthesizer based on Source-filter Model

Author: Cui, Jianwei, Gu, Yu, Weng, Chao, Zhang, Jie, Chen, Liping, and Dai, Lirong
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Machine Learning, Computer Science - Sound
Abstract: This paper presents an advanced end-to-end singing voice synthesis (SVS) system based on the source-filter mechanism that directly translates lyrical and melodic cues into expressive and high-fidelity human-like singing. Similarly to VISinger 2, the proposed system also utilizes training paradigms evolved from VITS and incorporates elements like the fundamental pitch (F0) predictor and waveform generation decoder. To address the issue that the coupling of mel-spectrogram features with F0 information may introduce errors during F0 prediction, we consider two strategies. Firstly, we leverage mel-cepstrum (mcep) features to decouple the intertwined mel-spectrogram and F0 characteristics. Secondly, inspired by the neural source-filter models, we introduce source excitation signals as the representation of F0 in the SVS system, aiming to capture pitch nuances more accurately. Meanwhile, differentiable mcep and F0 losses are employed as the waveform decoder supervision to fortify the prediction accuracy of speech envelope and pitch in the generated speech. Experiments on the Opencpop dataset demonstrate efficacy of the proposed model in synthesis quality and intonation accuracy., Comment: Accepted by ICASSP 2024, Synthesized audio samples are available at: https://sounddemos.github.io/sifisinger
Published: 2024

18. Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models

Author: Li, Boheng, Wei, Yanhao, Fu, Yankai, Wang, Zhenting, Li, Yiming, Zhang, Jie, Wang, Run, and Zhang, Tianwei
Subjects: Computer Science - Computers and Society, Computer Science - Computer Vision and Pattern Recognition
Abstract: Text-to-image diffusion models are pushing the boundaries of what generative AI can achieve in our lives. Beyond their ability to generate general images, new personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles. Such a lightweight solution, enabling AI practitioners and developers to easily build their own personalized models, also poses a new concern regarding whether the personalized models are trained from unauthorized data. A promising solution is to proactively enable data traceability in generative models, where data owners embed external coatings (e.g., image watermarks or backdoor triggers) onto the datasets before releasing. Later the models trained over such datasets will also learn the coatings and unconsciously reproduce them in the generated mimicries, which can be extracted and used as the data usage evidence. However, we identify the existing coatings cannot be effectively learned in personalization tasks, making the corresponding verification less reliable. In this paper, we introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models. Our approach optimizes the coating in a delicate way to be recognized by the model as a feature relevant to the personalization task, thus significantly improving its learnability. We also utilize a human perceptual-aware constraint, a hypersphere classification technique, and a hypothesis-testing-guided verification method to enhance the stealthiness and detection accuracy of the coating. The effectiveness of SIREN is verified through extensive experiments on a diverse set of benchmark datasets, models, and learning algorithms. SIREN is also effective in various real-world scenarios and evaluated against potential countermeasures. Our code is publicly available., Comment: To appear in the IEEE Symposium on Security & Privacy, May 2025
Published: 2024

19. Effi-Code: Unleashing Code Efficiency in Language Models

Author: Huang, Dong, Zeng, Guangtao, Dai, Jianbo, Luo, Meng, Weng, Han, Qing, Yuhao, Cui, Heming, Guo, Zhijiang, and Zhang, Jie M.
Subjects: Computer Science - Computation and Language, Computer Science - Software Engineering
Abstract: As the use of large language models (LLMs) for code generation becomes more prevalent in software development, it is critical to enhance both the efficiency and correctness of the generated code. Existing methods and models primarily focus on the correctness of LLM-generated code, ignoring efficiency. In this work, we present Effi-Code, an approach to enhancing code generation in LLMs that can improve both efficiency and correctness. We introduce a Self-Optimization process based on Overhead Profiling that leverages open-source LLMs to generate a high-quality dataset of correct and efficient code samples. This dataset is then used to fine-tune various LLMs. Our method involves the iterative refinement of generated code, guided by runtime performance metrics and correctness checks. Extensive experiments demonstrate that models fine-tuned on the Effi-Code show significant improvements in both code correctness and efficiency across task types. For example, the pass@1 of DeepSeek-Coder-6.7B-Instruct generated code increases from 43.3% to 76.8%, and the average execution time for the same correct tasks decreases by 30.5%. Effi-Code offers a scalable and generalizable approach to improving code generation in AI systems, with potential applications in software development, algorithm design, and computational problem-solving. The source code of Effi-Code was released at https://github.com/huangd1999/Effi-Code., Comment: Under Review
Published: 2024

20. A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction

Author: Zhang, Jie, Li, Dongcheng, Wong, W. Eric, and Wang, Shengrong
Subjects: Computer Science - Software Engineering
Abstract: Accurate early prediction of software defects is essential to maintain software quality and reduce maintenance costs. However, the field of software defect prediction (SDP) faces challenges such as class imbalances, high-dimensional feature spaces, and suboptimal prediction accuracy. To mitigate these challenges, this paper introduces a novel SDP framework that integrates hybrid sampling techniques, specifically Borderline SMOTE and Tomek Links, with a suite of multi-objective optimization algorithms, including NSGA-II, MOPSO, and MODE. The proposed model applies feature fusion through multi-objective optimization, enhancing both the generalization capability and stability of the predictions. Furthermore, the integration of parallel processing for these optimization algorithms significantly boosts the computational efficiency of the model. Comprehensive experiments conducted on datasets from NASA and PROMISE repositories demonstrate that the proposed hybrid sampling and multi-objective optimization approach improves data balance, eliminates redundant features, and enhances prediction accuracy. The experimental results also highlight the robustness of the feature fusion approach, confirming its superiority over existing state-of-the-art techniques in terms of predictive performance and applicability across diverse datasets.
Published: 2024

21. Stackelberg vs. Nash in the Lottery Colonel Blotto Game

Author: Liu, Yan, Ni, Bonan, Shen, Weiran, Wang, Zihe, and Zhang, Jie
Subjects: Computer Science - Computer Science and Game Theory
Abstract: Resource competition problems are often modeled using Colonel Blotto games. However, Colonel Blotto games only simulate scenarios where players act simultaneously. In many real-life scenarios, competition is sequential, such as cybersecurity, cloud services, Web investments, and more. To model such sequential competition, we model the Lottery Colonel Blotto game as a Stackelberg game. We solve the Stackelberg equilibrium in the Lottery Colonel Blotto game in which the first mover's strategy is actually a solution to a bi-level optimization problem. We develop a constructive method that allows for a series of game reductions. This method enables us to compute the leader's optimal commitment strategy in a polynomial number of iterations. Furthermore, we identify the conditions under which the Stackelberg equilibrium aligns with the Nash equilibria. Finally, we show that by making the optimal first move, the leader can improve utility by an infinite factor compared to its utility in the Nash equilibria. We find that the player with a smaller budget has a greater incentive to become the leader in the game. Surprisingly, even when the leader adopts the optimal commitment strategy, the follower's utility may improve compared to that in Nash equilibria.
Published: 2024
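As a rough illustration of the contest behind entry 21, the sketch below assumes the common ratio-form lottery rule, in which a player wins each battlefield with probability proportional to the resources allocated there. The function name, the battlefield values, and the 50/50 rule for a battlefield with no resources are assumptions made for illustration; none of them is taken from the paper, and the Stackelberg solution method itself is not shown.

```python
from typing import Sequence


def lottery_blotto_utility(
    x: Sequence[float], y: Sequence[float], values: Sequence[float]
) -> float:
    """Expected value won by the player allocating x against an opponent allocating y.

    Assumes the ratio-form lottery rule: on each battlefield the player wins
    with probability x_i / (x_i + y_i).
    """
    if not (len(x) == len(y) == len(values)):
        raise ValueError("x, y and values must have the same length")
    total = 0.0
    for xi, yi, vi in zip(x, y, values):
        if xi < 0 or yi < 0:
            raise ValueError("allocations must be non-negative")
        if xi + yi == 0:
            share = 0.5  # assumed tie rule when nobody commits resources
        else:
            share = xi / (xi + yi)
        total += vi * share
    return total


# Two battlefields of equal value; the opponent outspends us on the second one.
u = lottery_blotto_utility([1.0, 1.0], [1.0, 3.0], [1.0, 1.0])
```

Under this rule the two players' expected utilities always sum to the total battlefield value, which is a quick sanity check on any allocation pair.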

22. Simplified radar architecture based on information metasurface

Author: Wang, Si Ran, Chen, Zhan Ye, Chen, Shao Nan, Dai, Jun Yan, Zhang, Jun Wei, Qi, Zhen Jie, Wu, Li Jie, Sun, Meng Ke, Zhou, Qun Yan, Li, Hui Dong, Luo, Zhang Jie, Cheng, Qiang, and Cui, Tie Jun
Subjects: Physics - Applied Physics
Abstract: Modern radar typically employs a chain architecture that consists of radio-frequency (RF) and intermediate frequency (IF) units, baseband digital signal processor, and information display. However, this architecture often results in high costs, significant hardware demands, and integration challenges. Here we propose a simplified radar architecture based on space-time-coding (STC) information metasurfaces. With their powerful capabilities to generate multiple harmonic frequencies and customize their phases, the STC metasurfaces play a key role in chirp signal generation, transmission, and echo reception. Remarkably, the receiving STC metasurface can implement dechirp processing directly on the RF level and realize the digital information outputs, which helps lower the hardware requirements at the receiving end while potentially shortening the time needed for conventional digital processing. As a proof of concept, the proposed metasurface radar is tested in a series of experiments for target detection and range/speed measurement, yielding results comparable to those obtained by conventional methods. This study provides valuable inspiration for a new radar system paradigm to combine the RF front ends and signal processors on the information metasurface platform that offers essential functionalities while significantly reducing the system complexity and cost. Comment: 25 pages, 10 figures
Published: 2024

23. Wireless-Friendly Window Position Optimization for RIS-Aided Outdoor-to-Indoor Networks based on Multi-Modal Large Language Model

Author: Hou, Jinbo, Qiu, Kehai, Zhang, Zitian, Yu, Yong, Wang, Kezhi, Capolongo, Stefano, Zhang, Jiliang, Li, Zeyang, and Zhang, Jie
Subjects: Computer Science - Networking and Internet Architecture, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
Abstract: This paper aims to simultaneously optimize indoor wireless and daylight performance by adjusting the positions of windows and the beam directions of window-deployed reconfigurable intelligent surfaces (RISs) for RIS-aided outdoor-to-indoor (O2I) networks utilizing large language models (LLM) as optimizers. Firstly, we illustrate the wireless and daylight system models of RIS-aided O2I networks and formulate a joint optimization problem to enhance both wireless traffic sum rate and daylight illumination performance. Then, we present a multi-modal LLM-based window optimization (LMWO) framework, accompanied by a prompt construction template to optimize the overall performance in a zero-shot fashion, functioning as both an architect and a wireless network planner. Finally, we analyze the optimization performance of the LMWO framework and the impact of the number of windows, room size, number of RIS units, and daylight factor. Numerical results demonstrate that our proposed LMWO framework can achieve outstanding optimization performance in terms of initial performance, convergence speed, final outcomes, and time complexity, compared with classic optimization methods. The building's wireless performance can be significantly enhanced while ensuring indoor daylight performance.
Published: 2024

24. Collaboration! Towards Robust Neural Methods for Routing Problems

Author: Zhou, Jianan, Wu, Yaoxin, Cao, Zhiguang, Song, Wen, Zhang, Jie, and Shen, Zhiqi
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Despite enjoying desirable efficiency and reduced reliance on domain expertise, existing neural methods for vehicle routing problems (VRPs) suffer from severe robustness issues -- their performance significantly deteriorates on clean instances with crafted perturbations. To enhance robustness, we propose an ensemble-based Collaborative Neural Framework (CNF) w.r.t. the defense of neural VRP methods, which is crucial yet underexplored in the literature. Given a neural VRP method, we adversarially train multiple models in a collaborative manner to synergistically promote robustness against attacks, while boosting standard generalization on clean instances. A neural router is designed to adeptly distribute training instances among models, enhancing overall load balancing and collaborative efficacy. Extensive experiments verify the effectiveness and versatility of CNF in defending against various attacks across different neural VRP methods. Notably, our approach also achieves impressive out-of-distribution generalization on benchmark instances. Comment: Accepted at NeurIPS 2024
Published: 2024

25. Multiscale Latent Diffusion Model for Enhanced Feature Extraction from Medical Images

Author: Sadia, Rabeya Tus, Zhang, Jie, and Chen, Jin
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Various imaging modalities are used in patient diagnosis, each offering unique advantages and valuable insights into anatomy and pathology. Computed Tomography (CT) is crucial in diagnostics, providing high-resolution images for precise internal organ visualization. CT's ability to detect subtle tissue variations is vital for diagnosing diseases like lung cancer, enabling early detection and accurate tumor assessment. However, variations in CT scanner models and acquisition protocols introduce significant variability in the extracted radiomic features, even when imaging the same patient. This variability poses considerable challenges for downstream research and clinical analysis, which depend on consistent and reliable feature extraction. Current methods for medical image feature extraction, often based on supervised learning approaches, including GAN-based models, face limitations in generalizing across different imaging environments. In response to these challenges, we propose LTDiff++, a multiscale latent diffusion model designed to enhance feature extraction in medical imaging. The model addresses variability by standardizing non-uniform distributions in the latent space, improving feature consistency. LTDiff++ utilizes a UNet++ encoder-decoder architecture coupled with a conditional Denoising Diffusion Probabilistic Model (DDPM) at the latent bottleneck to achieve robust feature extraction and standardization. Extensive empirical evaluations on both patient and phantom CT datasets demonstrate significant improvements in image standardization, with higher Concordance Correlation Coefficients (CCC) across multiple radiomic feature categories. Through these advancements, LTDiff++ represents a promising solution for overcoming the inherent variability in medical imaging data, offering improved reliability and accuracy in feature extraction processes. Comment: version_2
Published: 2024
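Entry 25 reports agreement between radiomic features as Concordance Correlation Coefficients (CCC). The abstract does not say which definition is used; the sketch below assumes Lin's standard form, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) moments. The function name and the sample values are illustrative only.

```python
from typing import Sequence


def concordance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient using population (1/n) moments."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both inputs are the same constant")
    return 2 * cov_xy / denom


# A constant offset between two measurements lowers CCC even though
# the Pearson correlation of these two series is exactly 1.
ccc = concordance_correlation([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

Unlike Pearson correlation, CCC penalizes systematic shifts and scale differences, which is why it suits comparisons of the same feature measured under different scanner settings.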

26. Three-dimensional simulation of film boiling on a horizontal surface with magnetic field

Author: Gu, Hao-Tao, Sahu, Kirti Chandra, Zhang, Jie, and Ni, Ming-Jiu
Subjects: Physics - Fluid Dynamics
Abstract: This study conducts a numerical investigation into the three-dimensional film boiling of liquid under the influence of external magnetic fields. The numerical method incorporates a sharp phase-change model based on the volume-of-fluid approach to track the liquid-vapor interface. Additionally, a consistent and conservative scheme is employed to calculate the induced current densities and electromagnetic forces. We investigate the magnetohydrodynamic effects on film boiling, particularly examining the pattern transition of the vapor bubble and the evolution of heat transfer characteristics, exposed to either a vertical or horizontal magnetic field. In single-mode scenarios, film boiling under a vertical magnetic field displays an isotropic flow structure, forming a columnar vapor jet at higher magnetic field intensities. In contrast, horizontal magnetic fields result in anisotropic flow, creating a two-dimensional vapor sheet as the magnetic strength increases. In multi-mode scenarios, the patterns observed in single-mode film boiling persist, with the interaction of vapor bubbles introducing additional complexity to the magnetohydrodynamic flow. More importantly, our comprehensive analysis reveals how and why distinct boiling effects are generated by various orientations of magnetic fields, which induce directional electromagnetic forces to suppress flow vortices within the cross-sectional plane. Comment: 37 pages, 28 figures, Journal of Fluid Mechanics
Published: 2024

27. Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data

Author: Zhang, Jie, Das, Debeshee, Kamath, Gautam, and Tramèr, Florian
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: We consider the problem of a training data proof, where a data creator or owner wants to demonstrate to a third party that some machine learning model was trained on their data. Training data proofs play a key role in recent lawsuits against foundation models trained on web-scale data. Many prior works suggest instantiating training data proofs using membership inference attacks. We argue that this approach is fundamentally unsound: to provide convincing evidence, the data creator needs to demonstrate that their attack has a low false positive rate, i.e., that the attack's output is unlikely under the null hypothesis that the model was not trained on the target data. Yet, sampling from this null hypothesis is impossible, as we do not know the exact contents of the training set, nor can we (efficiently) retrain a large foundation model. We conclude by offering two paths forward, by showing that data extraction attacks and membership inference on special canary data can be used to create sound training data proofs.
Published: 2024

28. Leveraging Surgical Activity Grammar for Primary Intention Prediction in Laparoscopy Procedures

Author: Zhang, Jie, Zhou, Song, Wang, Yiwei, Wan, Chidan, Zhao, Huan, Cai, Xiong, and Ding, Han
Subjects: Computer Science - Robotics
Abstract: Surgical procedures are inherently complex and dynamic, with intricate dependencies and various execution paths. Accurate identification of the intentions behind critical actions, referred to as Primary Intentions (PIs), is crucial to understanding and planning the procedure. This paper presents a novel framework that advances PI recognition in instructional videos by combining top-down grammatical structure with bottom-up visual cues. The grammatical structure is based on a rich corpus of surgical procedures, offering a hierarchical perspective on surgical activities. A grammar parser, utilizing the surgical activity grammar, processes visual data obtained from laparoscopic images through surgical action detectors, ensuring a more precise interpretation of the visual information. Experimental results on the benchmark dataset demonstrate that our method outperforms existing surgical activity detectors that rely solely on visual features. Our research provides a promising foundation for developing advanced robotic surgical systems with enhanced planning and automation capabilities. Comment: Submitted to ICRA 2025
Published: 2024

29. Facility Location Problem with Aleatory Agents

Author: Auricchio, Gennaro and Zhang, Jie
Subjects: Computer Science - Computer Science and Game Theory, Computer Science - Multiagent Systems, 91A68 68W25
Abstract: In this paper, we introduce and study the Facility Location Problem with Aleatory Agents (FLPAA), where the facility accommodates n agents, a number larger than the number of agents reporting their preferences, namely n_r. The spare capacity is used by n_u=n-n_r aleatory agents sampled from a probability distribution \mu. The goal of FLPAA is to find a location that minimizes the ex-ante social cost, which is the expected cost of the n_u agents sampled from \mu plus the cost incurred by the agents reporting their position. We investigate the mechanism design aspects of the FLPAA under the assumption that the Mechanism Designer (MD) lacks knowledge of the distribution \mu but can query k quantiles of \mu. We explore the trade-off between acquiring more insights into the probability distribution and designing a better-performing mechanism, which we describe through the strong approximation ratio (SAR). The SAR of a mechanism measures the highest ratio between the cost of the mechanism and the cost of the optimal solution on the worst-case input x and worst-case distribution \mu, offering a metric for efficiency that does not depend on \mu. We divide our study into four different information settings: the zero information case, in which the MD has access to no quantiles; the median information case, in which the MD has access to the median of \mu; the n_u-quantile information case, in which the MD has access to n_u quantiles of its choice, and the k-quantile information case, in which the MD has access to k
Published: 2024

30. Face Forgery Detection with Elaborate Backbone

Author: Guo, Zonghui, Liu, Yingjie, Zhang, Jie, Zheng, Haiyong, and Shan, Shiguang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Face Forgery Detection (FFD), or Deepfake detection, aims to determine whether a digital face is real or fake. Due to different face synthesis algorithms with diverse forgery patterns, FFD models often overfit specific patterns in training datasets, resulting in poor generalization to other unseen forgeries. This severe challenge requires FFD models to possess strong capabilities in representing complex facial features and extracting subtle forgery cues. Although previous FFD models directly employ existing backbones to represent and extract facial forgery cues, the critical role of backbones is often overlooked, particularly as their knowledge and capabilities are insufficient to address FFD challenges, inevitably limiting generalization. Therefore, it is essential to integrate the backbone pre-training configurations and seek practical solutions by revisiting the complete FFD workflow, from backbone pre-training and fine-tuning to inference of discriminant results. Specifically, we analyze the crucial contributions of backbones with different configurations in FFD task and propose leveraging the ViT network with self-supervised learning on real-face datasets to pre-train a backbone, equipping it with superior facial representation capabilities. We then build a competitive backbone fine-tuning framework that strengthens the backbone's ability to extract diverse forgery cues within a competitive learning mechanism. Moreover, we devise a threshold optimization mechanism that utilizes prediction confidence to improve the inference reliability. Comprehensive experiments demonstrate that our FFD model with the elaborate backbone achieves excellent performance in FFD and extra face-related tasks, i.e., presentation attack detection. Code and models are available at https://github.com/zhenglab/FFDBackbone.
Published: 2024

31. Mean Age of Information in Partial Offloading Mobile Edge Computing Networks

Author: Dong, Ying, Xiao, Hang, Hu, Haonan, Zhang, Jiliang, Chen, Qianbin, and Zhang, Jie
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: The age of information (AoI) performance analysis is essential for evaluating the information freshness in large-scale mobile edge computing (MEC) networks. This work presents the first analysis of the mean AoI (MAoI) performance of large-scale partial offloading MEC networks. Firstly, we derive and validate the closed-form expressions of MAoI by using queueing theory and stochastic geometry. Based on these expressions, we analyse the effects of computing offloading ratio (COR) and task generation rate (TGR) on the MAoI performance and compare the MAoI performance under the local computing, remote computing, and partial offloading schemes. The results show that by jointly optimising the COR and TGR, the partial offloading scheme outperforms the local and remote computing schemes in terms of the MAoI, which can be improved by up to 51% and 61%, respectively. This encourages the MEC networks to adopt the partial offloading scheme to improve the MAoI performance.
Published: 2024

32. CamLoPA: A Hidden Wireless Camera Localization Framework via Signal Propagation Path Analysis

Author: Zhang, Xiang, Zhang, Jie, Ma, Zehua, Huang, Jinyang, Li, Meng, Yan, Huan, Zhao, Peng, Zhang, Zijian, Guo, Qing, Zhang, Tianwei, Liu, Bin, and Yu, Nenghai
Subjects: Computer Science - Cryptography and Security, Computer Science - Human-Computer Interaction
Abstract: Hidden wireless cameras pose significant privacy threats, necessitating effective detection and localization methods. However, existing solutions often require spacious activity areas, expensive specialized devices, or pre-collected training data, limiting their practical deployment. To address these limitations, we introduce CamLoPA, a training-free wireless camera detection and localization framework that operates with minimal activity space constraints using low-cost commercial-off-the-shelf (COTS) devices. CamLoPA can achieve detection and localization in just 45 seconds of user activities with a Raspberry Pi board. During this short period, it analyzes the causal relationship between the wireless traffic and user movement to detect the presence of a snooping camera. Upon detection, CamLoPA employs a novel azimuth location model based on wireless signal propagation path analysis. Specifically, this model leverages the time ratio of user paths crossing the First Fresnel Zone (FFZ) to determine the azimuth angle of the camera. Then CamLoPA refines the localization by identifying the camera's quadrant. We evaluate CamLoPA across various devices and environments, demonstrating that it achieves 95.37% snooping camera detection accuracy and an average localization error of 17.23, under the significantly reduced activity space requirements. Our demo is available at https://www.youtube.com/watch?v=GKam04FzeM4.
Published: 2024

33. Cambricon-LLM: A Chiplet-Based Hybrid Architecture for On-Device Inference of 70B LLM

Author: Yu, Zhongkai, Liang, Shengwen, Ma, Tianyun, Cai, Yunke, Nan, Ziyuan, Huang, Di, Song, Xinkai, Hao, Yifan, Zhang, Jie, Zhi, Tian, Zhao, Yongwei, Du, Zidong, Hu, Xing, Guo, Qi, and Chen, Tianshi
Subjects: Computer Science - Hardware Architecture
Abstract: Deploying advanced large language models on edge devices, such as smartphones and robots, is a growing trend that enhances user data privacy and network connectivity resilience while preserving intelligent capabilities. However, such a task exhibits single-batch computing with incredibly low arithmetic intensity, which poses significant challenges of huge memory footprint and bandwidth demands on limited edge resources. To address these issues, we introduce Cambricon-LLM, a chiplet-based hybrid architecture with an NPU and a dedicated NAND flash chip to enable efficient on-device inference of 70B LLMs. Such a hybrid architecture utilizes both the high computing capability of the NPU and the data capacity of the NAND flash chip, with the proposed hardware-tiling strategy that minimizes the data movement overhead between the NPU and the NAND flash chip. Specifically, the NAND flash chip, enhanced by our innovative in-flash computing and on-die ECC techniques, excels at performing precise lightweight on-die processing. Simultaneously, the NPU collaborates with the flash chip for matrix operations and handles special function computations beyond the flash's on-die processing capabilities. Overall, Cambricon-LLM enables the on-device inference of 70B LLMs at a speed of 3.44 token/s, and 7B LLMs at a speed of 36.34 token/s, which is over 22X to 45X faster than existing flash-offloading technologies, showing the potential of deploying powerful LLMs in edge devices. Comment: 15 pages, 16 figures
Published: 2024

34. Hyper-parameter Optimization for Wireless Network Traffic Prediction Models with A Novel Meta-Learning Framework

Author: Wang, Liangzhi, Zhang, Jie, Gao, Yuan, Zhang, Jiliang, Wei, Guiyi, Zhou, Haibo, Zhuge, Bin, and Zhang, Zitian
Subjects: Computer Science - Networking and Internet Architecture
Abstract: In this paper, we propose a novel meta-learning based hyper-parameter optimization framework for wireless network traffic prediction models. An attention-based deep neural network (ADNN) is adopted as the prediction model, i.e., base-learner, for each wireless network traffic prediction task, namely base-task, and a meta-learner is employed to automatically generate the optimal hyper-parameters for a given base-learner according to the corresponding base-task's intrinsic characteristics or properties, i.e., meta-features. Based on our observation from real-world traffic records that base-tasks possessing similar meta-features tend to favour similar hyper-parameters for their base-learners, the meta-learner exploits a K-nearest neighbor (KNN) learning method to obtain a set of candidate hyper-parameter selection strategies for a new base-learner, which are then utilized by an advanced genetic algorithm with intelligent chromosome screening to finally acquire the best hyper-parameter selection strategy. Extensive experiments demonstrate that base-learners in the proposed framework have high potential predictive ability for the wireless network traffic prediction task, and the meta-learner can enormously elevate the base-learners' performance by providing them with the optimal hyper-parameters. Comment: This work has been submitted to the IEEE for possible publication
Published: 2024
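The KNN step described in entry 34 can be pictured as follows: given the meta-features of a new base-task, look up the most similar past base-tasks and reuse their hyper-parameters as candidates. This is only a minimal sketch under stated assumptions; Euclidean distance, the function name, the two-dimensional meta-features, and the `lr` values are hypothetical, and the genetic-algorithm refinement mentioned in the abstract is not shown.

```python
import math
from typing import Dict, List, Sequence, Tuple

# One past base-task: (meta-feature vector, hyper-parameters that worked for it)
Record = Tuple[Sequence[float], Dict[str, float]]


def nearest_candidates(
    new_features: Sequence[float], history: List[Record], k: int = 3
) -> List[Dict[str, float]]:
    """Return the hyper-parameters of the k past tasks closest in meta-feature space."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Euclidean distance is an assumption; the abstract does not name a metric.
    ranked = sorted(history, key=lambda rec: math.dist(new_features, rec[0]))
    return [params for _, params in ranked[:k]]


# Hypothetical history of three base-tasks with two meta-features each.
history: List[Record] = [
    ((0.0, 0.0), {"lr": 0.1}),
    ((1.0, 1.0), {"lr": 0.01}),
    ((5.0, 5.0), {"lr": 0.001}),
]
candidates = nearest_candidates((0.9, 0.9), history, k=2)
```

In the framework the abstract describes, a candidate list of this kind would seed the subsequent search rather than serve as the final choice.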

35. Interpretable Nonroutine Network Traffic Prediction with a Case Study

Author: Wang, Liangzhi, Zhu, Haoyuan, Zhang, Jiliang, Zhang, Zitian, and Zhang, Jie
Subjects: Computer Science - Networking and Internet Architecture
Abstract: This paper pioneers a nonroutine network traffic prediction (NNTP) method to prospectively provide a theoretical basis for avoiding large-scale network disruption by accurately predicting bursty traffic. Certain events that impact user behavior subsequently trigger nonroutine traffic, which significantly constrains the performance of network traffic prediction (NTP) models. By analyzing nonroutine traffic and the corresponding events, the NNTP method is pioneered to construct an interpretable NTP model. Based on real-world traffic data, the network traffic generated during soccer games serves as a case study to validate the performance of the NNTP method. The numerical results indicate that our prediction closely fits the traffic pattern. In comparison to existing research, the NNTP method is at the forefront of finding a balance among interpretability, accuracy, and computational complexity. Comment: This work has been submitted to the IEEE for possible publication
Published: 2024

36. LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement

Author: Yan, Haoyin, Zhang, Jie, Fan, Cunhang, Zhou, Yeping, and Liu, Peiqi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound, Electrical Engineering and Systems Science - Signal Processing
Abstract: Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility. Although learning-based methods can perform much better than traditional counterparts, the large computational complexity and model size heavily limit the deployment on latency-sensitive and low-resource edge devices. In this work, we propose a lightweight SE network (LiSenNet) for real-time applications. We design sub-band downsampling and upsampling blocks and a dual-path recurrent module to capture band-aware features and time-frequency patterns, respectively. A noise detector is developed to detect noisy regions in order to perform SE adaptively and save computational costs. Compared to recent higher-resource-dependent baseline models, the proposed LiSenNet can achieve a competitive performance with only 37k parameters (half of the state-of-the-art model) and 56M multiply-accumulate (MAC) operations per second. Comment: 5 pages, submitted to 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)
Published: 2024

37. Geometry-Constrained EEG Channel Selection for Brain-Assisted Speech Enhancement

Author: Zuo, Keying, Xu, Qingtian, Zhang, Jie, and Ling, Zhenhua
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Brain-assisted speech enhancement (BASE) aims to extract the target speaker in complex multi-talker scenarios using electroencephalogram (EEG) signals as an assistive modality, as the auditory attention of the listener can be decoded from electroneurographic signals of the brain. This facilitates a potential integration of EEG electrodes with listening devices to improve the speech intelligibility of hearing-impaired listeners, which was shown by the recently proposed BASEN model. As in general the multichannel EEG signals are highly correlated and some are even irrelevant to listening, blindly incorporating all EEG channels would lead to a high economic and computational cost. In this work, we therefore propose a geometry-constrained EEG channel selection approach for BASE. We design a new weighted multi-dilation temporal convolutional network (WD-TCN) as the backbone to replace the Conv-TasNet in BASEN. Given a raw channel set that is defined by the electrode geometry for feasible integration, we then propose a geometry-constrained convolutional regularization selection (GC-ConvRS) module for WD-TCN to find an informative EEG subset. Experimental results on a public dataset show the superiority of the proposed WD-TCN over BASEN. The GC-ConvRS can further refine the useful EEG subset subject to the geometry constraint, resulting in a better trade-off between performance and integration cost.
Published: 2024

38. A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation

Author: Wang, Jingyuan, Zhang, Jie, Chen, Shihao, and Sun, Miao
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Binaural speech enhancement (BSE) aims to jointly improve the speech quality and intelligibility of noisy signals received by hearing devices and preserve the spatial cues of the target for natural listening. Existing methods often suffer from the compromise between noise reduction (NR) capacity and spatial cues preservation (SCP) accuracy and a high computational demand in complex acoustic scenes. In this work, we present a learning-based lightweight binaural complex convolutional network (LBCCN), which excels in NR by filtering low-frequency bands and keeping the rest. Additionally, our approach explicitly incorporates the estimation of interchannel relative acoustic transfer function to ensure the spatial cues fidelity and speech clarity. Results show that the proposed LBCCN can achieve a comparable NR performance to state-of-the-art methods under various noise conditions, but with a much lower computational cost and a better SCP. The reproducible code and audio examples are available at https://github.com/jywanng/LBCCN.
Published: 2024

39. Rethinking the Influence of Source Code on Test Case Generation

Author: Huang, Dong, Zhang, Jie M., Du, Mingzhe, Harman, Mark, and Cui, Heming
Subjects: Computer Science - Software Engineering, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have been widely applied to assist test generation with the source code under test provided as the context. This paper aims to answer the question: If the source code under test is incorrect, will LLMs be misguided when generating tests? The effectiveness of test cases is measured by their accuracy, coverage, and bug detection effectiveness. Our evaluation results with five open- and six closed-source LLMs on four datasets demonstrate that incorrect code can significantly mislead LLMs in generating correct, high-coverage, and bug-revealing tests. For instance, in the HumanEval dataset, LLMs achieve 80.45% test accuracy when provided with task descriptions and correct code, but only 57.12% when given task descriptions and incorrect code. For the APPS dataset, prompts with correct code yield tests that detect 39.85% of the bugs, while prompts with incorrect code detect only 19.61%. These findings have important implications for the deployment of LLM-based testing: using it on mature code may help protect against future regression, but on early-stage immature code, it may simply bake in errors. Our findings also underscore the need for further research to improve LLMs' resilience against incorrect code in generating reliable and bug-revealing tests. Comment: 23 pages
Published: 2024

40. InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference

Author: Pan, Xiurui, Li, Endian, Li, Qiao, Liang, Shengwen, Shan, Yizhou, Zhou, Ke, Luo, Yingwei, Wang, Xiaolin, and Zhang, Jie
Subjects: Computer Science - Hardware Architecture, Computer Science - Computation and Language
Abstract: The widespread adoption of Large Language Models (LLMs) marks a significant milestone in generative AI. Nevertheless, the increasing context length and batch size in offline LLM inference escalate the memory requirement of the key-value (KV) cache, which imposes a huge burden on the GPU VRAM, especially for resource-constrained scenarios (e.g., edge computing and personal devices). Several cost-effective solutions leverage host memory or SSDs to reduce storage costs for offline inference scenarios and improve the throughput. Nevertheless, they suffer from significant performance penalties imposed by intensive KV cache accesses due to limited PCIe bandwidth. To address these issues, we propose InstInfer, a novel LLM inference system that offloads the most performance-critical computation (i.e., attention in decoding phase) and data (i.e., KV cache) parts to Computational Storage Drives (CSDs), which minimizes the enormous KV transfer overheads. InstInfer designs a dedicated flash-aware in-storage attention engine with KV cache management mechanisms to exploit the high internal bandwidths of CSDs instead of being limited by the PCIe bandwidth. The optimized P2P transmission between GPU and CSDs further reduces data migration overheads. Experimental results demonstrate that for a 13B model using an NVIDIA A6000 GPU, InstInfer improves throughput for long-sequence inference by up to 11.1$\times$, compared to existing SSD-based solutions such as FlexGen.
Published: 2024

41. Multiple types of spin textures and robust valley physics in MP$_2$X$_6$

Author: Liang, Li, Zhou, Zhichao, Zhang, Jie, and Li, Xiao
Subjects: Condensed Matter - Materials Science, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Both spin textures and multiple valleys in the momentum space have attracted great attention due to their versatile applications in spintronics and valleytronics. It is highly desirable to realize multiple types of spin textures in a single material and further couple the spin textures to the valley degree of freedom. Here, we study electronic properties of SnP$_{2}$Se$_{6}$ monolayer by first-principles calculations. The monolayer exhibits rare Weyl-type and Ising-type spin textures at different valleys, which can be conveniently expressed by electron and hole dopings, respectively. Besides valley-contrasting spin textures, Berry-curvature-driven anomalous Hall currents and optical selectivity are found to be valley dependent as well. These valley-related properties also have generalizations to SnP$_{2}$Se$_{6}$ few-layers and other MP$_{2}$X$_{6}$. Our findings open a new avenue for exploring the appealing interplay between spin textures and multiple valleys, and designing advanced device paradigms based on spin and valley degrees of freedom.
Comment: 5 pages, 4 figures
Published: 2024

42. Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management

Author: Wang, Yujie, Zhu, Shenhan, Fu, Fangcheng, Miao, Xupeng, Zhang, Jie, Zhu, Juan, Hong, Fan, Li, Yong, and Cui, Bin
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Machine Learning
Abstract: Recent foundation models are capable of handling multiple machine learning (ML) tasks and multiple data modalities with the unified base model structure and several specialized model components. However, the development of such multi-task (MT) multi-modal (MM) models poses significant model management challenges to existing training systems. Due to the sophisticated model architecture and the heterogeneous workloads of different ML tasks and data modalities, training these models usually requires massive GPU resources and suffers from sub-optimal system efficiency. In this paper, we investigate how to achieve high-performance training of large-scale MT MM models through data heterogeneity-aware model management optimization. The key idea is to decompose the model execution into stages and address the joint optimization problem sequentially, including both heterogeneity-aware workload parallelization and dependency-driven execution scheduling. Based on this, we build a prototype system and evaluate it on various large MT MM models. Experiments demonstrate the superior performance and efficiency of our system, with speedup ratio up to 71% compared to state-of-the-art training systems.
Published: 2024

43. Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image Fusion

Author: Cao, Ke, He, Xuanhua, Hu, Tao, Xie, Chengjun, Zhang, Jie, Zhou, Man, and Hong, Danfeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-modal image fusion integrates complementary information from different modalities to produce enhanced and informative images. Although State-Space Models, such as Mamba, are proficient in long-range modeling with linear complexity, most Mamba-based approaches use fixed scanning strategies, which can introduce biased prior information. To mitigate this issue, we propose a novel Bayesian-inspired scanning strategy called Random Shuffle, supplemented by a theoretically feasible inverse shuffle to maintain information coordination invariance, aiming to eliminate biases associated with fixed sequence scanning. Based on this transformation pair, we customized the Shuffle Mamba Framework, penetrating modality-aware information representation and cross-modality information interaction across spatial and channel axes to ensure robust interaction and an unbiased global receptive field for multi-modal image fusion. Furthermore, we develop a testing methodology based on Monte-Carlo averaging to ensure the model's output aligns more closely with expected results. Extensive experiments across multiple multi-modal image fusion tasks demonstrate the effectiveness of our proposed method, yielding excellent fusion quality over state-of-the-art alternatives. Code will be available upon acceptance.
Published: 2024

44. Political Elite Selection in Contemporary Chinese Higher Education

Author: Chai, Ling, Wei, Jianwen, Han, Yang, Zhang, Jie, and Hennessy, Dwight
Published: 2020
Full Text: View/download PDF

45. Do Only-Children Communicate Better Than Non-Only Children? A Study of Medical Students in China

Author: Wang, Wei, Zhang, Jie, Hennessy, Dwight A, and Yin, Wenqiang
Published: 2020

46. Synthesis of barium alkylbenzene sulfonate and its behavior as a flow improver for crude oil

Author: Zhou, Zhichao, Zhang, Wangyuan, Zhang, Futian, Zhang, Xuanming, Dong, Sanbao, Zhang, Jie, and Gang, Chen
Subjects: Crude oil, Flow improver, Barium dodecylbenzenesulfonate, Viscosity, Pour point, Biochemistry, QD415-436, Physical and theoretical chemistry, QD450-801, Mathematics, QA1-939
Abstract: The poor flow characteristics of heavy oil have brought many challenges to its exploitation. Finding a cost-effective crude oil flow improver to reduce the viscosity of heavy oil is currently the most important challenge. Small-molecule crude oil flow improvers meet the above requirements and have become an ideal target for oilfield heavy oil production. In this article, three kinds of barium salts of alkylbenzene sulfonates with different lengths of alkyl chains were synthesized from alkylbenzene sulfonic acid and barium hydroxide. Among them, barium dodecylbenzenesulfonate (BaDBS) gives the best viscosity reduction: at a dosage of 900 mg/L, the viscosity reduction rate is 89.0% and the pour point is reduced by 5 °C. Optical microscopy revealed the eutectic effect of crude oil flow improver and saturated hydrocarbon in heavy oil. FTIR and DSC were used to investigate the mechanism by which the small-molecule crude oil flow improver reduces the viscosity of heavy oil.
Published: 2021
Full Text: View/download PDF

47. Catalytic oxidation of polymer used in oilfield by supported Co(II) complex within a high pH range

Author: Ma, Liwa, Zhao, Furong, Zhang, Jianqing, Ma, Guoyan, Zhao, Yifei, Zhang, Jie, and Chen, Gang
Subjects: Bentonite supported complex, Catalytic oxidation, Oilfield wastewater, Hydroxypropyl guar gum, Fenton like process, Biochemistry, QD415-436, Physical and theoretical chemistry, QD450-801, Mathematics, QA1-939
Abstract: In this study, a clean oxidation process for the treatment of wastewater containing hydroxypropyl guar gum (HPGG) and other polymers under a high pH range was designed. For that, 5-sulfosalicylic acid (L)-Co(II) complex supported on bentonite (B) (B@Co(II)L) was prepared for treatment of wastewater by hydrogen peroxide ($\mathrm{H}_{2}\mathrm{O}_{2}$). The morphology and pore structure of B@Co(II)L were first characterized by scanning electron microscopy (SEM), powder X-ray diffraction (XRD), Fourier infrared spectrometer (FT-IR), and $\mathrm{N}_{2}$ adsorption–desorption isotherms, after which the catalytic performance was investigated for the treatment of polymer wastewater. Results show that B@Co(II)L exhibited high catalytic performance over a wide pH range of 7.0 to 13.0. The viscosity of the HPGG can be decreased effectively from 22 to 2.5 mm$^{2}$/s under the optimal conditions of 45 °C, pH 10.0, 10% $\mathrm{H}_{2}\mathrm{O}_{2}$ (mass ratio to HPGG), and 10% B@Co(II)L (mass ratio to $\mathrm{H}_{2}\mathrm{O}_{2}$), and the removal rate for chemical oxygen demand (COD) of HPGG, CMC, and PAM reached 95.9%, 94.8%, and 93.7%, respectively, within 240 min. Most of all, by applying the catalyst in the oilfield, it was found that the catalyst has high performance and the removal rate for COD of oilfield wastewater, fracturing fluids, and drilling fluid reached 92.1%, 94.2%, and 90.7%, respectively.
Published: 2021
Full Text: View/download PDF

48. Synergistic effect of octadecyl ammonium oxide and oleate amide propyl betaine and development of a foam drainage reagent for natural gas production

Author: Gao, Minlan, Jia, Yijing, Lv, Shiyi, Dong, Sanbao, Wang, Manxue, Zhu, Shidong, Zhang, Jie, and Chen, Gang
Subjects: Surfactant, Oleate amide propyl betaine, Octadecyl ammonium oxide, Foaming, Synergistic effect, Biochemistry, QD415-436, Physical and theoretical chemistry, QD450-801, Mathematics, QA1-939
Abstract: Betaine surfactants are used widely in oil field chemistry as well as other industrial applications, but their foaming ability is very poor, so they cannot be used for foaming. In this work, the effect of octadecyl ammonium oxide on the foam properties of oleate amide propyl betaine, a new compound foaming reagent, is studied based on foam performance. Then, a foam drainage reagent of 0.5 wt% oleate amide propyl betaine and 0.1 wt% octadecyl ammonium oxide is developed for natural gas production. Its salt resistance, methanol resistance, high temperature resistance, anti-condensate oil performance, and emulsification ability are systematically evaluated. Furthermore, the factors affecting foam performance are analyzed. The results show that the compound foaming reagent has good anti-salt, anti-methanol, and anti-condensate oil properties for meeting application requirements. The microstructures of foams derived from different reagents reveal the stability mechanism. All results reflect the fact that compounding can expand their application range in different environments to various extents, which benefits the design and use of compound surfactants.
Published: 2021
Full Text: View/download PDF

49. Modification of sodium dodecyl sulfate and evaluation of foaming activity

Author: Gao, Minlan, Chen, Gang, Bai, Yun, Zhang, Rongjun, Zhang, Jie, Zhu, Shidong, Zhang, Zhifang, and Dong, Sanbao
Subjects: Surfactant, Synthesis, Modified sodium dodecyl sulfate, Foaming, Mannich reaction, Biochemistry, QD415-436, Physical and theoretical chemistry, QD450-801, Mathematics, QA1-939
Abstract: In this study, to optimize the foaming activity of sodium dodecyl sulfate (SDS), modified sodium dodecyl sulfate surfactants (MSDS-1 and MSDS-2) are prepared by using methanol and diethanol amine as modifiers by the Mannich reaction. The foaming properties and foam stability of the products are evaluated by the Ross–Miles method and the Waring blender method. The microstructures of the foams produced by three surfactants are compared. The effects of temperature, inorganic salt, methanol, and condensate oil on the foaming activity of SDS, MSDS-1, and MSDS-2 are studied. The results obtained show that the best foaming concentration of all three products is 0.5%. Compared with SDS, the temperature resistance, methanol resistance, salt resistance and anti-condensate oil performance of MSDS-1 and MSDS-2 are improved. Among them, the temperature resistance, salt resistance, and methanol resistance of the MSDS-1 solution are the best. The MSDS-2 solution has the best anti-condensate performance. Besides, the foam size becomes smaller, the foam wall thickens, and the foam stability is improved after modification. The overall performance of SDS as a foaming agent can be improved by the Mannich modification.
Published: 2020
Full Text: View/download PDF

50. Optimal Dispatch Strategy for a Multi-microgrid Cooperative Alliance Using a Two-Stage Pricing Mechanism

Author: Nie, Yonghui, Li, Zhi, Zhang, Jie, Gao, Lei, Li, Yang, and Zhou, Hengyu
Subjects: Electrical Engineering and Systems Science - Systems and Control
Abstract: To coordinate resources among multi-level stakeholders and enhance the integration of electric vehicles (EVs) into multi-microgrids, this study proposes an optimal dispatch strategy within a multi-microgrid cooperative alliance using a nuanced two-stage pricing mechanism. Initially, the strategy assesses electric energy interactions between microgrids and distribution networks to establish a foundation for collaborative scheduling. The two-stage pricing mechanism initiates with a leader-follower game, wherein the microgrid operator acts as the leader and users as followers. Subsequently, it adjusts EV tariffs based on the game's equilibrium, taking into account factors such as battery degradation and travel needs to optimize EVs' electricity consumption. Furthermore, a bi-level optimization model refines power interactions and pricing strategies across the network, significantly enhancing demand response capabilities and economic outcomes. Simulation results demonstrate that this strategy not only increases renewable energy consumption but also reduces energy costs, thereby improving the overall efficiency and sustainability of the system.
Comment: Accepted by IEEE Transactions on Sustainable Energy, Paper no. TSTE-00122-2024
Published: 2024
Full Text: View/download PDF
