55,477 results for "Liu, Yi"
Search Results
2. SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes
- Author
-
Fan, Cheng-De, Chang, Chen-Wei, Liu, Yi-Ruei, Lee, Jie-Ying, Huang, Jiun-Long, Tseng, Yu-Chee, and Liu, Yu-Lun
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
We present SpectroMotion, a novel approach that combines 3D Gaussian Splatting (3DGS) with physically-based rendering (PBR) and deformation fields to reconstruct dynamic specular scenes. Previous methods extending 3DGS to model dynamic scenes have struggled to accurately represent specular surfaces. Our method addresses this limitation by introducing a residual correction technique for accurate surface normal computation during deformation, complemented by a deformable environment map that adapts to time-varying lighting conditions. We implement a coarse-to-fine training strategy that significantly enhances both scene geometry and specular color prediction. We demonstrate that our model outperforms prior methods for view synthesis of scenes containing dynamic specular objects and that it is the only existing 3DGS method capable of synthesizing photorealistic real-world dynamic specular scenes, outperforming state-of-the-art methods in rendering complex, dynamic, and specular scenes., Comment: Project page: https://cdfan0627.github.io/spectromotion/
- Published
- 2024
3. Inter-Cation Charge Transfer Mediated Antiferromagnetism in Co$_{1+x}$Ir$_{2-x}$S$_4$
- Author
-
Ji, Liang-Wen, Wu, Si-Qi, Li, Bai-Zhuo, Yang, Wu-Zhang, Song, Shi-Jie, Liu, Yi, Li, Jing, Ren, Zhi, and Cao, Guang-Han
- Subjects
Condensed Matter - Materials Science ,Condensed Matter - Strongly Correlated Electrons - Abstract
The antiferromagnetism in transition metal compounds is mostly mediated by the bridging anions through a so-called superexchange mechanism. However, in materials like normal spinels $AB_2X_4$ with local moments only at the $A$ site, such an anion-mediated superexchange needs to be modified. Here we report a new spinel compound Co$_{1+x}$Ir$_{2-x}$S$_4$ ($x$ = 0.3). The physical property measurements strongly suggest an antiferromagnetic-like transition at 292 K in the Co($A$) diamond sublattice. The first-principle calculations reveal that the nearest-neighbor Co($A$) spins align antiferromagnetically with an ordered magnetic moment of 1.67 $\mu_\mathrm{B}$, smaller than the expected $S = 3/2$ for Co$^{2+}$. In the antiferromagnetic state, there exists an inter-cation charge-transfer gap between the non-bonding Ir-$t_\mathrm{2g}$ orbitals at the valence band maximum and the Co-S antibonding molecular orbitals at the conduction band minimum. The small charge transfer energy significantly enhances the virtual hopping between these two states, facilitating a robust long-range superexchange interaction between two neighboring CoS$_4$ complexes, which accounts for the high N\'{e}el temperature in Co$_{1+x}$Ir$_{2-x}$S$_4$. This inter-cation charge transfer mediated magnetic interaction expands the traditional superexchange theory, which could be applicable in complex magnetic materials with multiple cations., Comment: 10 pages, 7 figures
- Published
- 2024
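As context for the charge-transfer argument in the Co$_{1+x}$Ir$_{2-x}$S$_4$ entry above, the standard perturbative picture of superexchange (a textbook estimate, not the paper's first-principles calculation) makes explicit why a small charge-transfer energy strengthens the coupling: virtual hopping through the intermediate states generates an effective cation-cation hopping, and the exchange scales with its square.

```latex
% Illustrative textbook scaling only; t (cation-intermediate hopping),
% \Delta_{CT} (charge-transfer energy) and U (on-site repulsion) are generic symbols.
\[
  t_{\mathrm{eff}} \sim \frac{t^{2}}{\Delta_{\mathrm{CT}}},
  \qquad
  J \sim \frac{4\,t_{\mathrm{eff}}^{2}}{U} = \frac{4\,t^{4}}{\Delta_{\mathrm{CT}}^{2}\,U},
\]
% so a smaller \Delta_{CT} enhances the antiferromagnetic J, consistent with the
% high Neel temperature argued for in the abstract.
```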
4. Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
- Author
-
Liu, Yi, Li, Chengxin, Xu, Shoukun, and Han, Jungong
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Multi-modal fusion has played a vital role in multi-modal scene understanding. Most existing methods focus on cross-modal fusion involving two modalities, often overlooking more complex multi-modal fusion, which is essential for real-world applications like autonomous driving, where visible, depth, event, LiDAR, etc., are used. Besides, few attempts for multi-modal fusion, \emph{e.g.}, simple concatenation, cross-modal attention, and token selection, cannot well dig into the intrinsic shared and specific details of multiple modalities. To tackle the challenge, in this paper, we propose a Part-Whole Relational Fusion (PWRF) framework. For the first time, this framework treats multi-modal fusion as part-whole relational fusion. It routes multiple individual part-level modalities to a fused whole-level modality using the part-whole relational routing ability of Capsule Networks (CapsNets). Through this part-whole routing, our PWRF generates modal-shared and modal-specific semantics from the whole-level modal capsules and the routing coefficients, respectively. On top of that, modal-shared and modal-specific details can be employed to solve the issue of multi-modal scene understanding, including synthetic multi-modal segmentation and visible-depth-thermal salient object detection in this paper. Experiments on several datasets demonstrate the superiority of the proposed PWRF framework for multi-modal scene understanding. The source code has been released on https://github.com/liuyi1989/PWRF.
- Published
- 2024
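To make the part-whole routing idea in the PWRF entry above concrete, here is a minimal sketch of classic routing-by-agreement (Sabour et al.-style dynamic routing) that fuses part-level modality capsules into whole-level capsules and exposes the routing coefficients. The shapes, squash nonlinearity, and toy data are generic assumptions, not the PWRF implementation.

```python
# Minimal routing-by-agreement sketch: part-level capsules (one per modality)
# are routed to whole-level capsules; the routing coefficients are returned too.
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Standard capsule squashing: keeps direction, bounds length in [0, 1)."""
    n2 = np.sum(v ** 2, axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * v / np.sqrt(n2 + eps)

def route_parts_to_wholes(u_hat, n_iter=3):
    """u_hat: predictions from part capsules, shape (n_parts, n_wholes, dim).
    Returns whole-level capsules (n_wholes, dim) and coefficients (n_parts, n_wholes)."""
    n_parts, n_wholes, _ = u_hat.shape
    b = np.zeros((n_parts, n_wholes))                            # routing logits
    for _ in range(n_iter):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)     # softmax over wholes
        s = (c[..., None] * u_hat).sum(axis=0)                   # weighted sum of part votes
        v = squash(s)                                            # whole-level capsules
        b = b + np.einsum("pwd,wd->pw", u_hat, v)                # agreement update
    return v, c

# Toy usage: 3 "modalities" (part capsules), 2 whole capsules, 8-dim poses.
u_hat = np.random.randn(3, 2, 8)
wholes, coeffs = route_parts_to_wholes(u_hat)
print(wholes.shape, coeffs.shape)   # (2, 8) (3, 2)
```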
5. Shining Light on the Dark Sector: Search for Axion-like Particles and Other New Physics in Photonic Final States with FASER
- Author
-
FASER collaboration, Abraham, Roshan Mammen, Ai, Xiaocong, Anders, John, Antel, Claire, Ariga, Akitaka, Ariga, Tomoko, Atkinson, Jeremy, Bernlochner, Florian U., Bianchi, Emma, Boeckh, Tobias, Boyd, Jamie, Brenner, Lydia, Burger, Angela, Cadoux, Franck, Cardella, Roberto, Casper, David W., Cavanagh, Charlotte, Chen, Xin, Cho, Eunhyung, Chouhan, Dhruv, Coccaro, Andrea, Débieux, Stephane, D'Onofrio, Monica, Desai, Ansh, Dmitrievsky, Sergey, Dobre, Radu, Eley, Sinead, Favre, Yannick, Fellers, Deion, Feng, Jonathan L., Fenoglio, Carlo Alberto, Ferrere, Didier, Fieg, Max, Filali, Wissal, Firu, Elena, Garabaglu, Ali, Gibson, Stephen, Gonzalez-Sevilla, Sergio, Gornushkin, Yuri, Gwilliam, Carl, Hayakawa, Daiki, Holzbock, Michael, Hsu, Shih-Chieh, Hu, Zhen, Iacobucci, Giuseppe, Inada, Tomohiro, Iodice, Luca, Jakobsen, Sune, Joos, Hans, Kajomovitz, Enrique, Kawahara, Hiroaki, Keyken, Alex, Kling, Felix, Köck, Daniela, Kontaxakis, Pantelis, Kose, Umut, Kotitsa, Rafaella, Kuehn, Susanne, Kugathasan, Thanushan, Levinson, Lorne, Li, Ke, Liu, Jinfeng, Liu, Yi, Lutz, Margaret S., MacDonald, Jack, Magliocca, Chiara, Mäkelä, Toni, McCoy, Lawson, McFayden, Josh, Medina, Andrea Pizarro, Milanesio, Matteo, Moretti, Théo, Nakamura, Mitsuhiro, Nakano, Toshiyuki, Nevay, Laurie, Ohashi, Ken, Otono, Hidetoshi, Paolozzi, Lorenzo, Petersen, Brian, Preda, Titi, Prim, Markus, Queitsch-Maitland, Michaela, Rokujo, Hiroki, Rubbia, André, Sabater-Iglesias, Jorge, Sato, Osamu, Scampoli, Paola, Schmieden, Kristof, Schott, Matthias, Sfyrla, Anna, Sgalaberna, Davide, Shamim, Mansoora, Shively, Savannah, Takubo, Yosuke, Tarannum, Noshin, Theiner, Ondrej, Torrence, Eric, Martinez, Oscar Ivan Valdes, Vasina, Svetlana, Vormwald, Benedikt, Wang, Di, Wang, Yuxiao, Welch, Eli, Xu, Yue, Zahorec, Samuel, Zambito, Stefano, and Zhang, Shunliang
- Subjects
High Energy Physics - Experiment - Abstract
The first FASER search for a light, long-lived particle decaying into a pair of photons is reported. The search uses LHC proton-proton collision data at $\sqrt{s}=13.6~\text{TeV}$ collected in 2022 and 2023, corresponding to an integrated luminosity of $57.7\text{fb}^{-1}$. A model with axion-like particles (ALPs) dominantly coupled to weak gauge bosons is the primary target. Signal events are characterised by high-energy deposits in the electromagnetic calorimeter and no signal in the veto scintillators. One event is observed, compared to a background expectation of $0.44 \pm 0.39$ events, which is entirely dominated by neutrino interactions. World-leading constraints on ALPs are obtained for masses up to $300~\text{MeV}$ and couplings to the Standard Model W gauge boson, $g_{aWW}$, around $10^{-4}$ GeV$^{-1}$, testing a previously unexplored region of parameter space. Other new particle models that lead to the same experimental signature, including ALPs coupled to gluons or photons, U(1)$_B$ gauge bosons, up-philic scalars, and a Type-I two-Higgs doublet model, are also considered for interpretation, and new constraints on previously viable parameter space are presented in this paper., Comment: 37 pages, 22 figures
- Published
- 2024
6. PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs
- Author
-
Chen, Mengzhao, Liu, Yi, Wang, Jiahao, Bin, Yi, Shao, Wenqi, and Luo, Ping
- Subjects
Computer Science - Machine Learning ,Computer Science - Computation and Language - Abstract
Quantization is essential for deploying Large Language Models (LLMs) by enhancing memory efficiency and inference speed. Existing methods for activation quantization mainly address channel-wise outliers, often neglecting token-wise outliers, leading to reliance on costly per-token dynamic quantization. To address this, we introduce PrefixQuant, a novel technique that isolates outlier tokens offline without re-training. Specifically, PrefixQuant identifies high-frequency outlier tokens and prefixes them in the KV cache, preventing the generation of outlier tokens during inference and simplifying quantization. To our knowledge, PrefixQuant is the first to enable efficient per-tensor static quantization to outperform expensive per-token dynamic quantization. For instance, in W4A4KV4 (4-bit weight, 4-bit activation, and 4-bit KV cache) Llama-3-8B, PrefixQuant with per-tensor static quantization achieves a 7.43 WikiText2 perplexity and 71.08% average accuracy on 5 common-sense reasoning tasks, outperforming previous per-token dynamic quantization methods such as QuaRot by 0.98 perplexity and +5.98 accuracy points. Additionally, the inference speed of W4A4 quantized models using PrefixQuant is 1.60x to 2.81x faster than FP16 models and exceeds QuaRot models by 1.2x to 1.3x. Our code is available at \url{https://github.com/ChenMnZ/PrefixQuant}., Comment: A PTQ method to significantly boost the performance of static activation quantization
- Published
- 2024
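The PrefixQuant entry above contrasts per-token dynamic activation quantization with per-tensor static quantization. The sketch below illustrates just those two granularities for symmetric INT4 "fake quantization"; it is a generic illustration, not the PrefixQuant code.

```python
# Per-token dynamic vs. per-tensor static activation quantization (symmetric INT4).
import torch

def quantize(x, scale, bits=4):
    qmax = 2 ** (bits - 1) - 1                     # signed 4-bit range is [-8, 7]; qmax = 7
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale                               # dequantized ("fake quant") values

def per_token_dynamic(x, bits=4):
    # x: (tokens, hidden); one scale per token, recomputed at every forward pass
    scale = x.abs().amax(dim=1, keepdim=True) / (2 ** (bits - 1) - 1)
    return quantize(x, scale.clamp(min=1e-8), bits)

def per_tensor_static(x, calib_scale, bits=4):
    # one pre-computed scale for the whole activation tensor; outlier tokens
    # (which PrefixQuant moves into the KV cache as a prefix) would otherwise
    # inflate this single scale
    return quantize(x, calib_scale, bits)

x = torch.randn(16, 64)
calib_scale = x.abs().max() / 7                    # offline calibration on sample data
print(per_token_dynamic(x).shape, per_tensor_static(x, calib_scale).shape)
```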
7. Mamba Capsule Routing Towards Part-Whole Relational Camouflaged Object Detection
- Author
-
Zhang, Dingwen, Cheng, Liangbo, Liu, Yi, Wang, Xinggang, and Han, Junwei
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
The part-whole relational property endowed by Capsule Networks (CapsNets) has been known successful for camouflaged object detection due to its segmentation integrity. However, the previous Expectation Maximization (EM) capsule routing algorithm with heavy computation and large parameters obstructs this trend. The primary attribution behind lies in the pixel-level capsule routing. Alternatively, in this paper, we propose a novel mamba capsule routing at the type level. Specifically, we first extract the implicit latent state in mamba as capsule vectors, which abstract type-level capsules from pixel-level versions. These type-level mamba capsules are fed into the EM routing algorithm to get the high-layer mamba capsules, which greatly reduce the computation and parameters caused by the pixel-level capsule routing for part-whole relationships exploration. On top of that, to retrieve the pixel-level capsule features for further camouflaged prediction, we achieve this on the basis of the low-layer pixel-level capsules with the guidance of the correlations from adjacent-layer type-level mamba capsules. Extensive experiments on three widely used COD benchmark datasets demonstrate that our method significantly outperforms state-of-the-arts. Code has been available on https://github.com/Liangbo-Cheng/mamba\_capsule.
- Published
- 2024
8. Agent-Driven Large Language Models for Mandarin Lyric Generation
- Author
-
Liu, Hong-Hsiang and Liu, Yi-Wen
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Generative Large Language Models have shown impressive in-context learning abilities, performing well across various tasks with just a prompt. Previous melody-to-lyric research has been limited by scarce high-quality aligned data and unclear standard for creativeness. Most efforts focused on general themes or emotions, which are less valuable given current language model capabilities. In tonal contour languages like Mandarin, pitch contours are influenced by both melody and tone, leading to variations in lyric-melody fit. Our study, validated by the Mpop600 dataset, confirms that lyricists and melody writers consider this fit during their composition process. In this research, we developed a multi-agent system that decomposes the melody-to-lyric task into sub-tasks, with each agent controlling rhyme, syllable count, lyric-melody alignment, and consistency. Listening tests were conducted via a diffusion-based singing voice synthesizer to evaluate the quality of lyrics generated by different agent groups., Comment: 6 pages, figures, Accepted at O-COCOSDA 2024
- Published
- 2024
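One of the sub-task agents described in the lyric-generation entry above controls syllable count. In Mandarin each Chinese character corresponds to one syllable, so a toy constraint checker can verify that a candidate lyric line matches the number of melody notes; the function names are illustrative, not from the paper.

```python
# Toy syllable-count check for a Mandarin lyric line against a melody's note count.
import re

def mandarin_syllable_count(line: str) -> int:
    """Count CJK characters (one character = one syllable), ignoring punctuation."""
    return len(re.findall(r"[\u4e00-\u9fff]", line))

def fits_melody(line: str, n_notes: int) -> bool:
    return mandarin_syllable_count(line) == n_notes

print(mandarin_syllable_count("月亮代表我的心"))   # 7
print(fits_melody("月亮代表我的心", 7))            # True
```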
9. High-order primal mixed finite element method for boundary-value correction on curved domain
- Author
-
Hou, Yongli, Liu, Yi, and Zhao, Tengjin
- Subjects
Mathematics - Numerical Analysis ,65N15, 65N30 ,G.1.8 - Abstract
This paper addresses the non-homogeneous Neumann boundary condition on domains with curved boundaries. We consider the Raviart-Thomas element (RT$_k$) of degree $k \geq 1$ on a triangular mesh. A key feature of our boundary value correction method is the shift from the true boundary to a surrogate boundary. We present a high-order version of the method, achieving an $O(h^{k+1/2})$ convergence rate in the $L^2$-norm for the velocity field and an $O(h^k)$ rate in the $H^1$-norm for the pressure. Finally, numerical experiments validate our theoretical results., Comment: 21 pages, 2 figures
- Published
- 2024
10. FlipGuard: Defending Preference Alignment against Update Regression with Constrained Optimization
- Author
-
Zhu, Mingye, Liu, Yi, Wang, Quan, Guo, Junbo, and Mao, Zhendong
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Recent breakthroughs in preference alignment have significantly improved Large Language Models' ability to generate texts that align with human preferences and values. However, current alignment metrics typically emphasize the post-hoc overall improvement, while overlooking a critical aspect: regression, which refers to the backsliding on previously correctly-handled data after updates. This potential pitfall may arise from excessive fine-tuning on already well-aligned data, which subsequently leads to over-alignment and degeneration. To address this challenge, we propose FlipGuard, a constrained optimization approach to detect and mitigate update regression with focal attention. Specifically, FlipGuard identifies performance degradation using a customized reward characterization and strategically enforces a constraint to encourage conditional congruence with the pre-aligned model during training. Comprehensive experiments demonstrate that FlipGuard effectively alleviates update regression while demonstrating excellent overall performance, with the added benefit of knowledge preservation while aligning preferences., Comment: Accepted by EMNLP 2024 Main track
- Published
- 2024
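The FlipGuard entry above describes penalising regressions ("flips") while staying close to the pre-aligned model. The sketch below is a hedged, generic rendering of that idea: a distillation-style KL term toward the pre-aligned policy, applied only to examples whose reward dropped. All names (flipguard_style_loss, lambda_) are illustrative, and this is not the published FlipGuard objective.

```python
# Generic "penalise only the flipped examples" loss sketch.
import torch
import torch.nn.functional as F

def flipguard_style_loss(align_loss, new_logits, pre_logits, new_reward, pre_reward, lambda_=1.0):
    """align_loss: scalar preference-alignment loss for the batch.
    new_logits / pre_logits: (batch, vocab) logits of the current and pre-aligned models.
    new_reward / pre_reward: (batch,) per-example reward scores."""
    flipped = (new_reward < pre_reward).float()                  # focal attention on regressions
    kl = F.kl_div(F.log_softmax(new_logits, dim=-1),
                  F.softmax(pre_logits, dim=-1),
                  reduction="none").sum(dim=-1)                  # per-example KL to pre-aligned model
    return align_loss + lambda_ * (flipped * kl).mean()

loss = flipguard_style_loss(torch.tensor(0.7),
                            torch.randn(4, 10), torch.randn(4, 10),
                            torch.rand(4), torch.rand(4))
print(loss.item())
```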
11. Orbital-FFLO State and Josephson Vortex Lattice Melting in Layered Ising Superconductors
- Author
-
Yan, Hongyi, Liu, Haiwen, Liu, Yi, Zhang, Ding, and Xie, X. C.
- Subjects
Condensed Matter - Superconductivity ,Condensed Matter - Statistical Mechanics - Abstract
This study explores the impact of in-plane magnetic fields on the superconducting state in layered Ising superconductors, resulting in the emergence of the orbital Fulde-Ferrell-Larkin-Ovchinnikov (FFLO) state coupled with Josephson vortices. Recent experiments have revealed an unexpected first-order phase transition in these superconductors under strong in-plane magnetic fields. Our theoretical analysis demonstrates that this phase transition is primarily driven by the formation and subsequent melting of a Josephson vortex lattice within the superconducting layers. As the magnetic field increases, the vortex lattice undergoes a transition from a solid to a liquid state, triggering the observed first-order phase transition. We calculate both the melting line and the in-plane critical field in the phase diagram, showing strong agreement with experimental results.
- Published
- 2024
12. On the Euler class one conjecture for fillable contact structures
- Author
-
Liu, Yi
- Subjects
Mathematics - Geometric Topology ,Primary 57K32, 57K18, Secondary 57K33 - Abstract
In this paper, it is proved that every oriented closed hyperbolic $3$--manifold $N$ admits some finite cover $M$ with the following property. There exists some even lattice point $w$ on the boundary of the dual Thurston norm unit ball of $M$, such that $w$ is not the real Euler class of any weakly symplectically fillable contact structure on $M$. In particular, $w$ is not the real Euler class of any transversely oriented, taut foliation on $M$. This supplies new counter-examples to Thurston's Euler class one conjecture., Comment: 20 pages; comments welcome
- Published
- 2024
13. Cost-Effective Community-Hierarchy-Based Mutual Voting Approach for Influence Maximization in Complex Networks
- Author
-
Liu, Yi, Tang, Xiaoan, Pedrycz, Witold, and Zhang, Qiang
- Subjects
Computer Science - Social and Information Networks ,Computer Science - Information Retrieval - Abstract
Various types of promising techniques have come into being for influence maximization whose aim is to identify influential nodes in complex networks. In essence, real-world applications usually have high requirements on the balance between time complexity and accuracy of influential nodes identification. To address the challenges of imperfect node influence measurement and inefficient seed nodes selection mechanism in such class of foregoing techniques, this article proposes a novel approach called Cost-Effective Community-Hierarchy-Based Mutual Voting for influence maximization in complex networks. First, we develop a method for measuring the importance of different nodes in networks based on an original concept of Dual-Scale Community-Hierarchy Information that synthesizes both hierarchy structural information and community structural information of nodes. The community structural information contained in the nodes is measured by a new notion of Hierarchical-Community Entropy. Second, we develop a method named Cost-Effective Mutual-Influence-based Voting for seed nodes selection. Hereinto, a low-computational-cost mutual voting mechanism and an updating strategy called Lazy Score Updating Strategy are newly constructed for optimizing the selecting of seed nodes. Third, we develop a balance index to evaluate the performance of different methods in striking the tradeoff between time complexity and the accuracy of influential nodes identification. Finally, we demonstrate the approach performance over ten public datasets. The extensive experiments show that the proposed approach outperforms 16 state-of-the-art techniques on the balance between time complexity and accuracy of influential nodes identification. Compared with the method with the second highest value of the balance index, our approach can be improved by at most 9.29%.
- Published
- 2024
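The seed-selection mechanism in the influence-maximization entry above can be illustrated with a generic voting-style greedy loop: nodes accumulate votes from neighbours, and after each seed is chosen only its local neighbourhood's scores are refreshed (a simple stand-in for the paper's Lazy Score Updating). The scoring here is plain degree-based and this is not the paper's algorithm.

```python
# Toy voting-based seed selection with local ("lazy") score refreshes.
def vote_based_seeds(adj, k):
    """adj: dict node -> set of neighbours (undirected); returns k seed nodes."""
    ability = {v: 1.0 for v in adj}                                  # each node's voting ability
    score = {v: sum(ability[u] for u in adj[v]) for v in adj}        # votes received
    seeds = []
    for _ in range(k):
        s = max((v for v in adj if v not in seeds), key=lambda v: score[v])
        seeds.append(s)
        ability[s] = 0.0                          # a chosen seed stops voting
        for u in adj[s]:                          # refresh only the local neighbourhood
            score[u] = sum(ability[w] for w in adj[u])
    return seeds

adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2, 4}, 4: {3}}
print(vote_based_seeds(adj, 2))   # e.g. [0, 2]
```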
14. Bridging the Gap: GRB 230812B -- A Three-Second Supernova-Associated Burst Detected by the GRID Mission
- Author
-
Wang, Chen-Yu, Yin, Yi-Han Iris, Zhang, Bin-Bin, Feng, Hua, Zeng, Ming, Xiong, Shao-Lin, Pan, Xiao-Fan, Yang, Jun, Zhang, Yan-Qiu, Li, Chen, Yan, Zhen-Yu, Wang, Chen-Wei, Zheng, Xu-Tao, Liu, Jia-Cong, Wang, Qi-Dong, Yang, Zi-Rui, Li, Long-Hao, Liu, Qi-Ze, Zhao, Zheng-Yang, Hu, Bo, Liu, Yi-Qi, Lu, Si-Yuan, Luo, Zi-You, Cang, Ji-Rong, Cao, De-Zhi, Han, Wen-Tao, Jia, Li-Ping, Pan, Xing-Yu, Tian, Yang, Xu, Ben-Da, Yang, Xiao, and Zeng, Zhi
- Subjects
Astrophysics - High Energy Astrophysical Phenomena - Abstract
GRB 230812B, detected by the Gamma-Ray Integrated Detectors (GRID) constellation mission, is an exceptionally bright gamma-ray burst (GRB) with a duration of only 3 seconds. Sitting near the traditional boundary ($\sim$ 2 s) between long and short GRBs, GRB 230812B is notably associated with a supernova (SN), indicating a massive star progenitor. This makes it a rare example of a short-duration GRB resulting from stellar collapse. Our analysis, using a time-evolving synchrotron model, suggests that the burst has an emission radius of approximately $10^{14.5}$~cm. We propose that the short duration of GRB 230812B is due to the combined effects of the central engine's activity time and the time required for the jet to break through the stellar envelope. Our findings provide another case that challenges the conventional view that short-duration GRBs originate exclusively from compact object mergers, demonstrating that a broader range of durations exists for GRBs arising from the collapse of massive stars., Comment: 10 pages, 3 tables, 11 figures
- Published
- 2024
15. Noise-aware Dynamic Image Denoising and Positron Range Correction for Rubidium-82 Cardiac PET Imaging via Self-supervision
- Author
-
Xie, Huidong, Guo, Liang, Velo, Alexandre, Liu, Zhao, Liu, Qiong, Guo, Xueqi, Zhou, Bo, Chen, Xiongchao, Tsai, Yu-Jung, Miao, Tianshun, Xia, Menghua, Liu, Yi-Hwa, Armstrong, Ian S., Wang, Ge, Carson, Richard E., Sinusas, Albert J., and Liu, Chi
- Subjects
Electrical Engineering and Systems Science - Image and Video Processing - Abstract
Rb-82 is a radioactive isotope widely used for cardiac PET imaging. Despite numerous benefits of 82-Rb, there are several factors that limit its image quality and quantitative accuracy. First, the short half-life of 82-Rb results in noisy dynamic frames. Low signal-to-noise ratio would result in inaccurate and biased image quantification. Noisy dynamic frames also lead to highly noisy parametric images. The noise levels also vary substantially in different dynamic frames due to radiotracer decay and short half-life. Existing denoising methods are not applicable for this task due to the lack of paired training inputs/labels and inability to generalize across varying noise levels. Second, 82-Rb emits high-energy positrons. Compared with other tracers such as 18-F, 82-Rb travels a longer distance before annihilation, which negatively affects image spatial resolution. Here, the goal of this study is to propose a self-supervised method for simultaneous (1) noise-aware dynamic image denoising and (2) positron range correction for 82-Rb cardiac PET imaging. Tested on a series of PET scans from a cohort of normal volunteers, the proposed method produced images with superior visual quality. To demonstrate the improvement in image quantification, we compared image-derived input functions (IDIFs) with arterial input functions (AIFs) from continuous arterial blood samples. The IDIF derived from the proposed method led to lower AUC differences, decreasing from 11.09% to 7.58% on average, compared to the original dynamic frames. The proposed method also improved the quantification of myocardium blood flow (MBF), as validated against 15-O-water scans, with mean MBF differences decreased from 0.43 to 0.09, compared to the original dynamic frames. We also conducted a generalizability experiment on 37 patient scans obtained from a different country using a different scanner., Comment: 15 Pages, 10 Figures, 5 tables. Paper Under review. Oral Presentation at IEEE MIC 2023
- Published
- 2024
16. Range-SLAM: Ultra-Wideband-Based Smoke-Resistant Real-Time Localization and Mapping
- Author
-
Liu, Yi, Jian, Zhuozhu, Zheng, Shengtao, Liu, Houde, Wang, Xueqian, Chen, Xinlei, and Liang, Bin
- Subjects
Computer Science - Robotics - Abstract
This paper presents Range-SLAM, a real-time, lightweight SLAM system designed to address the challenges of localization and mapping in environments with smoke and other harsh conditions using Ultra-Wideband (UWB) signals. While optical sensors like LiDAR and cameras struggle in low-visibility environments, UWB signals provide a robust alternative for real-time positioning. The proposed system uses general UWB devices to achieve accurate mapping and localization without relying on expensive LiDAR or other dedicated hardware. By utilizing only the distance and Received Signal Strength Indicator (RSSI) provided by UWB sensors in relation to anchors, we combine the motion of the tag-carrying agent with raycasting algorithm to construct a 2D occupancy grid map in real time. To enhance localization in challenging conditions, a Weighted Least Squares (WLS) method is employed. Extensive real-world experiments, including smoke-filled environments and simulated
- Published
- 2024
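The localization step in the Range-SLAM entry above relies on weighted least squares over UWB anchor-to-tag ranges. Below is a generic WLS multilateration sketch: the range equations are linearised against a reference anchor and solved with per-equation weights standing in for an RSSI-derived confidence. It illustrates the technique only and is not the Range-SLAM code.

```python
# Weighted least-squares position estimate from ranges to known anchors.
import numpy as np

def wls_position(anchors, ranges, weights):
    """anchors: (n, 2) known positions; ranges: (n,) measured distances;
    weights: (n-1,) confidence of each linearised equation."""
    # Subtracting the first range equation from the others removes the |x|^2 term:
    # 2 (a_i - a_0)^T x = (|a_i|^2 - |a_0|^2) - (r_i^2 - r_0^2)
    a0, r0 = anchors[0], ranges[0]
    A = 2.0 * (anchors[1:] - a0)
    b = np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2) - (ranges[1:] ** 2 - r0 ** 2)
    W = np.diag(weights)
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_pos = np.array([3.0, 4.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1)      # noiseless ranges for the demo
print(wls_position(anchors, ranges, np.ones(3)))          # ~ [3., 4.]
```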
17. Optically-Validated Microvascular Phantom for Super-Resolution Ultrasound Imaging
- Author
-
Raad, Jaime Parra, Lock, Daniel, Liu, Yi-Yi, Solomon, Mark, Peralta, Laura, and Christensen-Jeffries, Kirsten
- Subjects
Physics - Medical Physics - Abstract
Super-resolution ultrasound (SRUS) visualises microvasculature beyond the ultrasound diffraction limit (wavelength($\lambda$)/2) by localising and tracking spatially isolated microbubble contrast agents. SRUS phantoms typically consist of simple tube structures, where diameter channels below 100 $\mu$m are not available. Furthermore, these phantoms are generally fragile and unstable, have limited ground truth validation, and their simple structure limits the evaluation of SRUS algorithms. To aid SRUS development, robust and durable phantoms with known and physiologically relevant microvasculature are needed for repeatable SRUS testing. This work proposes a method to fabricate durable microvascular phantoms that allow optical gauging for SRUS validation. The methodology used a microvasculature negative print embedded in a Polydimethylsiloxane to fabricate a microvascular phantom. Branching microvascular phantoms with variable microvascular density were demonstrated with optically validated vessel diameters down to $\sim$ 60 $\mu$m ($\lambda$/5.8; $\lambda$ =$\sim$ 350 $\mu$m). SRUS imaging was performed and validated with optical measurements. The average SRUS error was 15.61 $\mu$m ($\lambda$/22) with a standard deviation error of 11.44 $\mu$m. The average error decreased to 7.93 $\mu$m ($\lambda$/44) once the number of localised microbubbles surpassed 1000 per estimated diameter. In addition, the less than 10$\%$ variance of acoustic and optical properties and the mechanical toughness of the phantoms measured a year after fabrication demonstrated their long-term durability. This work presents a method to fabricate durable and optically validated complex microvascular phantoms which can be used to quantify SRUS performance and facilitate its further development., Comment: This work has been submitted to the IEEE for possible publication
- Published
- 2024
18. Knowledge Distillation via Query Selection for Detection Transformer
- Author
-
Liu, Yi, Wang, Luting, Tang, Zongheng, Liao, Yue, Sun, Yifan, Zhang, Lijun, and Liu, Si
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Transformers have revolutionized the object detection landscape by introducing DETRs, acclaimed for their simplicity and efficacy. Despite their advantages, the substantial size of these models poses significant challenges for practical deployment, particularly in resource-constrained environments. This paper addresses the challenge of compressing DETR by leveraging knowledge distillation, a technique that holds promise for maintaining model performance while reducing size. A critical aspect of DETRs' performance is their reliance on queries to interpret object representations accurately. Traditional distillation methods often focus exclusively on positive queries, identified through bipartite matching, neglecting the rich information present in hard-negative queries. Our visual analysis indicates that hard-negative queries, focusing on foreground elements, are crucial for enhancing distillation outcomes. To this end, we introduce a novel Group Query Selection strategy, which diverges from traditional query selection in DETR distillation by segmenting queries based on their Generalized Intersection over Union (GIoU) with ground truth objects, thereby uncovering valuable hard-negative queries for distillation. Furthermore, we present the Knowledge Distillation via Query Selection for DETR (QSKD) framework, which incorporates Attention-Guided Feature Distillation (AGFD) and Local Alignment Prediction Distillation (LAPD). These components optimize the distillation process by focusing on the most informative aspects of the teacher model's intermediate features and output. Our comprehensive experimental evaluation of the MS-COCO dataset demonstrates the effectiveness of our approach, significantly improving average precision (AP) across various DETR architectures without incurring substantial computational costs. Specifically, the AP of Conditional DETR ResNet-18 increased from 35.8 to 39.9.
- Published
- 2024
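The Group Query Selection idea in the distillation entry above hinges on scoring each query's predicted box against ground truth with Generalized IoU (GIoU). A small self-contained sketch of GIoU and a threshold-based grouping follows; the threshold value is an illustrative assumption, not the paper's setting.

```python
# Generalized IoU and a simple GIoU-based query grouping.
def giou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2). Returns GIoU in [-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    cw = max(ax2, bx2) - min(ax1, bx1)          # smallest enclosing box C
    ch = max(ay2, by2) - min(ay1, by1)
    c_area = cw * ch
    return inter / union - (c_area - union) / c_area

def group_queries(pred_boxes, gt_box, thresh=0.0):
    """Split query predictions into high-GIoU and low-GIoU groups w.r.t. one GT box."""
    scores = [giou(b, gt_box) for b in pred_boxes]
    high = [i for i, s in enumerate(scores) if s >= thresh]
    low = [i for i, s in enumerate(scores) if s < thresh]
    return high, low

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))   # IoU 1/7, enclosing 3x3 -> GIoU ~ -0.079
```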
19. Towards Practical Overlay Networks for Decentralized Federated Learning
- Author
-
Hua, Yifan, Pang, Jinlong, Zhang, Xiaoxue, Liu, Yi, Shi, Xiaofeng, Wang, Bao, Liu, Yang, and Qian, Chen
- Subjects
Computer Science - Distributed, Parallel, and Cluster Computing ,Computer Science - Networking and Internet Architecture - Abstract
Decentralized federated learning (DFL) uses peer-to-peer communication to avoid the single point of failure problem in federated learning and has been considered an attractive solution for machine learning tasks on distributed devices. We provide the first solution to a fundamental network problem of DFL: what overlay network should DFL use to achieve fast training of highly accurate models, low communication, and decentralized construction and maintenance? Overlay topologies of DFL have been investigated, but no existing DFL topology includes decentralized protocols for network construction and topology maintenance. Without these protocols, DFL cannot run in practice. This work presents an overlay network, called FedLay, which provides fast training and low communication cost for practical DFL. FedLay is the first solution for constructing near-random regular topologies in a decentralized manner and maintaining the topologies under node joins and failures. Experiments based on prototype implementation and simulations show that FedLay achieves the fastest model convergence and highest accuracy on real datasets compared to existing DFL solutions while incurring small communication costs and being resilient to node joins and failures.
- Published
- 2024
20. How Privacy-Savvy Are Large Language Models? A Case Study on Compliance and Privacy Technical Review
- Author
-
Zhu, Xichou, Liu, Yang, Shen, Zhou, Liu, Yi, Li, Min, Chen, Yujun, John, Benzi, Ma, Zhenzhen, Hu, Tao, Li, Zhi, Yang, Bolong, Wang, Manman, Xie, Zongxing, Liu, Peng, Cai, Dan, and Wang, Junhui
- Subjects
Computer Science - Computation and Language - Abstract
The recent advances in large language models (LLMs) have significantly expanded their applications across various fields such as language generation, summarization, and complex question answering. However, their application to privacy compliance and technical privacy reviews remains under-explored, raising critical concerns about their ability to adhere to global privacy standards and protect sensitive user data. This paper seeks to address this gap by providing a comprehensive case study evaluating LLMs' performance in privacy-related tasks such as privacy information extraction (PIE), legal and regulatory key point detection (KPD), and question answering (QA) with respect to privacy policies and data protection regulations. We introduce a Privacy Technical Review (PTR) framework, highlighting its role in mitigating privacy risks during the software development life-cycle. Through an empirical assessment, we investigate the capacity of several prominent LLMs, including BERT, GPT-3.5, GPT-4, and custom models, in executing privacy compliance checks and technical privacy reviews. Our experiments benchmark the models across multiple dimensions, focusing on their precision, recall, and F1-scores in extracting privacy-sensitive information and detecting key regulatory compliance points. While LLMs show promise in automating privacy reviews and identifying regulatory discrepancies, significant gaps persist in their ability to fully comply with evolving legal standards. We provide actionable recommendations for enhancing LLMs' capabilities in privacy compliance, emphasizing the need for robust model improvements and better integration with legal and regulatory requirements. This study underscores the growing importance of developing privacy-aware LLMs that can both support businesses in compliance efforts and safeguard user privacy rights., Comment: 8 pages, 4 figures
- Published
- 2024
21. Do Large Language Models Possess Sensitive to Sentiment?
- Author
-
Liu, Yang, Zhu, Xichou, Shen, Zhou, Liu, Yi, Li, Min, Chen, Yujun, John, Benzi, Ma, Zhenzhen, Hu, Tao, Li, Zhi, Xu, Zhiyang, Luo, Wei, and Wang, Junhui
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large Language Models (LLMs) have recently displayed their extraordinary capabilities in language understanding. However, how to comprehensively assess the sentiment capabilities of LLMs continues to be a challenge. This paper investigates the ability of LLMs to detect and react to sentiment in text modal. As the integration of LLMs into diverse applications is on the rise, it becomes highly critical to comprehend their sensitivity to emotional tone, as it can influence the user experience and the efficacy of sentiment-driven tasks. We conduct a series of experiments to evaluate the performance of several prominent LLMs in identifying and responding appropriately to sentiments like positive, negative, and neutral emotions. The models' outputs are analyzed across various sentiment benchmarks, and their responses are compared with human evaluations. Our discoveries indicate that although LLMs show a basic sensitivity to sentiment, there are substantial variations in their accuracy and consistency, emphasizing the requirement for further enhancements in their training processes to better capture subtle emotional cues. Take an example in our findings, in some cases, the models might wrongly classify a strongly positive sentiment as neutral, or fail to recognize sarcasm or irony in the text. Such misclassifications highlight the complexity of sentiment analysis and the areas where the models need to be refined. Another aspect is that different LLMs might perform differently on the same set of data, depending on their architecture and training datasets. This variance calls for a more in-depth study of the factors that contribute to the performance differences and how they can be optimized., Comment: 10 pages, 2 figures
- Published
- 2024
22. Efficient Detection of Toxic Prompts in Large Language Models
- Author
-
Liu, Yi, Yu, Junzhe, Sun, Huijia, Shi, Ling, Deng, Gelei, Chen, Yuqi, and Liu, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language ,Computer Science - Software Engineering - Abstract
Large language models (LLMs) like ChatGPT and Gemini have significantly advanced natural language processing, enabling various applications such as chatbots and automated content generation. However, these models can be exploited by malicious individuals who craft toxic prompts to elicit harmful or unethical responses. These individuals often employ jailbreaking techniques to bypass safety mechanisms, highlighting the need for robust toxic prompt detection methods. Existing detection techniques, both blackbox and whitebox, face challenges related to the diversity of toxic prompts, scalability, and computational efficiency. In response, we propose ToxicDetector, a lightweight greybox method designed to efficiently detect toxic prompts in LLMs. ToxicDetector leverages LLMs to create toxic concept prompts, uses embedding vectors to form feature vectors, and employs a Multi-Layer Perceptron (MLP) classifier for prompt classification. Our evaluation on various versions of the LLama models, Gemma-2, and multiple datasets demonstrates that ToxicDetector achieves a high accuracy of 96.39\% and a low false positive rate of 2.00\%, outperforming state-of-the-art methods. Additionally, ToxicDetector's processing time of 0.0780 seconds per prompt makes it highly suitable for real-time applications. ToxicDetector achieves high accuracy, efficiency, and scalability, making it a practical method for toxic prompt detection in LLMs., Comment: Accepted by the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE 2024)
- Published
- 2024
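The ToxicDetector pipeline described above (embedding vectors as features, an MLP as classifier) can be sketched as follows. The feature construction here, concatenating per-layer last-token embeddings, is a placeholder assumption; only the overall embeddings-plus-MLP structure is taken from the abstract.

```python
# Embedding-feature + MLP prompt classifier sketch.
import torch
import torch.nn as nn

class PromptMLP(nn.Module):
    def __init__(self, feat_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),            # toxic vs. benign logits
        )

    def forward(self, feats):
        return self.net(feats)

# Assume hidden states from L layers, each of dimension d, for one prompt.
L, d = 4, 128
layer_embeddings = [torch.randn(d) for _ in range(L)]   # placeholder for model activations
features = torch.cat(layer_embeddings).unsqueeze(0)     # (1, L*d) feature vector
clf = PromptMLP(feat_dim=L * d)
print(clf(features).softmax(dim=-1))                     # predicted class probabilities
```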
23. Image-Based Geolocation Using Large Vision-Language Models
- Author
-
Liu, Yi, Ding, Junchen, Deng, Gelei, Li, Yuekang, Zhang, Tianwei, Sun, Weisong, Zheng, Yaowen, Ge, Jingquan, and Liu, Yang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Computation and Language ,Computer Science - Computer Vision and Pattern Recognition - Abstract
Geolocation is now a vital aspect of modern life, offering numerous benefits but also presenting serious privacy concerns. The advent of large vision-language models (LVLMs) with advanced image-processing capabilities introduces new risks, as these models can inadvertently reveal sensitive geolocation information. This paper presents the first in-depth study analyzing the challenges posed by traditional deep learning and LVLM-based geolocation methods. Our findings reveal that LVLMs can accurately determine geolocations from images, even without explicit geographic training. To address these challenges, we introduce \tool{}, an innovative framework that significantly enhances image-based geolocation accuracy. \tool{} employs a systematic chain-of-thought (CoT) approach, mimicking human geoguessing strategies by carefully analyzing visual and contextual cues such as vehicle types, architectural styles, natural landscapes, and cultural elements. Extensive testing on a dataset of 50,000 ground-truth data points shows that \tool{} outperforms both traditional models and human benchmarks in accuracy. It achieves an impressive average score of 4550.5 in the GeoGuessr game, with an 85.37\% win rate, and delivers highly precise geolocation predictions, with the closest distances as accurate as 0.3 km. Furthermore, our study highlights issues related to dataset integrity, leading to the creation of a more robust dataset and a refined framework that leverages LVLMs' cognitive capabilities to improve geolocation precision. These findings underscore \tool{}'s superior ability to interpret complex visual data, the urgent need to address emerging security vulnerabilities posed by LVLMs, and the importance of responsible AI development to ensure user privacy protection.
- Published
- 2024
24. Characterizing and Evaluating the Reliability of LLMs against Jailbreak Attacks
- Author
-
Chen, Kexin, Liu, Yi, Wang, Dongxia, Chen, Jiaying, and Wang, Wenhai
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Software Engineering - Abstract
Large Language Models (LLMs) have increasingly become pivotal in content generation with notable societal impact. These models hold the potential to generate content that could be deemed harmful. Efforts to mitigate this risk include implementing safeguards to ensure LLMs adhere to social ethics. However, despite such measures, the phenomenon of "jailbreaking" -- where carefully crafted prompts elicit harmful responses from models -- persists as a significant challenge. Recognizing the continuous threat posed by jailbreaking tactics and their repercussions for the trustworthy use of LLMs, a rigorous assessment of the models' robustness against such attacks is essential. This study introduces a comprehensive evaluation framework and conducts a large-scale empirical experiment to address this need. We concentrate on 10 cutting-edge jailbreak strategies across three categories, 1525 questions from 61 specific harmful categories, and 13 popular LLMs. We adopt multi-dimensional metrics such as Attack Success Rate (ASR), Toxicity Score, Fluency, Token Length, and Grammatical Errors to thoroughly assess the LLMs' outputs under jailbreak. By normalizing and aggregating these metrics, we present a detailed reliability score for different LLMs, coupled with strategic recommendations to reduce their susceptibility to such vulnerabilities. Additionally, we explore the relationships among the models, attack strategies, and types of harmful content, as well as the correlations between the evaluation metrics, which proves the validity of our multifaceted evaluation framework. Our extensive experimental results demonstrate a lack of resilience among all tested LLMs against certain strategies, and highlight the need to concentrate on the reliability facets of LLMs. We believe our study can provide valuable insights into enhancing the security evaluation of LLMs against jailbreak within the domain.
- Published
- 2024
25. Alignment-Enhanced Decoding:Defending via Token-Level Adaptive Refining of Probability Distributions
- Author
-
Liu, Quan, Zhou, Zhenhong, He, Longzhu, Liu, Yi, Zhang, Wei, and Su, Sen
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Large language models are susceptible to jailbreak attacks, which can result in the generation of harmful content. While prior defenses mitigate these risks by perturbing or inspecting inputs, they ignore competing objectives, the underlying cause of alignment failures. In this paper, we propose Alignment-Enhanced Decoding (AED), a novel defense that employs adaptive decoding to address the root causes of jailbreak issues. We first define the Competitive Index to quantify alignment failures and utilize feedback from self-evaluation to compute post-alignment logits. Then, AED adaptively combines AED and post-alignment logits with the original logits to obtain harmless and helpful distributions. Consequently, our method enhances safety alignment while maintaining helpfulness. We conduct experiments across five models and four common jailbreaks, with the results validating the effectiveness of our approach. Code is available at https://github.com/GIGABaozi/AED.git., Comment: 15 pages, 5 figures
- Published
- 2024
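The decoding-time defence in the AED entry above amounts to refining the next-token distribution at every step. A minimal, heavily hedged sketch of token-level adaptive logit combination is given below; how the post-alignment logits and the weight are actually obtained (the paper's Competitive Index and self-evaluation feedback) is not reproduced here.

```python
# Token-level interpolation of original and "post-alignment" logits.
import torch

def refine_logits(original_logits, post_align_logits, alpha):
    """original_logits, post_align_logits: (vocab,); alpha in [0, 1] is a per-step weight."""
    combined = (1.0 - alpha) * original_logits + alpha * post_align_logits
    return torch.softmax(combined, dim=-1)       # refined next-token distribution

probs = refine_logits(torch.randn(32), torch.randn(32), alpha=0.3)
print(probs.sum())   # ~ 1.0
```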
26. MooER: LLM-based Speech Recognition and Translation Models from Moore Threads
- Author
-
Xu, Junhao, Liang, Zhenlin, Liu, Yi, Hu, Yichao, Li, Jian, Zheng, Yajun, Cai, Meng, and Wang, Hua
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
In this paper, we present MooER, an LLM-based large-scale automatic speech recognition (ASR) / automatic speech translation (AST) model from Moore Threads. A 5,000-hour pseudo-labeled dataset containing open-source and self-collected speech data is used for training. We achieve performance comparable to other open-source models trained with up to hundreds of thousands of hours of labeled speech data. Meanwhile, experiments conducted on the Covost2 Zh2en test set suggest that our model outperforms other open-source Speech LLMs, with a BLEU score of 25.2. The main contributions of this paper are summarized as follows. First, this paper presents a training strategy for encoders and LLMs on speech-related tasks (including ASR and AST) using a small amount of pseudo-labeled data without any extra manual annotation and selection. Second, we release our ASR and AST models and plan to open-source our training code and strategy in the near future. Moreover, a model trained on roughly 80,000 hours of training data is planned to be released later on.
- Published
- 2024
27. Integrated Dynamic Phenological Feature for Remote Sensing Image Land Cover Change Detection
- Author
-
Liu, Yi, Sun, Chenhao, Ye, Hao, Liu, Xiangying, and Ju, Weilong
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Remote sensing image change detection (CD) is essential for analyzing land surface changes over time, with a significant challenge being the differentiation of actual changes from complex scenes while filtering out pseudo-changes. A primary contributor to this challenge is the intra-class dynamic changes due to phenological characteristics in natural areas. To overcome this, we introduce the InPhea model, which integrates phenological features into a remote sensing image CD framework. The model features a detector with a differential attention module for improved feature representation of change information, coupled with high-resolution feature extraction and spatial pyramid blocks to enhance performance. Additionally, a constrainer with four constraint modules and a multi-stage contrastive learning approach is employed to aid in the model's understanding of phenological characteristics. Experiments on the HRSCD, SECD, and PSCD-Wuhan datasets reveal that InPhea outperforms other models, confirming its effectiveness in addressing phenological pseudo-changes and its overall model superiority.
- Published
- 2024
28. Resilience-Runtime Tradeoff Relations for Quantum Algorithms
- Author
-
García-Pintos, Luis Pedro, O'Leary, Tom, Biswas, Tanmoy, Bringewatt, Jacob, Cincio, Lukasz, Brady, Lucas T., and Liu, Yi-Kai
- Subjects
Quantum Physics - Abstract
A leading approach to algorithm design aims to minimize the number of operations in an algorithm's compilation. One intuitively expects that reducing the number of operations may decrease the chance of errors. This paradigm is particularly prevalent in quantum computing, where gates are hard to implement and noise rapidly decreases a quantum computer's potential to outperform classical computers. Here, we find that minimizing the number of operations in a quantum algorithm can be counterproductive, leading to a noise sensitivity that induces errors when running the algorithm in non-ideal conditions. To show this, we develop a framework to characterize the resilience of an algorithm to perturbative noises (including coherent errors, dephasing, and depolarizing noise). Some compilations of an algorithm can be resilient against certain noise sources while being unstable against other noises. We condense these results into a tradeoff relation between an algorithm's number of operations and its noise resilience. We also show how this framework can be leveraged to identify compilations of an algorithm that are better suited to withstand certain noises.
- Published
- 2024
29. On the Effect of Driving Turbulent-like Fluctuations on a Harris-Current Sheet Configuration and the Formation of Plasmoids
- Author
-
Rueda, Jeffersson Andres Agudelo, Liu, Yi-Hsin, Germaschewski, Kai, Hesse, Michael, and Bessho, Naoki
- Subjects
Physics - Plasma Physics - Abstract
Energy dissipation in collisionless plasmas is one of the most outstanding open questions in plasma physics. Magnetic reconnection and turbulence are two phenomena that can produce the conditions for energy dissipation. These two phenomena are closely related to each other in a wide range of plasmas. Turbulent fluctuations can emerge in critical regions of reconnection events, and magnetic reconnection can occur as a product of the turbulent cascade. In this study, we perform 2D particle-in-cell simulations of a reconnecting Harris current sheet in the presence of turbulent fluctuations to explore the effect of turbulence on the reconnection process in collisionless non-relativistic pair-plasmas. We find that the presence of a turbulent field can affect the onset and evolution of magnetic reconnection. Moreover, we observe the existence of a scale dependent amplitude of magnetic field fluctuations above which these fluctuations are able to disrupt the growing of magnetic islands. These fluctuations provide thermal energy to the particles within the current sheet and preferential perpendicular thermal energy to the background population.
- Published
- 2024
30. Enhanced Family Tree: Evolving Research and Expression
- Author
-
Xiang, Fan, Zhu, Shunshan, Wang, Zhigang, Maher, Kevin, Liu, Yi, Zhu, Yilin, Chen, Kaixi, and Liang, Zhiqiang
- Published
- 2020
31. Mechanochemical in Situ Encapsulation of Palladium in Covalent Organic Frameworks
- Author
-
Brown, Normanda, Zhang, Qingsong, Alsudairy, Ziad, Dun, Chaochao, Nailwal, Yogendra, Campbell, Allea, Harrod, Chelsea, Chen, Linfeng, Williams, Spirit, Urban, Jeffrey J, Liu, Yi, and Li, Xinle
- Subjects
Inorganic Chemistry ,Macromolecular and Materials Chemistry ,Organic Chemistry ,Chemical Sciences ,mechanochemistry ,in situ metal encapsulation ,covalent organic frameworks ,palladium ,heterogeneouscatalysis ,Analytical Chemistry ,Environmental Science and Management ,Chemical Engineering ,Analytical chemistry ,Chemical engineering - Abstract
Palladium-encapsulated covalent organic frameworks (Pd/COFs) have garnered enormous attention in heterogeneous catalysis. However, the dominant ex situ encapsulation synthesis is tedious (multistep), time-consuming (typically 4 days or more), and involves the use of noxious solvents. Here we develop a mechanochemical in situ encapsulation strategy that enables the one-step, time-efficient, and environmentally benign synthesis of Pd/COFs. By ball milling COF precursors along with palladium acetate (Pd(OAc)2) in one pot under air at room temperature, Pd/COF hybrids were readily synthesized within an hour, exhibiting high crystallinity, uniform Pd dispersion, and superb scalability up to gram scale. Moreover, this versatile strategy can be extended to the synthesis of three Pd/COFs. Remarkably, the resulting Pd/DMTP-TPB showcases extraordinary activity (96-99% yield in 1 h at room temperature) and broad substrate scope (>10 functionalized biaryls) for the Suzuki-Miyaura coupling reaction of aryl bromides and arylboronic acids. Furthermore, the heterogeneity of Pd/DMTP-TPB is verified by recycling and leaching tests. The mechanochemical in situ encapsulation strategy disclosed herein paves a facile, rapid, scalable, and environmentally benign avenue to access metal/COF catalysts for efficient heterogeneous catalysis.
- Published
- 2024
32. FTuner: A Fast Dynamic Shape Tensors Program Auto-Tuner for Deep Learning Compilers
- Author
-
Mu, Pengyu, Wei, Linquan, Liu, Yi, and Wang, Rui
- Subjects
Computer Science - Machine Learning ,Computer Science - Distributed, Parallel, and Cluster Computing ,68M20 (Primary) - Abstract
Many artificial intelligence models process input data of different lengths and resolutions, making the shape of the tensors dynamic. The performance of these models depends on the shape of the tensors, which makes it difficult to optimize the tensors before the model runs. There are two common solutions to this problem. The first is to add useless data to the input to match a pre-optimized tensor library. The second is to use small basic tensors to create a tensor that is closest in size to the input data and then tune it to minimize padding. However, this second solution can be time-consuming. This paper proposes a new technique for deep learning compilers called FTuner. Instead of using a large design space or training a cost model, we use an abstract computational unit called the uKernel to patch together small, various-sized tensors to match the shape of the input tensor. We determine the shape of the uKernel using an analytic hardware information model. Experiments show that FTuner achieves operator-level and end-to-end performance comparable to vendor libraries and a 3\% speedup over an existing auto-tuner with the model-training compiler, while reducing tuning time by two orders of magnitude., Comment: 14 pages, 16 figures, 6 tables
- Published
- 2024
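The uKernel idea in the FTuner entry above reduces to a covering problem: tile a dynamic dimension with a few fixed small sizes while minimising padding. The toy dynamic-programming sketch below solves that covering problem for illustrative kernel sizes; it is not the FTuner tuner.

```python
# Cover a dynamic length n with fixed kernel sizes, minimising padding (overshoot).
def min_padding_cover(n, kernel_sizes):
    """Return (padding, counts) where counts maps kernel size -> how many are used."""
    # reachable[m] = True if length m can be covered exactly with the given sizes
    reachable = [False] * (n + max(kernel_sizes) + 1)
    parent = [None] * len(reachable)
    reachable[0] = True
    for m in range(1, len(reachable)):
        for k in kernel_sizes:
            if m - k >= 0 and reachable[m - k]:
                reachable[m] = True
                parent[m] = k
                break
    target = next(m for m in range(n, len(reachable)) if reachable[m])
    counts, m = {}, target
    while m > 0:                      # walk back to recover which kernels were used
        k = parent[m]
        counts[k] = counts.get(k, 0) + 1
        m -= k
    return target - n, counts

print(min_padding_cover(100, [16, 24, 40]))   # (4, {16: 5, 24: 1}): covers 104 with 4 padded elements
```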
33. Steering laser-produced THz radiation in air with superluminal ionization fronts
- Author
-
Fu, Silin, Groussin, Baptiste, Liu, Yi, Mysyrowicz, Andre, Tikhonchuk, Vladimir, and Houard, Aurelien
- Subjects
Physics - Optics ,Physics - Plasma Physics - Abstract
We demonstrate that a single-color ultrashort optical pulse propagating in air can emit THz radiation along any direction with respect to its propagation axis. The emission angle can be adjusted by the flying focus technique which determines the speed and direction of the ionization front. When the ionization front velocity becomes superluminal, the THz emission corresponds to classical Cherenkov radiation., Comment: 8 pages, 5 figures
- Published
- 2024
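The steering mechanism in the THz entry above is classical Cherenkov emission from a superluminal ionization front. For reference, the textbook Cherenkov relation (a standard result, not specific to this paper) links the emission angle to the front velocity:

```latex
% Classical Cherenkov geometry for a source moving faster than the radiation's
% phase velocity in the medium: theta_C is the cone half-angle from the axis,
% n the refractive index of air at THz frequencies, v_f the ionization-front velocity.
\[
  \cos\theta_C = \frac{c}{n\,v_f}, \qquad v_f > \frac{c}{n}.
\]
% Tuning v_f (and its sign) with the flying-focus technique therefore steers the
% THz emission angle relative to the propagation axis.
```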
34. SPOLRE: Semantic Preserving Object Layout Reconstruction for Image Captioning System Testing
- Author
-
Liu, Yi, Wang, Guanyu, Zheng, Xinyi, Deng, Gelei, Wang, Kailong, Liu, Yang, and Wang, Haoyu
- Subjects
Computer Science - Software Engineering - Abstract
Image captioning (IC) systems, such as Microsoft Azure Cognitive Service, translate image content into descriptive language but can generate inaccuracies leading to misinterpretations. Advanced testing techniques like MetaIC and ROME aim to address these issues but face significant challenges. These methods require intensive manual labor for detailed annotations and often produce unrealistic images, either by adding unrelated objects or failing to remove existing ones. Additionally, they generate limited test suites, with MetaIC restricted to inserting specific objects and ROME limited to a narrow range of variations. We introduce SPOLRE, a novel automated tool for semantic-preserving object layout reconstruction in IC system testing. SPOLRE leverages four transformation techniques to modify object layouts without altering the image's semantics. This automated approach eliminates the need for manual annotations and creates realistic, varied test suites. Our tests show that over 75% of survey respondents find SPOLRE-generated images more realistic than those from state-of-the-art methods. SPOLRE excels in identifying caption errors, detecting 31,544 incorrect captions across seven IC systems with an average precision of 91.62%, surpassing other methods which average 85.65% accuracy and identify 17,160 incorrect captions. Notably, SPOLRE identified 6,236 unique issues within Azure, demonstrating its effectiveness against one of the most advanced IC systems.
- Published
- 2024
35. RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer
- Author
-
Lv, Wenyu, Zhao, Yian, Chang, Qinyao, Huang, Kui, Wang, Guanzhong, and Liu, Yi
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
In this report, we present RT-DETRv2, an improved Real-Time DEtection TRansformer (RT-DETR). RT-DETRv2 builds upon the previous state-of-the-art real-time detector, RT-DETR, and opens up a set of bag-of-freebies for flexibility and practicality, as well as optimizing the training strategy to achieve enhanced performance. To improve the flexibility, we suggest setting a distinct number of sampling points for features at different scales in the deformable attention to achieve selective multi-scale feature extraction by the decoder. To enhance practicality, we propose an optional discrete sampling operator to replace the grid_sample operator that is specific to RT-DETR compared to YOLOs. This removes the deployment constraints typically associated with DETRs. For the training strategy, we propose dynamic data augmentation and scale-adaptive hyperparameters customization to improve performance without loss of speed. Source code and pre-trained models will be available at https://github.com/lyuwenyu/RT-DETR.
- Published
- 2024
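The "discrete sampling operator" mentioned in the RT-DETRv2 entry above can be contrasted with bilinear grid sampling in a few lines: round each fractional sampling location to the nearest pixel and gather directly, which avoids interpolation-specific operators at deployment time. The snippet below is a generic illustration of that idea, not the RT-DETRv2 operator.

```python
# Nearest-neighbour ("discrete") feature sampling instead of bilinear interpolation.
import torch

def discrete_sample(feat, points):
    """feat: (C, H, W); points: (N, 2) as (x, y) in pixel coordinates."""
    C, H, W = feat.shape
    x = points[:, 0].round().long().clamp(0, W - 1)
    y = points[:, 1].round().long().clamp(0, H - 1)
    return feat[:, y, x].T          # (N, C): direct gather, no interpolation weights

feat = torch.arange(2 * 4 * 4, dtype=torch.float32).reshape(2, 4, 4)
pts = torch.tensor([[1.4, 2.6], [0.2, 0.2]])
print(discrete_sample(feat, pts))   # rows sampled at pixel (1, 3) and (0, 0)
```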
36. Studying Critical Parameters of Superconductor via Diamond Quantum Sensors
- Author
-
Ho, Kin On, Leung, Wai Kuen, Pang, Yiu Yung, Yip, King Yau, Xie, Jianyu, Liu, Yi Man, Rotelli, Aliki Sofia, Leung, Man Yin, Chow, Ho Yin, Lai, Kwing To, Denisenko, Andrej, Keimer, B., Wrachtrup, Jörg, and Yang, Sen
- Subjects
Condensed Matter - Mesoscale and Nanoscale Physics ,Condensed Matter - Superconductivity ,Quantum Physics - Abstract
Critical parameters are the key to superconductivity research, and reliable instrumentations can facilitate the study. Traditionally, one has to use several different measurement techniques to measure critical parameters separately. In this work, we develop the use of a single species of quantum sensor to determine and estimate several critical parameters with the help of independent simulation data. We utilize the nitrogen-vacancy (NV) center in the diamond, which recently emerged as a promising candidate for probing exotic features in condensed matter physics. The non-invasive and highly stable nature provides extraordinary opportunities to solve scientific problems in various systems. Using a high-quality single-crystalline YBa$_{2}$Cu$_{4}$O$_{8}$ (YBCO) as a platform, we demonstrate the use of diamond particles and a bulk diamond to probe the Meissner effect. The evolution of the vector magnetic field, the $H-T$ phase diagram, and the map of fluorescence contour are studied via NV sensing. Our results reveal different critical parameters, including lower critical field $H_{c1}$, upper critical field $H_{c2}$, and critical current density $j_{c}$, as well as verifying the unconventional nature of this high-temperature superconductor YBCO. Therefore, NV-based quantum sensing techniques have huge potential in condensed matter research.
- Published
- 2024
37. Arondight: Red Teaming Large Vision Language Models with Auto-generated Multi-modal Jailbreak Prompts
- Author
-
Liu, Yi, Cai, Chengjun, Zhang, Xiaoli, Yuan, Xingliang, and Wang, Cong
- Subjects
Computer Science - Machine Learning ,Computer Science - Artificial Intelligence ,Computer Science - Cryptography and Security ,Computer Science - Multimedia - Abstract
Large Vision Language Models (VLMs) extend and enhance the perceptual abilities of Large Language Models (LLMs). Despite offering new possibilities for LLM applications, these advancements raise significant security and ethical concerns, particularly regarding the generation of harmful content. While LLMs have undergone extensive security evaluations with the aid of red teaming frameworks, VLMs currently lack such a well-developed framework. To fill this gap, we introduce Arondight, a standardized red team framework tailored specifically for VLMs. Arondight is dedicated to resolving issues related to the absence of visual modality and inadequate diversity encountered when transitioning existing red teaming methodologies from LLMs to VLMs. Our framework features an automated multi-modal jailbreak attack, wherein visual jailbreak prompts are produced by a red team VLM, and textual prompts are generated by a red team LLM guided by a reinforcement learning agent. To enhance the comprehensiveness of VLM security evaluation, we integrate entropy bonuses and novelty reward metrics. These elements incentivize the RL agent to guide the red team LLM in creating a wider array of diverse and previously unseen test cases. Our evaluation of ten cutting-edge VLMs exposes significant security vulnerabilities, particularly in generating toxic images and aligning multi-modal prompts. In particular, Arondight achieves an average attack success rate of 84.5\% on GPT-4 across all fourteen prohibited scenarios defined by OpenAI for toxic text generation. For a clearer comparison, we also categorize existing VLMs based on their safety levels and provide corresponding reinforcement recommendations. Our multimodal prompt dataset and red team code will be released after ethics committee approval. CONTENT WARNING: THIS PAPER CONTAINS HARMFUL MODEL RESPONSES., Comment: To be published in ACM MM 2024
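As a toy illustration of how an entropy bonus and a novelty reward could be folded into the RL agent's signal (all terms and weights below are assumptions, not the Arondight reward), consider:

    import math
    from collections import Counter

    def combined_reward(toxicity_score, prompt, seen_prompts, policy_probs,
                        w_entropy=0.1, w_novelty=0.5):
        # Entropy bonus over the agent's action distribution encourages diverse choices.
        entropy = -sum(p * math.log(p) for p in policy_probs if p > 0)
        # Novelty reward: fraction of prompt tokens unseen in earlier test cases (toy proxy).
        tokens = prompt.split()
        seen_tokens = Counter(t for p in seen_prompts for t in p.split())
        novelty = sum(1 for t in tokens if t not in seen_tokens) / max(len(tokens), 1)
        return toxicity_score + w_entropy * entropy + w_novelty * novelty

    r = combined_reward(0.7, "draw a harmless red circle", ["draw a blue square"], [0.5, 0.3, 0.2])
    print(round(r, 3))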
- Published
- 2024
38. STS MICCAI 2023 Challenge: Grand challenge on 2D and 3D semi-supervised tooth segmentation
- Author
-
Wang, Yaqi, Zhang, Yifan, Chen, Xiaodiao, Wang, Shuai, Qian, Dahong, Ye, Fan, Xu, Feng, Zhang, Hongyuan, Zhang, Qianni, Wu, Chengyu, Li, Yunxiang, Cui, Weiwei, Luo, Shan, Wang, Chengkai, Li, Tianhao, Liu, Yi, Feng, Xiang, Zhou, Huiyu, Liu, Dongyun, Wang, Qixuan, Lin, Zhouhao, Song, Wei, Li, Yuanlin, Wang, Bing, Wang, Chunshi, Chen, Qiupu, and Li, Mingqian
- Subjects
Computer Science - Computer Vision and Pattern Recognition - Abstract
Computer-aided design (CAD) tools are increasingly popular in modern dental practice, particularly for treatment planning or comprehensive prognosis evaluation. In particular, the 2D panoramic X-ray image efficiently detects invisible caries, impacted teeth and supernumerary teeth in children, while the 3D dental cone beam computed tomography (CBCT) is widely used in orthodontics and endodontics due to its low radiation dose. However, there is no open-access 2D public dataset for children's teeth and no open 3D dental CBCT dataset, which limits the development of automatic algorithms for segmenting teeth and analyzing diseases. The Semi-supervised Teeth Segmentation (STS) Challenge, a pioneering event in tooth segmentation, was held as a part of the MICCAI 2023 ToothFairy Workshop on the Alibaba Tianchi platform. This challenge aims to investigate effective semi-supervised tooth segmentation algorithms to advance the field of dentistry. In this challenge, we provide two modalities: the 2D panoramic X-ray images and the 3D CBCT tooth volumes. In Task 1, the goal was to segment tooth regions in panoramic X-ray images of both adult and pediatric teeth. Task 2 involved segmenting tooth sections using CBCT volumes. A limited number of labelled images, together with a much larger set of unlabelled ones, were provided in this challenge, prompting the use of semi-supervised algorithms for training. In the preliminary round, the challenge received registrations and result submissions from 434 teams, with 64 advancing to the final round. This paper summarizes the diverse methods employed by the top-ranking teams in the STS MICCAI 2023 Challenge.
- Published
- 2024
39. Continuous Embedding Attacks via Clipped Inputs in Jailbreaking Large Language Models
- Author
-
Xu, Zihao, Liu, Yi, Deng, Gelei, Wang, Kailong, Li, Yuekang, Shi, Ling, and Picek, Stjepan
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Computation and Language - Abstract
Security concerns for large language models (LLMs) have recently escalated, focusing on thwarting jailbreaking attempts in discrete prompts. However, the exploration of jailbreak vulnerabilities arising from continuous embeddings has been limited, as prior approaches primarily involved appending discrete or continuous suffixes to inputs. Our study presents a novel channel for conducting direct attacks on LLM inputs, eliminating the need for suffix addition or specific questions, provided that the desired output is predefined. We additionally observe that extensive iterations often lead to overfitting, characterized by repetition in the output. To counteract this, we propose a simple yet effective strategy named CLIP. Our experiments show that for an input length of 40 at iteration 1000, applying CLIP improves the ASR from 62% to 83%.
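The following is a minimal, hypothetical sketch of the clipping idea: the optimized continuous input embeddings are kept within the per-dimension range spanned by the real token embeddings, which is one plausible reading of the CLIP strategy (all names and details here are assumptions, not the authors' code).

    import torch

    def clip_to_embedding_range(adv_embeds, embedding_matrix):
        # Bound every dimension by the min/max observed over the vocabulary embeddings,
        # keeping the optimized embeddings in a realistic region and curbing overfitting drift.
        lo = embedding_matrix.min(dim=0).values
        hi = embedding_matrix.max(dim=0).values
        return torch.maximum(torch.minimum(adv_embeds, hi), lo)

    vocab = torch.randn(1000, 64)        # toy vocabulary embedding table
    adv = torch.randn(40, 64) * 5        # 40 optimized input embeddings drifting out of range
    adv = clip_to_embedding_range(adv, vocab)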
- Published
- 2024
40. Proton drip line of deformed hypernuclei
- Author
-
Liu, Yi-Xiu, Xue, Huai-Tong, Chen, Q. B., Zhou, Xian-Rong, and Schulze, H. -J.
- Subjects
Nuclear Theory - Abstract
The proton drip line of (hyper)nuclei is examined within the framework of the deformed Skyrme-Hartree-Fock approach by adjusting the nuclear force parameters to exactly reproduce the core binding energies. The impact of adding a $\Lambda$ hyperon in an $s$ or $p$ state is studied, and it is found that in some cases the deformation effect facilitates the extension of the drip line by an added $p$-state hyperon. However, no extension of the drip line is found for $s$-state hypernuclei., Comment: 7 pages, 2 Figures
- Published
- 2024
41. DistillSeq: A Framework for Safety Alignment Testing in Large Language Models using Knowledge Distillation
- Author
-
Yang, Mingke, Chen, Yuqi, Liu, Yi, and Shi, Ling
- Subjects
Computer Science - Software Engineering - Abstract
Large Language Models (LLMs) have showcased their remarkable capabilities in diverse domains, encompassing natural language understanding, translation, and even code generation. The potential for LLMs to generate harmful content is a significant concern. This risk necessitates rigorous testing and comprehensive evaluation of LLMs to ensure safe and responsible use. However, extensive testing of LLMs requires substantial computational resources, making it an expensive endeavor. Therefore, exploring cost-saving strategies during the testing phase is crucial to balance the need for thorough evaluation with the constraints of resource availability. To address this, our approach begins by transferring the moderation knowledge from an LLM to a small model. Subsequently, we deploy two distinct strategies for generating malicious queries: one based on a syntax tree approach, and the other leveraging an LLM-based method. Finally, our approach incorporates a sequential filter-test process designed to identify test cases that are prone to eliciting toxic responses. Our research evaluated the efficacy of DistillSeq across four LLMs: GPT-3.5, GPT-4.0, Vicuna-13B, and Llama-13B. In the absence of DistillSeq, the observed attack success rates on these LLMs stood at 31.5% for GPT-3.5, 21.4% for GPT-4.0, 28.3% for Vicuna-13B, and 30.9% for Llama-13B. However, upon the application of DistillSeq, these success rates notably increased to 58.5%, 50.7%, 52.5%, and 54.4%, respectively. This corresponds to an average relative increase of 93.0% in attack success rate compared to scenarios without DistillSeq. Such findings highlight the significant enhancement DistillSeq offers in terms of reducing the time and resource investment required for effectively testing LLMs.
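A minimal sketch of the sequential filter-then-test idea, with placeholder models (none of the names below come from the paper): a small distilled moderation model scores candidate queries first, and only promising ones are sent to the expensive target LLM.

    def filter_then_test(candidates, distilled_score, query_target_llm, is_toxic, threshold=0.5):
        # Keep only queries the distilled moderation model considers likely to succeed.
        promising = [q for q in candidates if distilled_score(q) >= threshold]
        failures = []
        for q in promising:
            response = query_target_llm(q)
            if is_toxic(response):
                failures.append((q, response))
        return failures

    # Toy stand-ins for demonstration only.
    score = lambda q: 0.9 if "bypass" in q else 0.1
    target = lambda q: "refused" if "please" in q else "unsafe output"
    toxic = lambda r: r == "unsafe output"
    print(filter_then_test(["bypass the filter", "please help"], score, target, toxic))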
- Published
- 2024
- Full Text
- View/download PDF
42. Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation
- Author
-
Gao, Ge, Kim, Jongin, Paik, Sejin, Novozhilova, Ekaterina, Liu, Yi, Bonna, Sarah T., Betke, Margrit, and Wijaya, Derry Tanti
- Subjects
Computer Science - Computation and Language ,I.2.7 - Abstract
Predicting emotions elicited by news headlines can be challenging as the task is largely influenced by the varying nature of people's interpretations and backgrounds. Previous works have explored classifying discrete emotions directly from news headlines. We provide a different approach to tackling this problem by utilizing people's free-text explanations of how they feel after reading a news headline. Using the dataset BU-NEmo+ (Gao et al., 2022), we found that for emotion classification, the free-text explanations have a strong correlation with the dominant emotion elicited by the headlines. The free-text explanations also contain more sentimental context than the news headlines alone and can serve as a better input to emotion classification models. Therefore, in this work we explored generating emotion explanations from headlines by training a sequence-to-sequence transformer model and by using a pretrained large language model, ChatGPT (GPT-4). We then used the generated emotion explanations for emotion classification. In addition, we also experimented with training the pretrained T5 model for the intermediate task of explanation generation before fine-tuning it for emotion classification. Using McNemar's significance test, methods that incorporate GPT-generated free-text emotion explanations demonstrated significant improvement (P-value < 0.05) in emotion classification from headlines, compared to methods that only use headlines. This underscores the value of using intermediate free-text explanations for emotion prediction tasks with headlines., Comment: published at LREC-COLING 2024
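For reference, the exact McNemar test used to compare two paired classifiers can be computed from the two discordant counts alone; here is a stdlib sketch (the counts are made up for illustration):

    from math import comb

    def mcnemar_exact_p(b, c):
        # b, c: headlines where exactly one of the two classifiers is correct (discordant pairs).
        n, k = b + c, min(b, c)
        # Two-sided exact binomial p-value under H0: discordant pairs split 50/50.
        return min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n)

    print(round(mcnemar_exact_p(b=10, c=28), 4))   # a small p-value indicates the classifiers differ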
- Published
- 2024
43. Leptonic CP-violation in the sneutrino sector of the BLSSM with Inverse Seesaw
- Author
-
Basu, Arindam, Chakraborty, Amit, Liu, Yi, Moretti, Stefano, and Waltari, Harri
- Subjects
High Energy Physics - Phenomenology ,High Energy Physics - Experiment - Abstract
We study CP violation (CPV) in the sneutrino sector within the B-L extension of the Minimal Supersymmetric Standard Model (BLSSM), wherein an inverse seesaw mechanism has been implemented. CPV arises from the new superpotential couplings in the (s)neutrino sector, which can be complex, and from the mixing of CP eigenstates induced by those couplings. CPV leads to asymmetries in so-called T-odd observables, but we argue that such asymmetries also lead to a wider distribution of those observables. We look at a final state where a sneutrino decays to a lepton, two jets and missing transverse momentum at the Future Circular Collider operating in hadron-hadron mode at $100$ TeV and with a luminosity of 3 ab$^{-1}$. In order to exclude the CP-conserving scenario we need to improve the traditional analysis by introducing boosted decision trees that use both standard kinematic variables and T-odd observables, and we need a $Z^{\prime}$ boson not too far above current mass bounds as a portal to produce sneutrinos efficiently., Comment: 27 pages, 21 figures, 10 tables
- Published
- 2024
44. Hyperfine-to-rotational energy transfer in ultracold atom-molecule collisions
- Author
-
Liu, Yi-Xiang, Zhu, Lingbang, Luke, Jeshurun, Babin, Mark C., Tscherbul, Timur V., Gronowski, Marcin, Ladjimi, Hela, Tomza, Michał, Bohn, John L., and Ni, Kang-Kuen
- Subjects
Physics - Atomic Physics ,Physics - Chemical Physics ,Quantum Physics - Abstract
Energy transfer between different mechanical degrees of freedom in atom-molecule collisions has been widely studied and largely understood. However, systems involving spins remain less explored, especially with a state-to-state precision. Here, we directly observed the energy transfer from atomic hyperfine to molecular rotation in the $^{87}$Rb ($|F_a,M_{F_a}\rangle = |2,2\rangle$) + $^{40}$K$^{87}$Rb (in the rovibronic ground state $N=0$) $\longrightarrow$ Rb ($ |1,1\rangle$) + KRb ($N=0,1,2$) exothermic collision. We probed the quantum states of the collision products using resonance-enhanced multi-photon ionization followed by time-of-flight mass spectrometry. We also carried out state-of-the-art quantum scattering calculations, which rigorously take into account the coupling between the spin and rotational degrees of freedom at short range, and assume that the KRb monomer can be treated as a rigid rotor moving on a single potential energy surface. The calculated product rotational state distribution deviates from the observations even after extensive tuning of the atom-molecule potential energy surface, suggesting that vibrational degrees of freedom and conical intersections play an important part in ultracold Rb + KRb collisions. Additionally, our ab initio calculations indicate that spin-rotation coupling is dramatically enhanced near a conical intersection, which is energetically accessible at short range. The observations confirm that spin is coupled to mechanical rotation at short range and establish a benchmark for future theoretical studies.
- Published
- 2024
45. Source Code Summarization in the Era of Large Language Models
- Author
-
Sun, Weisong, Miao, Yun, Li, Yuekang, Zhang, Hongyu, Fang, Chunrong, Liu, Yi, Deng, Gelei, Liu, Yang, and Chen, Zhenyu
- Subjects
Computer Science - Software Engineering ,Computer Science - Artificial Intelligence ,D.2.3 ,I.2.7 - Abstract
To support software developers in understanding and maintaining programs, various automatic (source) code summarization techniques have been proposed to generate a concise natural language summary (i.e., comment) for a given code snippet. Recently, the emergence of large language models (LLMs) has led to a great boost in the performance of code-related tasks. In this paper, we undertake a systematic and comprehensive study on code summarization in the era of LLMs, which covers multiple aspects involved in the workflow of LLM-based code summarization. Specifically, we begin by examining prevalent automated evaluation methods for assessing the quality of summaries generated by LLMs and find that the results of the GPT-4 evaluation method are most closely aligned with human evaluation. Then, we explore the effectiveness of five prompting techniques (zero-shot, few-shot, chain-of-thought, critique, and expert) in adapting LLMs to code summarization tasks. Contrary to expectations, advanced prompting techniques may not outperform simple zero-shot prompting. Next, we investigate the impact of LLMs' model settings (including top\_p and temperature parameters) on the quality of generated summaries. We find the impact of the two parameters on summary quality varies by the base LLM and programming language, but their impacts are similar. Moreover, we canvass LLMs' abilities to summarize code snippets in distinct types of programming languages. The results reveal that LLMs perform suboptimally when summarizing code written in logic programming languages compared to other language types. Finally, we unexpectedly find that CodeLlama-Instruct with 7B parameters can outperform advanced GPT-4 in generating summaries describing code implementation details and asserting code properties. We hope that our findings can provide a comprehensive understanding of code summarization in the era of LLMs., Comment: Just accepted to the 47th International Conference on Software Engineering (ICSE 2025)
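As a concrete illustration of the zero-shot setting and the two sampling parameters studied (temperature and top_p), here is a minimal sketch; query_llm is a placeholder callable, not a specific vendor API, and the prompt wording is an assumption.

    def summarize_code(snippet, query_llm, temperature=0.2, top_p=0.95):
        # Zero-shot prompt: no examples, just an instruction plus the code snippet.
        prompt = ("Summarize the following code in one concise sentence.\n\n"
                  f"```\n{snippet}\n```")
        return query_llm(prompt, temperature=temperature, top_p=top_p)

    # Toy stand-in LLM that ignores the sampling settings.
    demo = lambda prompt, **kw: "Returns the factorial of n computed iteratively."
    code = "def f(n):\n    r = 1\n    for i in range(2, n + 1): r *= i\n    return r"
    print(summarize_code(code, demo))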
- Published
- 2024
46. Balancing events, not patients, maximizes power of the logrank test: and other insights on unequal randomization in survival trials
- Author
-
Yung, Godwin, Rufibach, Kaspar, Wolbers, Marcel, Lin, Ray, and Liu, Yi
- Subjects
Statistics - Methodology ,Statistics - Applications - Abstract
We revisit the question of what randomization ratio (RR) maximizes power of the logrank test in event-driven survival trials under proportional hazards (PH). By comparing three approximations of the logrank test (Schoenfeld, Freedman, Rubinstein) to empirical simulations, we find that the RR that maximizes power is the RR that balances number of events across treatment arms at the end of the trial. This contradicts the common misconception implied by Schoenfeld's approximation that 1:1 randomization maximizes power. Besides power, we consider other factors that might influence the choice of RR (accrual, trial duration, sample size, etc.). We perform simulations to better understand how unequal randomization might impact these factors in practice. Altogether, we derive 6 insights to guide statisticians in the design of survival trials considering unequal randomization., Comment: 17 pages, 3 figures, 2 tables
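For orientation, Schoenfeld's classical approximation to logrank power can be written down in a few lines; the sketch below uses standard notation ($d$ events, allocation fraction $p$, hazard ratio HR, two-sided level $\alpha$) and is not taken from the paper.

    from math import log, sqrt
    from statistics import NormalDist

    def schoenfeld_power(d, p, hr, alpha=0.05):
        # Power of the logrank test under Schoenfeld's approximation.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return NormalDist().cdf(sqrt(d * p * (1 - p)) * abs(log(hr)) - z)

    for p in (0.5, 0.6, 0.7):
        print(p, round(schoenfeld_power(d=300, p=p, hr=0.75), 3))

Because $d\,p(1-p)$ is maximized at $p = 0.5$, this approximation suggests that 1:1 randomization is optimal, which is exactly the misconception the paper revisits.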
- Published
- 2024
47. Evolution of High-energy Electron Distribution in Pulsar Wind Nebulae
- Author
-
Liu, Yi-Ming, Zeng, Hou-Dun, Xin, Yu-Liang, Liu, Si-Ming, and Zhang, Yi
- Subjects
Astrophysics - High Energy Astrophysical Phenomena ,High Energy Physics - Phenomenology - Abstract
In this paper, we analyze the spectral energy distributions (SEDs) of 17 powerful (with a spin-down luminosity greater than $10^{35}$ erg s$^{-1}$) young (with an age less than 15000 yrs) pulsar wind nebulae (PWNe) using a simple time-independent one-zone emission model. Our aim is to investigate correlations between model parameters and the ages of the corresponding PWNe, thereby revealing the evolution of high-energy electron distributions within PWNe. Our findings are as follows: (1) the electron distributions in PWNe can be characterized by a double power-law with a superexponential cutoff; (2) as PWNe evolve, the high-energy end of the electron spectrum becomes harder, with its index decreasing from approximately 3.5 to 2.5, while the low-energy index remains constant near 1.5; (3) there is no apparent correlation between the break energy or cutoff energy and the age of PWNe; (4) the average magnetic field within PWNe decreases with age, leading to a positive correlation between the energy-loss timescale of electrons at the break energy or the high-energy cutoff and the age of the PWN; (5) the total electron energy within PWNe remains constant near $2 \times 10^{48}$ erg, while the total magnetic energy decreases with age., Comment: 19 pages, 20 figures, 1 table; accepted for publication in RAA
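A generic parameterization consistent with finding (1) (the notation here is assumed, not taken from the paper) is $N(E) \propto E^{-\alpha_1}$ for $E \le E_b$ and $N(E) \propto E_b^{\alpha_2-\alpha_1} E^{-\alpha_2} \exp[-(E/E_c)^{\beta}]$ for $E > E_b$, where $\alpha_1 \approx 1.5$ is the low-energy index, $\alpha_2$ is the high-energy index (hardening from about 3.5 to 2.5 with age), $E_b$ is the break energy, $E_c$ is the cutoff energy, and $\beta > 1$ makes the cutoff superexponential.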
- Published
- 2024
48. Task Oriented In-Domain Data Augmentation
- Author
-
Liang, Xiao, Hu, Xinyu, Zuo, Simiao, Gong, Yeyun, Lou, Qiang, Liu, Yi, Huang, Shao-Lun, and Jiao, Jian
- Subjects
Computer Science - Computation and Language - Abstract
Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs are often continually pre-trained on in-domain data. However, existing approaches suffer from two major issues. First, in-domain data are scarce compared with general domain-agnostic data. Second, data used for continual pre-training are not task-aware, such that they may not be helpful to downstream applications. We propose TRAIT, a task-oriented in-domain data augmentation framework. Our framework is divided into two parts: in-domain data selection and task-oriented synthetic passage generation. The data selection strategy identifies and selects a large amount of in-domain data from general corpora, and thus significantly enriches domain knowledge in the continual pre-training data. The synthetic passages contain guidance on how to use domain knowledge to answer questions about downstream tasks. By training on such passages, the model aligns with the needs of downstream applications. We adapt LLMs to two domains: advertisement and math. On average, TRAIT improves LLM performance by 8% in the advertisement domain and 7.5% in the math domain.
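A toy sketch of the data selection step (the classifier, corpus, and threshold below are assumptions, not the TRAIT implementation): score general-corpus documents with a domain classifier and keep only those above a threshold.

    def select_in_domain(corpus, domain_score, threshold=0.8):
        # Keep documents the classifier considers sufficiently in-domain.
        return [doc for doc in corpus if domain_score(doc) >= threshold]

    score = lambda doc: 0.9 if "ad campaign" in doc else 0.1   # stand-in domain classifier
    corpus = ["quarterly ad campaign budget tips", "a recipe for lentil soup"]
    print(select_in_domain(corpus, score))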
- Published
- 2024
49. Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement
- Author
-
Chang, Zhiyuan, Li, Mingyang, Wang, Junjie, Liu, Yi, Wang, Qing, and Liu, Yang
- Subjects
Computer Science - Computer Vision and Pattern Recognition ,Computer Science - Artificial Intelligence - Abstract
Text-to-Image Diffusion Models (T2I DMs) have garnered significant attention for their ability to generate high-quality images from textual descriptions. However, these models often produce images that do not fully align with the input prompts, resulting in semantic inconsistencies. The most prominent issue among these semantic inconsistencies is catastrophic-neglect, where the images generated by T2I DMs miss key objects mentioned in the prompt. We first conduct an empirical study on this issue, exploring the prevalence of catastrophic-neglect, potential mitigation strategies with feature enhancement, and the insights gained. Guided by the empirical findings, we propose an automated repair approach named Patcher to address catastrophic-neglect in T2I DMs. Specifically, Patcher first determines whether there are any neglected objects in the prompt, and then applies attention-guided feature enhancement to these neglected objects, resulting in a repaired prompt. Experimental results on three versions of Stable Diffusion demonstrate that Patcher effectively repairs the issue of catastrophic-neglect, achieving 10.1%-16.3% higher Correct Rate in image generation compared to baselines., Comment: 12 pages, 3 figures
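A toy, hypothetical illustration of the detection step (the attention representation and threshold are assumptions, not Patcher's method): if the cross-attention mass assigned to an object token falls below a threshold, the object is treated as neglected and marked for feature enhancement.

    import numpy as np

    def find_neglected(attn_maps, object_tokens, threshold=0.05):
        # attn_maps: dict mapping token -> (H, W) cross-attention map.
        total = sum(m.sum() for m in attn_maps.values())
        return [t for t in object_tokens if attn_maps[t].sum() / total < threshold]

    maps = {"dog": np.full((16, 16), 0.004), "frisbee": np.full((16, 16), 0.0001)}
    print(find_neglected(maps, ["dog", "frisbee"]))   # ['frisbee']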
- Published
- 2024
50. BadSampler: Harnessing the Power of Catastrophic Forgetting to Poison Byzantine-robust Federated Learning
- Author
-
Liu, Yi, Wang, Cong, and Yuan, Xingliang
- Subjects
Computer Science - Cryptography and Security ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
Federated Learning (FL) is susceptible to poisoning attacks, wherein compromised clients manipulate the global model by modifying local datasets or sending manipulated model updates. Experienced defenders can readily detect and mitigate the poisoning effects of malicious behaviors using Byzantine-robust aggregation rules. However, poisoning attacks in scenarios where such overtly malicious behaviors are absent remain largely unexplored for Byzantine-robust FL. To fill this gap, this paper addresses the challenging problem of poisoning Byzantine-robust FL by introducing catastrophic forgetting. We first formally define generalization error and establish its connection to catastrophic forgetting, paving the way for the development of a clean-label data poisoning attack named BadSampler. This attack leverages only clean-label data (i.e., without poisoned data) to poison Byzantine-robust FL and requires the adversary to selectively sample training data with high loss to feed model training and maximize the model's generalization error. We formulate the attack as an optimization problem and present two elegant adversarial sampling strategies, Top-$\kappa$ sampling and meta-sampling, to approximately solve it. Additionally, our formal error upper bound and time complexity analysis demonstrate that our design can preserve attack utility with high efficiency. Extensive evaluations on two real-world datasets illustrate the effectiveness and performance of our proposed attacks., Comment: In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD' 24), August 25-29, 2024, Barcelona, Spain
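A minimal sketch (hypothetical, not the authors' code) of the Top-$\kappa$ sampling idea: a compromised client trains only on its $\kappa$ highest-loss clean-label examples each round, nudging the global model toward higher generalization error without injecting any poisoned data.

    import torch

    def top_k_adversarial_batch(model, loss_fn, inputs, labels, k):
        # Score every clean example and keep only the k hardest ones for local training.
        with torch.no_grad():
            losses = loss_fn(model(inputs), labels)       # per-example losses, reduction='none'
        idx = torch.topk(losses, k).indices
        return inputs[idx], labels[idx]

    model = torch.nn.Linear(10, 3)
    loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
    x, y = torch.randn(64, 10), torch.randint(0, 3, (64,))
    hard_x, hard_y = top_k_adversarial_batch(model, loss_fn, x, y, k=8)
    print(hard_x.shape)   # torch.Size([8, 10])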
- Published
- 2024
- Full Text
- View/download PDF