Author: "Wang, Hongyu" / Search Limiters: Available in Library Collection - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wang, Hongyu"' showing total 2,623 results

1. BitNet a4.8: 4-bit Activations for 1-bit LLMs

Author: Wang, Hongyu, Ma, Shuming, and Wei, Furu
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Recent research on the 1-bit Large Language Models (LLMs), such as BitNet b1.58, presents a promising direction for reducing the inference cost of LLMs while maintaining their performance. In this work, we introduce BitNet a4.8, enabling 4-bit activations for 1-bit LLMs. BitNet a4.8 employs a hybrid quantization and sparsification strategy to mitigate the quantization errors introduced by the outlier channels. Specifically, we utilize 4-bit activations for inputs to the attention and feed-forward network layers, while sparsifying intermediate states followed by 8-bit quantization. Extensive experiments demonstrate that BitNet a4.8 achieves performance comparable to BitNet b1.58 with equivalent training costs, while being faster in inference by enabling 4-bit (INT4/FP4) kernels. Additionally, BitNet a4.8 activates only 55% of parameters and supports 3-bit KV cache, further enhancing the efficiency of large-scale LLM deployment and inference.
Comment: Work in progress
Published: 2024

2. Faber-Krahn type inequality for supertrees

Author: Wang, Hongyu and Hou, Xinmin
Subjects: Mathematics - Combinatorics
Abstract: The Faber-Krahn inequality states that the first Dirichlet eigenvalue of any bounded domain in $\mathbb{R}^n$ is no less than that of a Euclidean ball with the same volume \cite{Chavel FB}. Bıyıkoğlu and Leydold (J. Comb. Theory, Ser. B., 2007) demonstrated that the Faber-Krahn inequality also holds for the class of trees with boundary with the same degree sequence and characterized the unique extremal tree. Bıyıkoğlu and Leydold (2007) also posed the following question: give a characterization of all graphs in a given class $\mathcal{C}$ with the Faber-Krahn property. In this paper, we address this question specifically for $k$-uniform supertrees with boundary. We introduce a spiral-like ordering (SLO-ordering) of vertices for supertrees, an extension of the SLO-ordering for trees initially proposed by Pruss [Duke Math. J., 1998], and prove that the SLO-supertree has the Faber-Krahn property among all supertrees with a given degree sequence. Furthermore, among degree sequences that have a minimum degree $d$ for interior vertices, the SLO-supertree with degree sequence $(d,\ldots,d, d', 1, \dots, 1)$ possesses the Faber-Krahn property.
Comment: 18 pages, 1 figure
Published: 2024

3. 1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs

Author: Wang, Jinheng, Zhou, Hansong, Song, Ting, Mao, Shaoguang, Ma, Shuming, Wang, Hongyu, Xia, Yan, and Wei, Furu
Subjects: Computer Science - Computation and Language
Abstract: Recent advances in 1-bit Large Language Models (LLMs), such as BitNet and BitNet b1.58, present a promising approach to enhancing the efficiency of LLMs in terms of speed and energy consumption. These developments also enable local LLM deployment across a broad range of devices. In this work, we introduce bitnet.cpp, a tailored software stack designed to unlock the full potential of 1-bit LLMs. Specifically, we develop a set of kernels to support fast and lossless inference of ternary BitNet b1.58 LLMs on CPUs. Extensive experiments demonstrate that bitnet.cpp achieves significant speedups, ranging from 2.37x to 6.17x on x86 CPUs and from 1.37x to 5.07x on ARM CPUs, across various model sizes. The code is available at https://github.com/microsoft/BitNet.
Published: 2024

4. Experiments in cross-domain few-shot learning for image classification

Author: Wang, Hongyu
Published: 2023

5. Construction of Smart Data toward Dunhuang Grottoes

Author: Wang, Xiaoguang, Wang, Hongyu, Chang, Wanli, Zhang, Chen, and Xu, Lei
Published: 2020

6. Cross Fusion RGB-T Tracking with Bi-directional Adapter

Author: Zeng, Zhirong, Liu, Xiaotao, Sun, Meng, Wang, Hongyu, and Liu, Jing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Many state-of-the-art RGB-T trackers have achieved remarkable results through modality fusion. However, these trackers often either overlook temporal information or fail to fully utilize it, resulting in an ineffective balance between multi-modal and temporal information. To address this issue, we propose a novel Cross Fusion RGB-T Tracking architecture (CFBT) that ensures the full participation of multiple modalities in tracking while dynamically fusing temporal information. The effectiveness of CFBT relies on three newly designed cross spatio-temporal information fusion modules: Cross Spatio-Temporal Augmentation Fusion (CSTAF), Cross Spatio-Temporal Complementarity Fusion (CSTCF), and Dual-Stream Spatio-Temporal Adapter (DSTA). CSTAF employs a cross-attention mechanism to enhance the feature representation of the template comprehensively. CSTCF utilizes complementary information between different branches to enhance target features and suppress background features. DSTA adopts the adapter concept to adaptively fuse complementary information from multiple branches within the transformer layer, using the RGB modality as a medium. These fusions of multiple perspectives introduce less than 0.3% of the total modal parameters, yet they enable an efficient balance between multi-modal and temporal information. Extensive experiments on three popular RGB-T tracking benchmarks demonstrate that our method achieves new state-of-the-art performance.
Published: 2024

7. Enhancement of Co-located Shared VR Experiences: Representing Non-HMD Observers on Both HMD and 2D Screen

Author: Guo, Zixuan, Xu, Wenge, Wang, Hongyu, Wan, Tingjie, Baghaei, Nilufar, Lo, Cheng-Hung, and Liang, Hai-Ning
Subjects: Computer Science - Human-Computer Interaction
Abstract: Virtual reality (VR) not only allows head-mounted display (HMD) users to immerse themselves in virtual worlds but also to share them with others. When designed correctly, this shared experience can be enjoyable. However, in typical scenarios, HMD users are isolated by their devices, and non-HMD observers lack connection with the virtual world. To address this, our research investigates visually representing observers on both HMD and 2D screens to enhance shared experiences. The study, including five representation conditions, reveals that incorporating observer representation positively impacts both HMD users and observers. For how to design and represent them, our work shows that HMD users prefer methods displaying real-world visuals, while observers exhibit diverse preferences regarding being represented with real or virtual images. We provide design guidelines tailored to both displays, offering valuable insights to enhance co-located shared VR experiences for HMD users and non-HMD observers.
Published: 2024

8. Exploring the Impact of Passthrough on VR Exergaming in Public Environments: A Field Study

Author: Guo, Zixuan, Deng, Hanxiao, Wang, Hongyu, Tan, Angel J. Y., Xu, Wenge, and Liang, Hai-Ning
Subjects: Computer Science - Human-Computer Interaction
Abstract: Sedentary behavior is becoming increasingly prevalent in daily work and study environments. VR exergaming has emerged as a promising solution in these places of work and study. However, private spaces in these environments are not easy to come by, and engaging in VR exergaming in public settings presents its own set of challenges (e.g., safety, social acceptance, isolation, and privacy protection). The recent development of Passthrough functionality in VR headsets allows users to maintain awareness of their surroundings, enhancing safety and convenience. Despite its potential benefits, little is known about how Passthrough could affect user performance and experience and solve the challenges of playing VR exergames in real-world public environments. To our knowledge, this work is the first to conduct a field study in an underground passageway on a university campus to explore the use of Passthrough in a real-world public environment, with a disturbance-free closed room as a baseline. Results indicate that enabling Passthrough in a public environment improves performance without compromising presence. Moreover, Passthrough can increase social acceptance, especially among individuals with higher levels of self-consciousness. These findings highlight Passthrough's potential to encourage VR exergaming adoption in public environments, with promising implications for overall health and well-being.
Published: 2024

9. Q-Sparse: All Large Language Models can be Fully Sparsely-Activated

Author: Wang, Hongyu, Ma, Shuming, Wang, Ruiping, and Wei, Furu
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: We introduce Q-Sparse, a simple yet effective approach to training sparsely-activated large language models (LLMs). Q-Sparse enables full sparsity of activations in LLMs, which can bring significant efficiency gains in inference. This is achieved by applying top-K sparsification to the activations and the straight-through estimator to the training. We also introduce Block Q-Sparse for batch training and inference. The key results from this work are: (1) Q-Sparse can achieve results comparable to those of baseline LLMs while being much more efficient at inference time; (2) we present an inference-optimal scaling law for sparsely-activated LLMs; (3) Q-Sparse is effective in different settings, including training-from-scratch, continue-training of off-the-shelf LLMs, and finetuning; (4) Q-Sparse works for both full-precision and 1-bit LLMs (e.g., BitNet b1.58). In particular, the synergy of BitNet b1.58 and Q-Sparse (which can be equipped with MoE) provides the cornerstone and a clear path to revolutionize the efficiency, including cost and energy consumption, of future LLMs.
Comment: Work in progress
Published: 2024

10. HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model

Author: Shen, Dunbin, Zhu, Xuanbing, Tian, Jiacheng, Liu, Jianjun, Du, Zhenrong, Wang, Hongyu, and Ma, Xiaorui
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Hyperspectral target detection (HTD) identifies objects of interest from complex backgrounds at the pixel level, playing a vital role in Earth observation. However, HTD faces challenges due to limited prior knowledge and spectral variation, leading to underfitting models and unreliable performance. To address these challenges, this paper proposes an efficient self-supervised HTD method with a pyramid state space model (SSM), named HTD-Mamba, which employs spectrally contrastive learning to distinguish between target and background based on the similarity measurement of intrinsic features. Specifically, to obtain sufficient training samples and leverage spatial contextual information, we propose a spatial-encoded spectral augmentation technique that encodes all surrounding pixels within a patch into a transformed view of the center pixel. Additionally, to explore global band correlations, we divide pixels into continuous group-wise spectral embeddings and introduce Mamba to HTD for the first time to model long-range dependencies of the spectral sequence with linear complexity. Furthermore, to alleviate spectral variation and enhance robust representation, we propose a pyramid SSM as a backbone to capture and fuse multiresolution spectral-wise intrinsic features. Extensive experiments conducted on four public datasets demonstrate that the proposed method outperforms state-of-the-art methods in both quantitative and qualitative evaluations. Code is available at https://github.com/shendb2022/HTD-Mamba.
Comment: 13 pages, 6 figures, 5 tables
Published: 2024

11. M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models

Author: Wang, Hongyu, Xu, Jiayu, Xie, Senwei, Wang, Ruiping, Li, Jialin, Xie, Zhaojie, Zhang, Bin, Xiong, Chuyan, and Chen, Xilin
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Multilingual multimodal reasoning is a core component in achieving human-level intelligence. However, most existing benchmarks for multilingual multimodal reasoning struggle to differentiate between models of varying performance; even language models without visual capabilities can easily achieve high scores. This leaves a comprehensive evaluation of leading multilingual multimodal models largely unexplored. In this work, we introduce M4U, a novel and challenging benchmark for assessing the capability of multi-discipline multilingual multimodal understanding and reasoning. M4U contains 8,931 samples covering 64 disciplines across 16 subfields in Science, Engineering, and Healthcare in Chinese, English, and German. Using M4U, we conduct extensive evaluations of 21 leading Large Multimodal Models (LMMs) and Large Language Models (LLMs) with external tools. The evaluation results show that the state-of-the-art model, GPT-4o, achieves only 47.6% average accuracy on M4U. Additionally, we observe that the leading LMMs exhibit significant language preferences. Our in-depth analysis indicates that leading LMMs, including GPT-4o, suffer performance degradation when prompted with cross-lingual multimodal questions, such as images with key textual information in Chinese while the question is in German. We believe that M4U can serve as a crucial tool for systematically evaluating LMMs based on their multilingual multimodal reasoning capabilities and monitoring their development. The homepage, code, and data are publicly available.
Comment: Work in progress
Published: 2024

12. Real-Time and Accurate: Zero-shot High-Fidelity Singing Voice Conversion with Multi-Condition Flow Synthesis

Author: Li, Hui, Wang, Hongyu, Chen, Zhijin, Sun, Bohan, and Li, Bo
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Singing voice conversion converts a source singing voice into a target singing voice while preserving the content. Currently, flow-based models can complete the task of voice conversion, but they struggle to effectively extract latent variables in the more rhythmically rich and emotionally expressive task of singing voice conversion, while also facing issues with low efficiency in speech processing. In this paper, we propose a high-fidelity flow-based model based on multi-decoupling feature constraints called RASVC, which enhances the capture of vocal details by integrating multiple latent attribute encoders. We also use Multi-stream inverse short-time Fourier transform (MS-iSTFT) to enhance the speed of speech processing by skipping some complicated decoder processing steps. We compare the synthesized singing voice with other models from multiple dimensions, and our proposed model is highly consistent with the current state-of-the-art. A demo is available at https://lazycat1119.github.io/RASVC-demo/.
Comment: 5 pages, 3 figures
Published: 2024

13. NGM-SLAM: Gaussian Splatting SLAM with Radiance Field Submap

Author: Li, Mingrui, Huang, Jingwei, Sun, Lei, Tian, Aaron Xuxiang, Deng, Tianchen, and Wang, Hongyu
Subjects: Computer Science - Robotics
Abstract: SLAM systems based on Gaussian Splatting have garnered attention due to their capabilities for rapid real-time rendering and high-fidelity mapping. However, current Gaussian Splatting SLAM systems usually struggle with large scene representation and lack effective loop closure detection. To address these issues, we introduce NGM-SLAM, the first 3DGS-based SLAM system that utilizes neural radiance field submaps for progressive scene expression, effectively integrating the strengths of neural radiance fields and 3D Gaussian Splatting. We utilize neural radiance field submaps as supervision and achieve high-quality scene expression and online loop closure adjustments through Gaussian rendering of fused submaps. Our results on multiple real-world scenes and large-scale scene datasets demonstrate that our method can achieve accurate hole filling and high-quality scene expression, supporting monocular, stereo, and RGB-D inputs, and achieving state-of-the-art scene reconstruction and tracking performance.
Comment: 9 pages, 4 figures
Published: 2024

14. PCG: Mitigating Conflict-based Cache Side-channel Attacks with Prefetching

Author: Jiang, Fang, Tong, Fei, Wang, Hongyu, Cheng, Xiaoyu, Zhou, Zhe, Ling, Ming, and Mao, Yuxing
Subjects: Computer Science - Cryptography and Security, Computer Science - Hardware Architecture
Abstract: To defend against conflict-based cache side-channel attacks, cache partitioning or remapping techniques were proposed to prevent set conflicts between different security domains or obfuscate the locations of such conflicts. But such techniques complicate cache design and may result in significant performance penalties. Therefore, lightweight prefetching-based schemes have been proposed to introduce noise that confuses attackers' observations. However, we have validated experimentally that relying on prefetching to only introduce noise is insufficient, as attackers can still reliably distinguish the victim's cache accesses. This paper proposes a novel prefetching-based scheme, called PCG. It combines adding victim-irrelevant cache occupancy changes and reducing victim-relevant cache occupancy changes to disrupt attackers by generating noisy and indistinguishable cache access patterns. Additionally, PCG can either work independently or be seamlessly integrated with most of the commonly used prefetchers. We have implemented and evaluated PCG in both gem5 and the open-source RISC-V core BOOMv3. The evaluation results show that PCG's security is more robust than that of the existing solutions, without resulting in significant performance degradation. According to the evaluation based on the SPEC CPU 2017 benchmark suite, PCG even shows an average performance improvement of about 1.64%. Moreover, it incurs only 1.26% overhead on hardware resource consumption.
Comment: 12 pages, 9 figures, submitting to a journal
Published: 2024

15. Prompt-Guided Generation of Structured Chest X-Ray Report Using a Pre-trained LLM

Author: Li, Hongzhao, Wang, Hongyu, Sun, Xia, He, Hua, and Feng, Jun
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia
Abstract: Medical report generation automates radiology descriptions from images, easing the burden on physicians and minimizing errors. However, current methods lack structured outputs and physician interactivity for clear, clinically relevant reports. Our method introduces a prompt-guided approach to generate structured chest X-ray reports using a pre-trained large language model (LLM). First, we identify anatomical regions in chest X-rays to generate focused sentences that center on key visual elements, thereby establishing a structured report foundation with anatomy-based sentences. We also convert the detected anatomy into textual prompts conveying anatomical comprehension to the LLM. Additionally, the clinical context prompts guide the LLM to emphasize interactivity and clinical requirements. By integrating anatomy-focused sentences and anatomy/clinical prompts, the pre-trained LLM can generate structured chest X-ray reports tailored to prompted anatomical regions and clinical contexts. We evaluate using language generation and clinical effectiveness metrics, demonstrating strong performance.
Comment: Accepted by IEEE Conference on Multimedia Expo 2024
Published: 2024

16. Illicit Promotion on Twitter

Author: Wang, Hongyu, Li, Ying, Huang, Ronghong, and Mi, Xianghang
Subjects: Computer Science - Cryptography and Security, Computer Science - Social and Information Networks
Abstract: In this paper, we present an extensive study of the promotion of illicit goods and services on Twitter, a popular online social network (OSN). This study is made possible through the design and implementation of multiple novel tools for detecting and analyzing illicit promotion activities as well as their underlying campaigns. As a result, we observe that illicit promotion is prevalent on Twitter, with a noticeable presence on three other popular OSNs: YouTube, Facebook, and TikTok. In particular, 12 million distinct posts of illicit promotion (PIPs) have been observed on the Twitter platform, which are widely distributed across 5 major natural languages and 10 categories of illicit goods and services, e.g., drugs, data leakage, gambling, and weapon sales. We also observe 580K Twitter accounts publishing PIPs as well as 37K distinct instant messaging (IM) accounts that are embedded in PIPs and serve as next hops of communication, which strongly indicates that the campaigns underpinning PIPs are also of a large scale. An arms race between Twitter and illicit promotion operators is also observed. On one hand, Twitter is observed to conduct content moderation in a continuous manner, and almost 80% of PIPs are gradually unpublished within six months of being posted. On the other hand, miscreants adopt various evasion tactics to disguise their PIPs, which keeps more than 90% of PIPs hidden from the detection radar for two months or longer.
Published: 2024

17. Verification and Refinement of Local Difference in TC4 Hollow Blade Pressing with Low Melting Point Alloy Mandrel

Author: Liu, Menghan, Wang, Hongyu, Yang, Dehui, Li, Yuanyuan, Sun, Jie, and Zhang, Shunhu
Published: 2024

18. Holes position prediction and inverse design on complex surface in deep-drawing process with sand dies based on NURBS and deformation mathematical zoning

Author: Li, Yuanyuan, Wang, Hongyu, Liu, Menghan, Yang, Dehui, Sun, Jie, Zhang, Shunhu, and Ma, Xiangkun
Published: 2024

19. The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Author: Ma, Shuming, Wang, Hongyu, Ma, Lingxiao, Wang, Lei, Wang, Wenhui, Huang, Shaohan, Dong, Li, Wang, Ruiping, Xue, Jilong, and Wei, Furu
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}. It matches the full-precision (i.e., FP16 or BF16) Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance, while being significantly more cost-effective in terms of latency, memory, throughput, and energy consumption. More profoundly, the 1.58-bit LLM defines a new scaling law and recipe for training new generations of LLMs that are both high-performance and cost-effective. Furthermore, it enables a new computation paradigm and opens the door for designing specific hardware optimized for 1-bit LLMs.
Comment: Work in progress
Published: 2024

20. SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

Author: Li, Mingrui, Liu, Shuhong, Zhou, Heng, Zhu, Guohao, Cheng, Na, Deng, Tianchen, and Wang, Hongyu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Robotics
Abstract: We present SGS-SLAM, the first semantic visual SLAM system based on Gaussian Splatting. It incorporates appearance, geometry, and semantic features through multi-channel optimization, addressing the oversmoothing limitations of neural implicit SLAM systems in high-quality rendering, scene understanding, and object-level geometry. We introduce a unique semantic feature loss that effectively compensates for the shortcomings of traditional depth and color losses in object optimization. Through a semantic-guided keyframe selection strategy, we prevent erroneous reconstructions caused by cumulative errors. Extensive experiments demonstrate that SGS-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, precise semantic segmentation, and object-level geometric accuracy, while ensuring real-time rendering capabilities.
Published: 2024

21. DDN-SLAM: Real-time Dense Dynamic Neural Implicit SLAM

Author: Li, Mingrui, Zhou, Yiming, Jiang, Guangan, Deng, Tianchen, Wang, Yangyang, and Wang, Hongyu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Robotics
Abstract: SLAM systems based on NeRF have demonstrated superior performance in rendering quality and scene reconstruction for static environments compared to traditional dense SLAM. However, they encounter tracking drift and mapping errors in real-world scenarios with dynamic interferences. To address these issues, we introduce DDN-SLAM, the first real-time dense dynamic neural implicit SLAM system integrating semantic features. To address dynamic tracking interferences, we propose a feature point segmentation method that combines semantic features with a mixed Gaussian distribution model. To avoid incorrect background removal, we propose a mapping strategy based on sparse point cloud sampling and background restoration. We propose a dynamic semantic loss to eliminate dynamic occlusions. Experimental results demonstrate that DDN-SLAM is capable of robustly tracking and producing high-quality reconstructions in dynamic environments, while appropriately preserving potential dynamic objects. Compared to existing neural implicit SLAM systems, the tracking results on dynamic datasets indicate an average 90% improvement in Average Trajectory Error (ATE) accuracy.
Comment: 11 pages, 4 figures
Published: 2024

22. Temporal Adaptive RGBT Tracking with Modality Prompt

Author: Wang, Hongyu, Liu, Xiaotao, Li, Yifan, Sun, Meng, Yuan, Dian, and Liu, Jing
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: RGBT tracking has been widely used in various fields such as robotics, surveillance processing, and autonomous driving. Existing RGBT trackers fully explore the spatial information between the template and the search region and locate the target based on the appearance matching results. However, these RGBT trackers have very limited exploitation of temporal information, either ignoring temporal information or exploiting it through online sampling and training. The former struggles to cope with object state changes, while the latter neglects the correlation between spatial and temporal information. To alleviate these limitations, we propose a novel Temporal Adaptive RGBT Tracking framework, named TATrack. TATrack has a spatio-temporal two-stream structure and captures temporal information by an online updated template, where the two-stream structure refers to the multi-modal feature extraction and cross-modal interaction for the initial template and the online update template respectively. TATrack comprehensively exploits spatio-temporal information and multi-modal information for target localization. In addition, we design a spatio-temporal interaction (STI) mechanism that bridges two branches and enables cross-modal interaction to span longer time scales. Extensive experiments on three popular RGBT tracking benchmarks show that our method achieves state-of-the-art performance, while running at real-time speed.
Published: 2024

23. PLE-SLAM: A Visual-Inertial SLAM Based on Point-Line Features and Efficient IMU Initialization

Author: He, Jiaming, Li, Mingrui, Wang, Yangyang, and Wang, Hongyu
Subjects: Computer Science - Robotics
Abstract: Visual-inertial SLAM is crucial in various fields, such as aerial vehicles, industrial robots, and autonomous driving. The fusion of camera and inertial measurement unit (IMU) makes up for the shortcomings of a single sensor, which significantly improves the accuracy and robustness of localization in challenging environments. This article presents PLE-SLAM, an accurate and real-time visual-inertial SLAM algorithm based on point-line features and efficient IMU initialization. First, we use parallel computing methods to extract features and compute descriptors to ensure real-time performance. Adjacent short line segments are merged into long line segments, and isolated short line segments are directly deleted. Second, a rotation-translation-decoupled initialization method is extended to use both points and lines. Gyroscope bias is optimized by tightly coupling IMU measurements and image observations. Accelerometer bias and gravity direction are solved by an analytical method for efficiency. To improve the system's intelligence in handling complex environments, a scheme that leverages semantic information and geometric constraints to eliminate dynamic features and a solution for loop detection and closed-loop frame pose estimation using CNN and GNN are integrated into the system. All networks are accelerated to ensure real-time performance. The experiment results on public datasets illustrate that PLE-SLAM is one of the state-of-the-art visual-inertial SLAM systems.
Published: 2024

24. Chronic GLP1 therapy reduces postprandial IL6 in obese humans with prediabetes.

Author: Hamidi, Vala, Wang, Hongyu, Pham, Vi, Bermudez Saint Andre, Karla, Taegtmeyer, Heinrich, and Gutierrez, Absalon
Subjects: GLP1, IL6, inflammation, obesity, prediabetes
Abstract: Single-dose glucagon-like peptide 1 (GLP1) therapy increases postprandial plasma IL6 levels in prediabetic, obese humans. GLP1-IL6 interactions underlie multiple antidiabetic effects, but these may differ after acute versus chronic therapy. This study examines postprandial effects of GLP1 after chronic therapy. Seven humans (six Black) with prediabetes and obesity completed 6 weeks of exenatide extended release therapy. Then subjects returned for pre- and post-meal measurements of plasma IL6, GLP1, glucagon, and related inflammatory markers. Weight, which was measured before and after therapy, did not change. Plasma IL6 decreased from baseline to postmeal state (P = 0.016), with decreases in free fatty acids (P
Published: 2024

25. A New Perspective on Speaker Verification: Joint Modeling with DFSMN and Transformer

Author: Wang, Hongyu, Li, Hui, and Li, Bo
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Speaker verification judges the similarity between two unknown voices in an open set, where the ideal speaker embedding should be able to condense discriminant information into a compact utterance-level representation that has small intra-speaker distances and large inter-speaker distances. We propose Voice Transformer (VOT), a novel model for speaker verification, which integrates parallel transformers at multiple scales. A deep feedforward sequential memory network (DFSMN) is incorporated into the attention part of these transformers to increase feature granularity. The attentive statistics pooling layer is added to focus on important frames and form utterance-level features. We propose Additive Angular Margin Focal Loss (AAMF) to address the hard samples problem. We evaluate the proposed approach on the VoxCeleb1 and CN-Celeb2 datasets, demonstrating that VOT surpasses most mainstream models. The code is available on GitHub at https://github.com/luckyerr/Voice-Transformer_Speaker-Verification.
Comment: 5 pages, 4 figures, 3 tables
Published: 2023

26. Improvement for tolerance of lithium metal counter electrodes towards sodium contamination in hybrid Li/Na-ion electrolytes

Author: Liu, Liyang, Zhao, Xufan, Qi, Jiaxing, Abdussalam, Abubakar, Zhang, Wei, Wang, Hongyu, and Xu, Guobao
Published: 2024
Full Text: View/download PDF

27. Multiple hypergraph convolutional network social recommendation using dual contrastive learning

Author: Wang, Hongyu, Zhou, Wei, Wen, Junhao, and Qiao, Shutong
Published: 2024
Full Text: View/download PDF

28. Who's Watching Me?: Exploring the Impact of Audience Familiarity on Player Performance, Experience, and Exertion in Virtual Reality Exergames

Author: Guo, Zixuan, Xu, Wenge, Zhang, Jialin, Wang, Hongyu, Lo, Cheng-Hung, and Liang, Hai-Ning
Subjects: Computer Science - Human-Computer Interaction
Abstract: Familiarity with audiences plays a significant role in shaping individual performance and experience across various activities in everyday life. This study delves into the impact of familiarity with non-playable character (NPC) audiences on player performance and experience in virtual reality (VR) exergames. By manipulating NPC appearance (face and body shape) and voice familiarity, we explored their effect on game performance, experience, and exertion. The findings reveal that familiar NPC audiences have a positive impact on performance, creating a more enjoyable gaming experience, and leading players to perceive less exertion. Moreover, individuals with higher levels of self-consciousness exhibit heightened sensitivity to the familiarity with NPC audiences. Our results shed light on the role of familiar NPC audiences in enhancing player experiences and provide insights for designing more engaging and personalized VR exergame environments., Comment: 10 pages, 5 figures, IEEE International Symposium on Mixed and Augmented Reality (ISMAR) 2023
Published: 2023

29. BitNet: Scaling 1-bit Transformers for Large Language Models

Author: Wang, Hongyu, Ma, Shuming, Dong, Li, Huang, Shaohan, Wang, Huaijie, Ma, Lingxiao, Yang, Fan, Wang, Ruiping, Wu, Yi, and Wei, Furu
Subjects: Computer Science - Computation and Language
Abstract: The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption. In this work, we introduce BitNet, a scalable and stable 1-bit Transformer architecture designed for large language models. Specifically, we introduce BitLinear as a drop-in replacement of the nn.Linear layer in order to train 1-bit weights from scratch. Experimental results on language modeling show that BitNet achieves competitive performance while substantially reducing memory footprint and energy consumption, compared to state-of-the-art 8-bit quantization methods and FP16 Transformer baselines. Furthermore, BitNet exhibits a scaling law akin to full-precision Transformers, suggesting its potential for effective scaling to even larger language models while maintaining efficiency and performance benefits., Comment: Work in progress
Published: 2023

30. The Gehring-Hayman type theorem on pseudoconvex domains of finite type in $\mathbb{C}^2$

Author: Li, Haichou, Pu, Xingsi, and Wang, Hongyu
Subjects: Mathematics - Complex Variables
Abstract: In this paper, we obtain the Gehring-Hayman type theorem on smoothly bounded pseudoconvex domains of finite type in $\mathbb{C}^2$. As an application, we provide a quantitative comparison between global and local Kobayashi distances near a boundary point for these domains., Comment: arXiv admin note: substantial text overlap with arXiv:2301.06411
Published: 2023

31. Electrical Characteristics of the GEC Reference Cell with Impedance Matching: A Two-Dimensional PIC/MCC Modeling Study

Author: Chen, Zili, Wang, Hongyu, Yu, Shimin, Wang, Yu, Chen, Zhipeng, Jiang, Wei, Schulze, Julian, and Zhang, Ya
Subjects: Physics - Plasma Physics
Abstract: In this paper, the electrical characteristics of the Gaseous Electronics Conference (GEC) reference cell with impedance matching are investigated through a two-dimensional electrostatic implicit Particle-in-Cell/Monte Carlo Collision (PIC/MCC) model in an axisymmetric coordinate system. The coupling between the complex reactor geometry and the external circuit is included via an equivalent capacitance calculated from the electric energy density. The results of this model are compared with experimental measurements and other model calculations and show good agreement. This simulation obtains the plasma kinetics of the capacitively coupled discharge process at low pressure and detailed external circuit responses, including power transmission, reflection, and higher-order harmonics in the circuit, which provides important insights for impedance-matching design in semiconductor plasma processing.
Published: 2023

32. Effect of annealing temperature on microstructure and memory properties of Fe-Mn-Si-Cr-Ni alloy prepared by additive manufacturing

Author: Yin, Xiaoqin, Song, Laidong, Zhu, Jian, Wang, Hongyu, and Zhang, Qin
Published: 2024
Full Text: View/download PDF

33. Efficacy of Zengye Chengqi decoction combined with olanzapine in the treatment of schizophrenia of Yangming Fushi syndrome

Author: Wang Weili, Deng Li, Wang Hongyu, Yang Shichang, and Cui Guimei
Subjects: zengye chengqi decoction, olanzapine, schizophrenia, yangming fushi syndrome, event-related potential p300, Psychology, BF1-990, Psychiatry, RC435-571
Abstract: Background: Patients with schizophrenia of Yangming Fushi syndrome experience more severe symptoms, and a substantial proportion of patients derive inadequate benefit from antipsychotics and suffer from serious adverse effects, yet few studies have been conducted on the treatment of schizophrenia of Yangming Fushi syndrome with Zengye Chengqi decoction. Objective: To explore the efficacy of Zengye Chengqi decoction combined with olanzapine in the treatment of schizophrenia of Yangming Fushi syndrome, in order to provide references for the treatment of schizophrenia with the combination of traditional Chinese and western medicine. Methods: A total of 60 patients attending the Second Affiliated Hospital of Xinxiang Medical College from January 2022 to August 2023 and fulfilling the International Classification of Diseases (ICD-10) diagnostic criteria for schizophrenia were enrolled, and assigned into study group (n=30) and control group (n=30) using random number table methods. All patients were treated with olanzapine, and study group was given Zengye Chengqi decoction on this basis. Treatment for both groups lasted for 4 weeks. All participants were assessed using Positive and Negative Syndrome Scale (PANSS), Montreal Cognitive Assessment (MoCA) and Event-Related Potential P300 at baseline and end of treatment. The occurrence of adverse reactions was recorded at the end of treatment. Results: Study group reported a higher treatment effective rate compared with control group (χ2=9.320, P=0.002). After treatment, study group detected a significant reduction in PANSS subscales and total scores (F=10.287, 8.258, 8.844, 20.079, P
Published: 2024
Full Text: View/download PDF

34. PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Author: Zhang, Chenrui, Liu, Lin, Wang, Jinpeng, Wang, Chuyuan, Sun, Xiao, Wang, Hongyu, and Cai, Mingchen
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: As an effective tool for eliciting the power of Large Language Models (LLMs), prompting has recently demonstrated unprecedented abilities across a variety of complex tasks. To further improve the performance, prompt ensemble has attracted substantial interest for tackling the hallucination and instability of LLMs. However, existing methods usually adopt a two-stage paradigm, which requires a pre-prepared set of prompts with substantial manual effort, and is unable to perform directed optimization for different weak learners. In this paper, we propose a simple, universal, and automatic method named PREFER (Prompt Ensemble learning via Feedback-Reflect-Refine) to address the stated limitations. Specifically, given the fact that weak learners are supposed to focus on hard examples during boosting, PREFER builds a feedback mechanism for reflecting on the inadequacies of existing weak learners. Based on this, the LLM is required to automatically synthesize new prompts for iterative refinement. Moreover, to enhance stability of the prompt effect evaluation, we propose a novel prompt bagging method involving forward and backward thinking, which is superior to majority voting and is beneficial for both feedback and weight calculation in boosting. Extensive experiments demonstrate that our PREFER achieves state-of-the-art performance in multiple types of tasks by a significant margin. We have made our code publicly available., Comment: 8 pages, 4 figures
Published: 2023

35. Noise Reduction in Wind Turbine Airfoils with Serrated Trailing Edges: An Experimental Study in Low Turbulence Wind Tunnel

Author: Xue, Weicheng, Wang, Hongyu, Chen, Zhe, and Yang, Bing
Subjects: Physics - Fluid Dynamics
Abstract: This study explores the noise reduction achieved by airfoils with serrated trailing edges in a low turbulence wind tunnel, focusing on acoustic spectral characteristics and wake flow field measurements. We analyze the effects of various factors, including Reynolds number, angle of attack, serration parameters, and model type, on sound power levels and far-field radiation patterns. Our findings reveal that serrated trailing edges significantly reduce noise across a broader frequency range than previously documented, particularly in the mid-to-high frequency range, with reductions bounded by Strouhal numbers $St_u = 1$ and $St_l = 0.48$. Interestingly, the serration geometry exhibits minimal impact on noise reduction, which varies with the angle of attack and airfoil profile across all tested conditions. Additionally, while serrations effectively lower noise levels, especially at higher frequencies, they do not significantly alter the airfoil's acoustic directivity patterns. Measurements of wake flow velocity spectra demonstrate a clear correlation between reduced wake turbulence and noise reduction, as serrated edges decrease the power spectral density of turbulent velocity fluctuations, effectively disrupting larger vortex structures responsible for noise generation. These valuable insights contribute to understanding the aerodynamic and acoustic benefits of serrated trailing edges, warranting further experimental validation in future studies.
Published: 2023

36. Fourier-transformed gauge theory models of three-dimensional topological orders with gapped boundaries

Author: Wang, Siyuan, Chen, Yanyan, Wang, Hongyu, Hu, Yuting, and Wan, Yidun
Subjects: Condensed Matter - Strongly Correlated Electrons, High Energy Physics - Theory, Mathematical Physics
Abstract: In this paper, we apply the method of Fourier transform and basis rewriting developed in arXiv:1910.13441 for the two-dimensional quantum double model of topological orders to the three-dimensional gauge theory model (with a gauge group $G$) of three-dimensional topological orders. We find that the gapped boundary condition of the gauge theory model is characterized by a Frobenius algebra in the representation category $\mathcal Rep(G)$ of $G$, which also describes the charge splitting and condensation on the boundary. We also show that our Fourier transform maps the three-dimensional gauge theory model with input data $G$ to the Walker-Wang model with input data $\mathcal Rep(G)$ on a trivalent lattice with dangling edges, after truncating the Hilbert space by projecting all dangling edges to the trivial representation of $G$. This Fourier transform also provides a systematic construction of the gapped boundary theory of the Walker-Wang model. This establishes a correspondence between two types of topological field theories: the extended Dijkgraaf-Witten and extended Crane-Yetter theories., Comment: 39 pages, 9 figures
Published: 2023

37. Hyperspectral Target Detection Based on Low-Rank Background Subspace Learning and Graph Laplacian Regularization

Author: Shen, Dunbin, Ma, Xiaorui, Kong, Wenfeng, Tian, Jiacheng, and Wang, Hongyu
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Hyperspectral target detection is good at finding dim and small objects based on spectral characteristics. However, existing representation-based methods are hindered by the problem of the unknown background dictionary and insufficient utilization of spatial information. To address these issues, this paper proposes an efficient optimization approach based on low-rank representation (LRR) and graph Laplacian regularization (GLR). Firstly, to obtain a complete and pure background dictionary, we propose an LRR-based background subspace learning method by jointly mining the low-dimensional structure of all pixels. Secondly, to fully exploit local spatial relationships and capture the underlying geometric structure, a local region-based GLR is employed to estimate the coefficients. Finally, the desired detection map is generated by computing the ratio of representation errors from binary hypothesis testing. The experiments conducted on two benchmark datasets validate the effectiveness and superiority of the approach. For reproduction, the accompanying code is available at https://github.com/shendb2022/LRBSL-GLR., Comment: 4 pages, 3 figures, 1 table
Published: 2023

38. Association of diabetes and white blood cell count with stroke in patients with carotid artery dissection

Author: Zhao, Meng, Zhong, Xuemin, Du, Jiaxiu, Li, Li, Wang, Jian, and Wang, Hongyu
Published: 2024
Full Text: View/download PDF

39. miR-455/GREM1 axis promotes colorectal cancer progression and liver metastasis by affecting PI3K/AKT pathway and inducing M2 macrophage polarization

Author: Dai, Shipeng, Xu, Fan, Xu, Xiaozhang, Huang, Tian, Wang, Yiming, Wang, Hongyu, Xie, Yucheng, Yue, Lei, Zhao, Wenhu, Xia, Yongxiang, Gu, Jian, and Qian, Xiaofeng
Published: 2024
Full Text: View/download PDF

40. Knockdown of trem2 promotes proinflammatory microglia and inhibits glioma progression via the JAK2/STAT3 and NF-κB pathways

Author: Yan, Yunji, Bai, Shengwei, Han, Hongxi, Dai, Junqiang, Niu, Liang, Wang, Hongyu, Dong, Qiang, Yin, Hang, Yuan, Guoqiang, and Pan, Yawen
Published: 2024
Full Text: View/download PDF

41. Efficacy of expanded periurethral cleansing in reducing catheter-associated urinary tract infection in comatose patients: a randomized controlled clinical trial

Author: Qin, Xingsong, Zhao, He, Qin, Wei, Qin, Xinglei, Shen, Songying, and Wang, Hongyu
Published: 2024
Full Text: View/download PDF

42. Multi-modal molecular determinants of clinically relevant osteoporosis subtypes

Author: Yuan, Chunchun, Yu, Xiang-Tian, Wang, Jing, Shu, Bing, Wang, Xiao-Yun, Huang, Chen, Lv, Xia, Peng, Qian-Qian, Qi, Wen-Hao, Zhang, Jing, Zheng, Yan, Wang, Si-Jia, Liang, Qian-Qian, Shi, Qi, Li, Ting, Huang, He, Mei, Zhen-Dong, Zhang, Hai-Tao, Xu, Hong-Bin, Cui, Jiarui, Wang, Hongyu, Zhang, Hong, Shi, Bin-Hao, Sun, Pan, Zhang, Hui, Ma, Zhao-Long, Feng, Yuan, Chen, Luonan, Zeng, Tao, Tang, De-Zhi, and Wang, Yong-Jun
Published: 2024
Full Text: View/download PDF

43. Melatonin as an immunomodulator in CD19-targeting CAR-T cell therapy: managing cytokine release syndrome

Author: Zheng, Na, Long, Yihao, Bai, Zixuan, Li, Jianing, Wang, Hongyu, Song, Dan-Dan, Liu, Hong-Lin, Shi, Jian-Hong, and Zhao, Shuli
Published: 2024
Full Text: View/download PDF

44. Utilizing fNIRS to investigate the impact of Baduanjin training on attentional function in post-stroke cognitive impairment patients: a study protocol for a randomized controlled trial

Author: Zhou, Xingchen, Wan, Yiwen, Xu, Zhengxian, Yu, Cancan, Wu, Ziyi, Zhuang, Zesen, Xia, Rui, Wang, Hongyu, and Chen, Shangjie
Published: 2024
Full Text: View/download PDF

45. Long non-coding RNAs and their potential function in response to postharvest senescence of Sparassis latifolia during cold storage

Author: Weng, Mengting, Zhang, Di, Wang, Hongyu, Yang, Chi, Lin, Hongyi, Pan, Yanfang, and Lin, Yanquan
Published: 2024
Full Text: View/download PDF

46. Simultaneous determination of chloramphenicol, stilbenes, and resorcylic acid lactones in pork using UPLC–MS/MS with a C18 cartridge and immunoaffinity microextraction in a packed syringe

Author: Zhao, Shijin, Fu, Leiming, Yang, Linyan, Li, Na, Zhang, Xinda, Liu, Chang, Wang, Hongyu, Zhang, Yan, Guo, Yongze, and Li, Cun
Published: 2024
Full Text: View/download PDF

47. DLRD: dual-level network for rumor detection on geo-textual data

Author: Wang, Hongyu, Li, Ke, and Shang, Shuo
Published: 2024
Full Text: View/download PDF

48. CPU-GPU Heterogeneous Code Acceleration of a Finite Volume Computational Fluid Dynamics Solver

Author: Xue, Weicheng, Wang, Hongyu, and Roy, Christopher J.
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Performance
Abstract: This work deals with the CPU-GPU heterogeneous code acceleration of a finite-volume CFD solver utilizing multiple CPUs and GPUs at the same time. First, a high-level description of the CFD solver called SENSEI, the discretization of SENSEI, and the CPU-GPU heterogeneous computing workflow in SENSEI leveraging MPI and OpenACC are given. Then, a performance model for CPU-GPU heterogeneous computing requiring ghost cell exchange is proposed to help estimate the performance of the heterogeneous implementation. The scaling performance of the CPU-GPU heterogeneous computing and its comparison with the pure multi-CPU/GPU performance for a supersonic inlet test case is presented to display the advantages of leveraging the computational power of both the CPU and the GPU. Using CPUs and GPUs as workers together, the performance can be improved further compared to using pure CPUs or GPUs, and the advantages can be fairly estimated by the performance model proposed in this work. Finally, conclusions are drawn to provide 1) suggestions for application users interested in leveraging the computational power of the CPU and GPU to accelerate their own scientific computing simulations and 2) feedback for hardware architects interested in designing a better CPU-GPU heterogeneous system for heterogeneous computing.
Published: 2023

49. On closed almost complex four manifolds

Author: Wang, Hongyu, Wang, Ken, and Zhu, Peng
Subjects: Mathematics - Differential Geometry, 53D35, 53C56, 53C65
Abstract: In this paper, by using the weakly $\widetilde{\mathcal{D}}^+_J$ (resp. $\mathcal{D}^+_J$)-closed technique first introduced by Tan, Wang, Zhou and Zhu, we will give a characterization of tamed and weakened tamed four-manifolds, and an almost Kaehler version of Nakai-Moishezon criterion for almost complex four-manifolds., Comment: arXiv admin note: text overlap with arXiv:1712.02948
Published: 2023

50. Symmetry Fractionalized (Irrationalized) Fusion Rules and Two Domain-Wall Verlinde Formulae

Author: Zhao, Yu, Wang, Hongyu, Hu, Yuting, and Wan, Yidun
Subjects: Condensed Matter - Strongly Correlated Electrons, High Energy Physics - Theory, Mathematical Physics
Abstract: We investigate the composite systems consisting of topological orders separated by gapped domain walls. We derive a pair of domain-wall Verlinde formulae, that elucidate the connection between the braiding of interdomain excitations labeled by pairs of anyons in different domains and quasiparticles in the gapped domain wall with their respective fusion rules. Through explicit non-Abelian examples, we showcase the calculation of such braiding and fusion, revealing that the fusion rules for interdomain excitations are generally fractional or irrational. By investigating the correspondence between composite systems and anyon condensation, we unveil the reason for designating these fusion rules as symmetry fractionalized (irrationalized) fusion rules. Our findings hold promise for applications across various fields, such as topological quantum computation, topological field theory, and conformal field theory., Comment: 14 pages, 7 figures
Published: 2023
Full Text: View/download PDF
