Author: "wang, Yu" / Publication Year Range: Last 50 years - Searchworks@Jio Institute Digital Library Search Results

Your search for author "wang, Yu" returned 135,451 results

1. Serological diagnosis of Parabronema skrjabini infection using a recombinant antigen in bactrian camels

Author: Chen, Xindi, Feng, Chenchen, Wang, Jinling, Wang, Yu, Wang, Tengyu, Liu, Chunxia, and Wang, Wenlong
Published: 2023
Full Text: View/download PDF

2. Listening to the Enemy: Radio Consumption and Technological Culture in Maoist China, 1949–1965

Author: Wang, Yu
Published: 2022
Full Text: View/download PDF

3. On the Hochschild homology of singularity categories

Author: Wang, Yu, Arunachalam, Umamaheswaran, and Keller, Bernhard
Subjects: Mathematics, QA1-939
Abstract: Let $k$ be an algebraically closed field and $A$ a finite-dimensional $k$-algebra. In this note, we determine complexes which compute the Hochschild homology of the canonical dg enhancement of the bounded derived category of $A$ and of the canonical dg enhancement of the singularity category of $A$. As an application, we obtain a new approach to the computation of Hochschild homology of Leavitt path algebras.
Published: 2022
Full Text: View/download PDF

4. Effectiveness Evaluation of Recombinant Antigen rCPI For iELISA Detection of Camel Parabronemiasis

Author: Wang, Yu, Feng, Chenchen, Liu, Chunxia, Li, Jianyun, and Wang, Wenlong
Published: 2021
Full Text: View/download PDF

5. Multi-UAV Behavior-based Formation with Static and Dynamic Obstacles Avoidance via Reinforcement Learning

Author: Xie, Yuqing, Yu, Chao, Zang, Hongzhi, Gao, Feng, Tang, Wenhao, Huang, Jingyi, Chen, Jiayu, Xu, Botian, Wu, Yi, and Wang, Yu
Subjects: Computer Science - Robotics
Abstract: Formation control of multiple Unmanned Aerial Vehicles (UAVs) is vital for practical applications. This paper tackles the task of behavior-based UAV formation while avoiding static and dynamic obstacles during directed flight. We present a two-stage reinforcement learning (RL) training pipeline to tackle the challenge of multi-objective optimization, large exploration spaces, and the sim-to-real gap. The first stage searches in a simplified scenario for a linear utility function that balances all task objectives simultaneously, whereas the second stage applies the utility function in complex scenarios, utilizing curriculum learning to navigate large exploration spaces. Additionally, we apply an attention-based observation encoder to enhance formation maintenance and manage varying obstacle quantity. Experiments in simulation and real world demonstrate that our method outperforms planning-based and RL-based baselines regarding collision-free rate and formation maintenance in scenarios with static, dynamic, and mixed obstacles.
Published: 2024

6. A Comprehensive Analysis of Social Tie Strength: Definitions, Prediction Methods, and Future Directions

Author: Cheng, Xueqi, Yang, Catherine, Zhao, Yuying, Wang, Yu, Karimi, Hamid, and Derr, Tyler
Subjects: Computer Science - Social and Information Networks
Abstract: The rapid growth of online social networks has underscored the importance of understanding the intensity of user relationships, referred to as "tie strength." Over the past few decades, extensive efforts have been made to assess tie strength in networks. However, the lack of ground-truth tie strength labels and the differing perspectives on tie strength among researchers have complicated the development of effective prediction methods for real-world applications. In our study, we first categorize mainstream understandings of tie strength into seven standardized definitions and verify their effectiveness by investigating the class distributions and correlations across these definitions. We also draw key insights into tie resilience from the perspective of tie dissolution that (1) stronger ties are more resilient than weaker ones, and (2) this tie resiliency ratio increases as the network evolves. We then conduct extensive experiments to evaluate existing tie strength prediction methods under these definitions, revealing that (1) neural network methods capable of learning from semantic features hold great potential for high performance, (2) models struggle under definitions that offer limited understandings of tie strength in the network, (3) existing models face imbalance issues that cannot be addressed by traditional quantity imbalance techniques, and (4) different definitions of tie strength allow for the inference of not only the current state but also the future state of a tie. Building on these findings, we propose strategies to improve existing methods and suggest several promising directions for future research.
Published: 2024

7. Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models

Author: Cao, He, Luo, Weidi, Wang, Yu, Liu, Zijing, Feng, Bing, Yao, Yuan, and Li, Yu
Subjects: Computer Science - Artificial Intelligence
Abstract: With the extensive deployment of Large Language Models (LLMs), ensuring their safety has become increasingly critical. However, existing defense methods often struggle with two key issues: (i) inadequate defense capabilities, particularly in domain-specific scenarios like chemistry, where a lack of specialized knowledge can lead to the generation of harmful responses to malicious queries. (ii) over-defensiveness, which compromises the general utility and responsiveness of LLMs. To mitigate these issues, we introduce a multi-agents-based defense framework, Guide for Defense (G4D), which leverages accurate external information to provide an unbiased summary of user intentions and analytically grounded safety response guidance. Extensive experiments on popular jailbreak attacks and benign datasets show that our G4D can enhance LLM's robustness against jailbreak attacks on general and domain-specific scenarios without compromising the model's general functionality.
Published: 2024

8. ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents

Author: Liao, Yusheng, Jiang, Shuyang, Wang, Yanfeng, and Wang, Yu
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) have shown promising potential in the medical domain, assisting with tasks like clinical note generation and patient communication. However, current LLMs are limited to text-based communication, hindering their ability to interact with diverse forms of information in clinical environments. Despite clinical agents succeeding in diverse signal interaction, they are oriented to a single clinical scenario and hence fail for broader applications. To evaluate clinical agents holistically, we propose ClinicalAgent Bench (CAB), a comprehensive medical agent benchmark consisting of 18 tasks across five key realistic clinical dimensions. Building on this, we introduce ReflecTool, a novel framework that excels at utilizing domain-specific tools within two stages. The first optimization stage progressively enlarges a long-term memory by saving successful solving processes and tool-wise experience of agents in a tiny pre-defined training set. In the following inference stage, ReflecTool can search for supportive successful demonstrations from already built long-term memory to guide the tool selection strategy, and a verifier improves the tool usage according to the tool-wise experience with two verification methods: iterative refinement and candidate selection. Extensive experiments on ClinicalAgent Benchmark demonstrate that ReflecTool surpasses the pure LLMs with more than 10 points and the well-established agent-based methods with 3 points, highlighting its adaptability and effectiveness in solving complex clinical tasks. Comment: 20 pages
Published: 2024

9. PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting

Author: Wang, Yu, Wei, Xiaobao, Lu, Ming, and Kang, Guoliang
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Previous methods utilize the Neural Radiance Field (NeRF) for panoptic lifting, while their training and rendering speed are unsatisfactory. In contrast, 3D Gaussian Splatting (3DGS) has emerged as a prominent technique due to its rapid training and rendering speed. However, unlike NeRF, the conventional 3DGS may not satisfy the basic smoothness assumption as it does not rely on any parameterized structures to render (e.g., MLPs). Consequently, the conventional 3DGS is, in nature, more susceptible to noisy 2D mask supervision. In this paper, we propose a new method called PLGS that enables 3DGS to generate consistent panoptic segmentation masks from noisy 2D segmentation masks while maintaining superior efficiency compared to NeRF-based methods. Specifically, we build a panoptic-aware structured 3D Gaussian model to introduce smoothness and design effective noise reduction strategies. For the semantic field, instead of initialization with structure from motion, we construct reliable semantic anchor points to initialize the 3D Gaussians. We then use these anchor points as smooth regularization during training. Additionally, we present a self-training approach using pseudo labels generated by merging the rendered masks with the noisy masks to enhance the robustness of PLGS. For the instance field, we project the 2D instance masks into 3D space and match them with oriented bounding boxes to generate cross-view consistent instance masks for supervision. Experiments on various benchmarks demonstrate that our method outperforms previous state-of-the-art methods in terms of both segmentation quality and speed.
Published: 2024

10. Error estimates between SGD with momentum and underdamped Langevin diffusion

Author: Guillin, Arnaud, Wang, Yu, Xu, Lihu, and Yang, Haoran
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Probability
Abstract: Stochastic gradient descent with momentum is a popular variant of stochastic gradient descent, which has recently been reported to have a close relationship with the underdamped Langevin diffusion. In this paper, we establish a quantitative error estimate between them in the 1-Wasserstein and total variation distances.
Published: 2024

11. Few-shot In-Context Preference Learning Using Large Language Models

Author: Yu, Chao, Lu, Hong, Gao, Jiaxuan, Tan, Qixin, Yang, Xinting, Wang, Yu, Wu, Yi, and Vinitsky, Eugene
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Designing reward functions is a core component of reinforcement learning but can be challenging for truly complex behavior. Reinforcement Learning from Human Feedback (RLHF) has been used to alleviate this challenge by replacing a hand-coded reward function with a reward function learned from preferences. However, it can be exceedingly inefficient to learn these rewards as they are often learned tabula rasa. We investigate whether Large Language Models (LLMs) can reduce this query inefficiency by converting an iterative series of human preferences into code representing the rewards. We propose In-Context Preference Learning (ICPL), a method that uses the grounding of an LLM to accelerate learning reward functions from preferences. ICPL takes the environment context and task description, synthesizes a set of reward functions, and then repeatedly updates the reward functions using human rankings of videos of the resultant policies. Using synthetic preferences, we demonstrate that ICPL is orders of magnitude more efficient than RLHF and is even competitive with methods that use ground-truth reward functions instead of preferences. Finally, we perform a series of human preference-learning trials and observe that ICPL extends beyond synthetic settings and can work effectively with humans-in-the-loop. Additional information and videos are provided at https://sites.google.com/view/few-shot-icpl/home.
Published: 2024

12. DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization

Author: Zhu, Haowei, Tang, Dehua, Liu, Ji, Lu, Mingjie, Zheng, Jintu, Peng, Jinzhang, Li, Dong, Wang, Yu, Jiang, Fan, Tian, Lu, Tiwari, Spandan, Sirasao, Ashish, Yong, Jun-Hai, Wang, Bin, and Barsoum, Emad
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Diffusion models have achieved remarkable progress in the field of image generation due to their outstanding capabilities. However, these models require substantial computing resources because of the multi-step denoising process during inference. While traditional pruning methods have been employed to optimize these models, the retraining process necessitates large-scale training datasets and extensive computational costs to maintain generalization ability, making it neither convenient nor efficient. Recent studies attempt to utilize the similarity of features across adjacent denoising stages to reduce computational costs through simple and static strategies. However, these strategies cannot fully harness the potential of the similar feature patterns across adjacent timesteps. In this work, we propose a novel pruning method that derives an efficient diffusion model via a more intelligent and differentiable pruner. At the core of our approach is casting the model pruning process into a SubNet search process. Specifically, we first introduce a SuperNet based on standard diffusion via adding some backup connections built upon the similar features. We then construct a plugin pruner network and design optimization losses to identify redundant computation. Finally, our method can identify an optimal SubNet through few-step gradient optimization and a simple post-processing procedure. We conduct extensive experiments on various diffusion models including Stable Diffusion series and DiTs. Our DiP-GO approach achieves 4.4 x speedup for SD-1.5 without any loss of accuracy, significantly outperforming the previous state-of-the-art methods.
Published: 2024

13. Large Language Model-based Augmentation for Imbalanced Node Classification on Text-Attributed Graphs

Author: Wang, Leyao, Wang, Yu, Ni, Bo, Zhao, Yuying, and Derr, Tyler
Subjects: Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: Node classification on graphs frequently encounters the challenge of class imbalance, leading to biased performance and posing significant risks in real-world applications. Although several data-centric solutions have been proposed, none of them focus on Text-Attributed Graphs (TAGs), and therefore overlook the potential of leveraging the rich semantics encoded in textual features for boosting the classification of minority nodes. Given this crucial gap, we investigate the possibility of augmenting graph data in the text space, leveraging the textual generation power of Large Language Models (LLMs) to handle imbalanced node classification on TAGs. Specifically, we propose a novel approach called LA-TAG (LLM-based Augmentation on Text-Attributed Graphs), which prompts LLMs to generate synthetic texts based on existing node texts in the graph. Furthermore, to integrate these synthetic text-attributed nodes into the graph, we introduce a text-based link predictor to connect the synthesized nodes with the existing nodes. Our experiments across multiple datasets and evaluation metrics show that our framework significantly outperforms traditional non-textual-based data augmentation strategies and specific node imbalance solutions. This highlights the promise of using LLMs to resolve imbalance issues on TAGs. Comment: 11 pages, 4 figures
Published: 2024

14. Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning

Author: Lu, Yuxiang, Cao, Shengcao, and Wang, Yu-Xiong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Vision Foundation Models (VFMs) have demonstrated outstanding performance on numerous downstream tasks. However, due to their inherent representation biases originating from different training paradigms, VFMs exhibit advantages and disadvantages across distinct vision tasks. Although amalgamating the strengths of multiple VFMs for downstream tasks is an intuitive strategy, effectively exploiting these biases remains a significant challenge. In this paper, we propose a novel and versatile "Swiss Army Knife" (SAK) solution, which adaptively distills knowledge from a committee of VFMs to enhance multi-task learning. Unlike existing methods that use a single backbone for knowledge transfer, our approach preserves the unique representation bias of each teacher by collaborating the lightweight Teacher-Specific Adapter Path modules with the Teacher-Agnostic Stem. Through dynamic selection and combination of representations with Mixture-of-Representations Routers, our SAK is capable of synergizing the complementary strengths of multiple VFMs. Extensive experiments show that our SAK remarkably outperforms prior state of the arts in multi-task learning by 10% on the NYUD-v2 benchmark, while also providing a flexible and robust framework that can readily accommodate more advanced model designs.
Published: 2024

15. SPF-EMPC Planner: A real-time multi-robot trajectory planner for complex environments with uncertainties

Author: Liu, Peng, Zhu, Pengming, Zeng, Zhiwen, Qiu, Xuekai, Wang, Yu, and Lu, Huimin
Subjects: Computer Science - Robotics
Abstract: In practical applications, the unpredictable movement of obstacles and the imprecise state observation of robots introduce significant uncertainties for the swarm of robots, especially in cluster environments. However, existing methods struggle to achieve safe navigation when uncertainties, complex environmental structures, and robot swarms must be considered together. This paper introduces an extended state model predictive control planner with a safe probability field to address the multi-robot navigation problem in complex, dynamic, and uncertain environments. First, the safe probability field offers an innovative approach to modeling the uncertainty of external dynamic obstacles, and is combined with an unconstrained optimization method to generate safe trajectories for multiple robots online. Subsequently, the extended state model predictive controller can accurately track these generated trajectories while considering the robots' inherent model constraints and state uncertainty, thus ensuring the practical feasibility of the planned trajectories. Simulation experiments show a success rate four times higher than that of state-of-the-art algorithms. Physical experiments demonstrate the method's ability to operate in real time, enabling safe navigation for multiple robots in uncertain environments.
Published: 2024

16. Disentangling Likes and Dislikes in Personalized Generative Explainable Recommendation

Author: Shimizu, Ryotaro, Wada, Takashi, Wang, Yu, Kruse, Johannes, O'Brien, Sean, HtaungKham, Sai, Song, Linxin, Yoshikawa, Yuya, Saito, Yuki, Tsung, Fugee, Goto, Masayuki, and McAuley, Julian
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Recent research on explainable recommendation generally frames the task as a standard text generation problem, and evaluates models simply based on the textual similarity between the predicted and ground-truth explanations. However, this approach fails to consider one crucial aspect of the systems: whether their outputs accurately reflect the users' (post-purchase) sentiments, i.e., whether and why they would like and/or dislike the recommended items. To shed light on this issue, we introduce new datasets and evaluation methods that focus on the users' sentiments. Specifically, we construct the datasets by explicitly extracting users' positive and negative opinions from their post-purchase reviews using an LLM, and propose to evaluate systems based on whether the generated explanations 1) align well with the users' sentiments, and 2) accurately identify both positive and negative opinions of users on the target items. We benchmark several recent models on our datasets and demonstrate that achieving strong performance on existing metrics does not ensure that the generated explanations align well with the users' sentiments. Lastly, we find that existing models can provide more sentiment-aware explanations when the users' (predicted) ratings for the target items are directly fed into the models as input. We will release our code and datasets upon acceptance.
Published: 2024

17. Soft-Matter-Based Topological Vertical Cavity Surface Emitting Lasers

Author: Wang, Yu, Xia, Shiqi, Shao, Jingbin, Xie, Qun, Yang, Donghao, Zhang, Xinzheng, Drevensek-Olenik, Irena, Wu, Qiang, Chen, Zhigang, and Xu, Jingjun
Subjects: Physics - Optics
Abstract: Polarized topological vertical cavity surface-emitting lasers (VCSELs), as stable and efficient on-chip light sources, play an important role in the next generation of optical storage and optical communications. However, most current topological lasers demand complex design and expensive fabrication processes, and their semiconductor-based structures pose challenges for flexible device applications. By use of an analogy with two-dimensional Semenov insulators in synthetic parametric space, we design and realize a one-dimensional optical superlattice (stacked polymerized cholesteric liquid crystal films and Mylar films), thereby we demonstrate a flexible, low threshold, circularly polarized topological VCSEL with high slope efficiency. We show that such a laser maintains a good single-mode property under low pump power and inherits the transverse spatial profile of the pump laser. Thanks to the soft-matter-based flexibility, our topological VCSEL can be "attached" to substrates of various shapes, enabling desired laser properties and robust beam steering even after undergoing hundreds of bends. Our results may find applications in consumer electronics, laser scanning and displays, as well as wearable devices.
Published: 2024

18. Instability of steady-state mixed-state symmetry-protected topological order to strong-to-weak spontaneous symmetry breaking

Author: Shah, Jeet, Fechisin, Christopher, Wang, Yu-Xin, Iosue, Joseph T., Watson, James D., Wang, Yan-Qi, Ware, Brayden, Gorshkov, Alexey V., and Lin, Cheng-Ju
Subjects: Quantum Physics, Condensed Matter - Statistical Mechanics, Condensed Matter - Strongly Correlated Electrons
Abstract: Recent experimental progress in controlling open quantum systems enables the pursuit of mixed-state nonequilibrium quantum phases. We investigate whether open quantum systems hosting mixed-state symmetry-protected topological states as steady states retain this property under symmetric perturbations. Focusing on the decohered cluster state -- a mixed-state symmetry-protected topological state protected by a combined strong and weak symmetry -- we construct a parent Lindbladian that hosts it as a steady state. This Lindbladian can be mapped onto exactly solvable reaction-diffusion dynamics, even in the presence of certain perturbations, allowing us to solve the parent Lindbladian in detail and reveal previously unknown steady states. Using both analytical and numerical methods, we find that typical symmetric perturbations cause strong-to-weak spontaneous symmetry breaking at arbitrarily small perturbations, destabilizing the steady-state mixed-state symmetry-protected topological order. However, when perturbations introduce only weak symmetry defects, the steady-state mixed-state symmetry-protected topological order remains stable. Additionally, we construct a quantum channel which replicates the essential physics of the Lindbladian and can be efficiently simulated using only Clifford gates, Pauli measurements, and feedback. Comment: 21+12 pages, 10+4 figures
Published: 2024

19. Strings and membranes from $\mathcal{A}$-theory five brane

Author: Hatsuda, Machiko, Hulík, Ondřej, Linch, William D., Siegel, Warren D., Wang, Di, and Wang, Yu-Ping
Subjects: High Energy Physics - Theory
Abstract: The $\mathcal{A}$-theory takes U-duality symmetry as a guiding principle, with the SL(5) U-duality symmetry being described as the world-volume theory of a 5-brane. Furthermore, by unifying the 6-dimensional world-volume Lorentz symmetry with the SL(5) spacetime symmetry, it extends to SL(6) U-duality symmetry. The SL(5) spacetime vielbein fields and the 5-brane world-volume vielbein fields are mixed under the SL(6) U-duality transformation. We demonstrate that consistent sectionings of the SL(6) $\mathcal{A}$5-brane world-volume Lagrangian yield Lagrangians of the $\mathcal{T}$-string with O(D,D) T-duality symmetry, the conventional string, the $\mathcal{M}$5-brane with GL(4) duality symmetry, and the non-perturbative M2-brane in supergravity theory. The GL(4) covariant Lagrangian of the $\mathcal{M}$5-brane derived in this manner is a new, perturbatively quantizable theory. Comment: 41 pages
Published: 2024

20. SceneCraft: Layout-Guided 3D Scene Generation

Author: Yang, Xiuyu, Man, Yunze, Chen, Jun-Kun, and Wang, Yu-Xiong
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The creation of complex 3D scenes tailored to user specifications has been a tedious and challenging task with traditional 3D modeling tools. Although some pioneering methods have achieved automatic text-to-3D generation, they are generally limited to small-scale scenes with restricted control over the shape and texture. We introduce SceneCraft, a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences provided by users. Central to our method is a rendering-based technique, which converts 3D semantic layouts into multi-view 2D proxy maps. Furthermore, we design a semantic and depth conditioned diffusion model to generate multi-view images, which are used to learn a neural radiance field (NeRF) as the final scene representation. Without the constraints of panorama image generation, we surpass previous methods in supporting complicated indoor space generation beyond a single room, even as complicated as a whole multi-bedroom apartment with irregular shapes and layouts. Through experimental analysis, we demonstrate that our method significantly outperforms existing approaches in complex indoor scene generation with diverse textures, consistent geometry, and realistic visual quality. Code and more results are available at: https://orangesodahub.github.io/SceneCraft. Comment: NeurIPS 2024. Code: https://github.com/OrangeSodahub/SceneCraft. Project page: https://orangesodahub.github.io/SceneCraft
Published: 2024

21. Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective

Author: Ni, Bo, Wang, Yu, Cheng, Lu, Blasch, Erik, and Derr, Tyler
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Recently, Knowledge Graphs (KGs) have been successfully coupled with Large Language Models (LLMs) to mitigate their hallucinations and enhance their reasoning capability, such as in KG-based retrieval-augmented frameworks. However, current KG-LLM frameworks lack rigorous uncertainty estimation, limiting their reliable deployment in high-stakes applications. Directly incorporating uncertainty quantification into KG-LLM frameworks presents challenges due to their complex architectures and the intricate interactions between the knowledge graph and language model components. To address this gap, we propose a new trustworthy KG-LLM framework, Uncertainty Aware Knowledge-Graph Reasoning (UAG), which incorporates uncertainty quantification into the KG-LLM framework. We design an uncertainty-aware multi-step reasoning framework that leverages conformal prediction to provide a theoretical guarantee on the prediction set. To manage the error rate of the multi-step process, we additionally introduce an error rate control module to adjust the error rate within the individual components. Extensive experiments show that our proposed UAG can achieve any pre-defined coverage rate while reducing the prediction set/interval size by 40% on average over the baselines.
Published: 2024

22. Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping

Author: Yang, Yue, Zhang, Shuibai, Shao, Wenqi, Zhang, Kaipeng, Bin, Yi, Wang, Yu, and Luo, Ping
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across multimodal tasks such as visual perception and reasoning, leading to good performance on various multimodal evaluation benchmarks. However, these benchmarks keep a static nature and overlap with the pre-training data, resulting in fixed complexity constraints and data contamination issues. This raises the concern regarding the validity of the evaluation. To address these two challenges, we introduce a dynamic multimodal evaluation protocol called Vision-Language Bootstrapping (VLB). VLB provides a robust and comprehensive assessment for LVLMs with reduced data contamination and flexible complexity. To this end, VLB dynamically generates new visual question-answering samples through a multimodal bootstrapping module that modifies both images and language, while ensuring that newly generated samples remain consistent with the original ones by a judge module. By composing various bootstrapping strategies, VLB offers dynamic variants of existing benchmarks with diverse complexities, enabling the evaluation to co-evolve with the ever-evolving capabilities of LVLMs. Extensive experimental results across multiple benchmarks, including SEEDBench, MMBench, and MME, show that VLB significantly reduces data contamination and exposes performance limitations of LVLMs.
Published: 2024

23. Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision

Author: Cao, Shengcao, Gui, Liang-Yan, and Wang, Yu-Xiong
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Current large multimodal models (LMMs) face challenges in grounding, which requires the model to relate language components to visual entities. Contrary to the common practice that fine-tunes LMMs with additional grounding supervision, we find that the grounding ability can in fact emerge in LMMs trained without explicit grounding supervision. To reveal this emerging grounding, we introduce an "attend-and-segment" method which leverages attention maps from standard LMMs to perform pixel-level segmentation. Furthermore, to enhance the grounding ability, we propose DIFFLMM, an LMM utilizing a diffusion-based visual encoder, as opposed to the standard CLIP visual encoder, and trained with the same weak supervision. Without being constrained by the biases and limited scale of grounding-specific supervision data, our approach is more generalizable and scalable. We achieve competitive performance on both grounding-specific and general visual question answering benchmarks, compared with grounding LMMs and generalist LMMs, respectively. Notably, we achieve a 44.2 grounding mask recall on grounded conversation generation without any grounding supervision, outperforming the extensively supervised model GLaMM. Project page: https://groundLMM.github.io.
Published: 2024

24. Multi-messenger signatures of a deformed magnetar in gamma-ray bursts

Author: Hashemi, Parisa, Shakeri, Soroush, Wang, Yu, Li, Liang, and Moradi, Rahim
Subjects: Astrophysics - High Energy Astrophysical Phenomena, Astrophysics - Cosmology and Nongalactic Astrophysics, High Energy Physics - Phenomenology
Abstract: We study the evolution of a newly formed magnetized neutron star (NS) as a power source of gamma-ray bursts (GRBs) in the light of both gravitational wave (GW) and electromagnetic (EM) radiation. Compressible and incompressible fluids are employed to model the secular evolution of Maclaurin spheroids. It is shown that the GW and EM light curves evolve as a function of eccentricity and rotational frequency with time. We find that the light curve characteristics crucially depend on NS parameters such as the magnitude and structure of the magnetic field, the ellipticity, and the equation of state (EOS) of the fluid. The presence of X-ray flares, whose origins are not yet well understood, can be captured in our model for some specific nuclear EOSs. Our model allows us to explain flares that occur within the wide range of $10$ to $10^4$ seconds with peak luminosities on the order of $10^{46}$-$10^{51}$ $\rm erg/s$, using a reasonable set of parameters such as a magnetic field strength around $10^{14}-10^{16}$ Gauss and a quadrupole-to-dipole ratio of the magnetic field up to 500. By applying our model to a sample of GRB X-ray flares observed by Swift/XRT, we try to constrain the crucial parameters of a deformed magnetar via the MCMC fitting method. Our analysis shows that ongoing and upcoming joint multi-messenger detections can be used to understand the nature of the GRB central engine and its evolution at the early times of the burst formation.
Published: 2024

25. InstructG2I: Synthesizing Images from Multimodal Attributed Graphs

Author: Jin, Bowen, Pang, Ziqi, Guo, Bingjun, Wang, Yu-Xiong, You, Jiaxuan, and Han, Jiawei
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: In this paper, we approach an overlooked yet critical task Graph2Image: generating images from multimodal attributed graphs (MMAGs). This task poses significant challenges due to the explosion in graph size, dependencies among graph entities, and the need for controllability in graph conditions. To address these challenges, we propose a graph context-conditioned diffusion model called InstructG2I. InstructG2I first exploits the graph structure and multimodal information to conduct informative neighbor sampling by combining personalized page rank and re-ranking based on vision-language features. Then, a Graph-QFormer encoder adaptively encodes the graph nodes into an auxiliary set of graph prompts to guide the denoising process of diffusion. Finally, we propose graph classifier-free guidance, enabling controllable generation by varying the strength of graph guidance and multiple connected edges to a node. Extensive experiments conducted on three datasets from different domains demonstrate the effectiveness and controllability of our approach. The code is available at https://github.com/PeterGriffinJin/InstructG2I., Comment: 16 pages
Published: 2024

26. 3D UAV Trajectory Planning for IoT Data Collection via Matrix-Based Evolutionary Computation

Author: Sun, Pei-Fa, Song, Yujae, Gao, Kang-Yu, Wang, Yu-Kai, Zhou, Changjun, Jeon, Sang-Woon, and Zhang, Jun
Subjects: Computer Science - Neural and Evolutionary Computing
Abstract: UAVs are increasingly becoming vital tools in various wireless communication applications including internet of things (IoT) and sensor networks, thanks to their rapid and agile non-terrestrial mobility. Despite recent research, planning three-dimensional (3D) UAV trajectories over a continuous temporal-spatial domain remains challenging due to the need to solve computationally intensive optimization problems. In this paper, we study UAV-assisted IoT data collection aimed at minimizing total energy consumption while accounting for the UAV's physical capabilities, the heterogeneous data demands of IoT nodes, and 3D terrain. We propose a matrix-based differential evolution with constraint handling (MDE-CH), a computation-efficient evolutionary algorithm designed to address non-convex constrained optimization problems with several different types of constraints. Numerical evaluations demonstrate that the proposed MDE-CH algorithm provides a continuous 3D temporal-spatial UAV trajectory capable of efficiently minimizing energy consumption under various practical constraints and outperforms the conventional fly-hover-fly model for both two-dimensional (2D) and 3D trajectory planning.
Published: 2024

27. Reconstruction of Particle Flow Energy Distribution Using Deep Learning Algorithms

Author: Zhang, Han, Lin, Shengxiang, Zhang, Xingyi, Wang, Yu, and Zhang, Yangguang
Subjects: Physics - Instrumentation and Detectors, Computer Science - Artificial Intelligence
Abstract: In high-energy particle physics, extracting information from complex detector signals is crucial for energy reconstruction. Recent advancements involve using deep learning to process calorimeter images from various sub-detectors in experiments like the Large Hadron Collider (LHC) for energy map reconstruction. This paper compares classical algorithms (MLP, CNN, U-Net, and RNN) with variants that include self-attention and 3D convolution modules to evaluate their effectiveness in reconstructing the initial energy distribution. Additionally, a test dataset of jet events is utilized to analyze and compare models' performance in handling anomalous high-energy events. The analysis highlights the effectiveness of deep learning techniques for energy image reconstruction and explores their potential in this area., Comment: 11 pages, 1 table, 9 figures. Code available at https://github.com/Image-processing-Particle-flow/Project1
Published: 2024

28. Supermassive primordial black holes for the GHZ9 and UHZ1 observed by the JWST

Author: Huang, Hai-Long, Wang, Yu-Tong, and Piao, Yun-Song
Subjects: Astrophysics - Astrophysics of Galaxies, Astrophysics - Cosmology and Nongalactic Astrophysics, General Relativity and Quantum Cosmology
Abstract: The high-redshift ($z>10$) galaxies GHZ9 and UHZ1 observed by the James Webb Space Telescope (JWST) are very massive and have exceptionally high black hole-to-star mass ratios with the central black hole masses $M\gtrsim 10^7\rm~M_\odot$. In this paper, we explore the possibility that they are seeded by the supermassive primordial black holes (SMPBHs), which came into being in the very early universe, with initial masses $\sim 10^7\rm~M_\odot$. We present the self-similar accretion solutions for SMPBHs, and find that the mass growth of SMPBHs during the pregalactic era may be negligible. These SMPBHs, when the redshift $z\lesssim 20$, can accelerate seeding high-redshift galaxies and their baryonic content, and consequently explain the central supermassive black holes (SMBHs) of high-redshift massive galaxies through sub-Eddington accretion. According to our results, SMPBHs could actually lead to the existence of more massive SMBHs at higher redshifts compared to other SMBH seed scenarios. In particular, SMBHs with masses $M\gtrsim 10^7~\rm M_\odot$ at $z>20$ might only originate from SMPBHs, so the corresponding observation can serve as a potential probe of PBHs., Comment: 14 pages, 5 figures
Published: 2024

29. Exponential entanglement advantage in sensing correlated noise

Author: Wang, Yu-Xin, Bringewatt, Jacob, Seif, Alireza, Brady, Anthony J., Oh, Changhun, and Gorshkov, Alexey V.
Subjects: Quantum Physics
Abstract: In this work, we propose a new form of exponential quantum advantage in the context of sensing correlated noise. Specifically, we focus on the problem of estimating parameters associated with Lindblad dephasing dynamics, and show that entanglement can lead to an exponential enhancement in the sensitivity (as quantified via quantum Fisher information of the sensor state) for estimating a small parameter characterizing the deviation of system Lindbladians from a class of maximally correlated dephasing dynamics. This result stands in stark contrast with previously studied scenarios of sensing uncorrelated dephasing noise, where one can prove that entanglement does not lead to an advantage in the signal-to-noise ratio. Our work thus opens a novel pathway towards achieving entanglement-based sensing advantage, which may find applications in characterizing decoherence dynamics of near-term quantum devices. Further, our approach provides a potential quantum-enhanced probe of many-body correlated phases by measuring noise generated by a sensing target. We also discuss realization of our protocol using near-term quantum hardware., Comment: 7+2 pages, 1 figure
Published: 2024

30. Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective

Author: Li, Jinhao, Xu, Jiaming, Huang, Shan, Chen, Yonghua, Li, Wen, Liu, Jun, Lian, Yaoxiu, Pan, Jiayi, Ding, Li, Zhou, Hao, Wang, Yu, and Dai, Guohao
Subjects: Computer Science - Hardware Architecture, Computer Science - Machine Learning
Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across various fields, from natural language understanding to text generation. Compared to non-generative LLMs like BERT and DeBERTa, generative LLMs like GPT series and Llama series are currently the main focus due to their superior algorithmic performance. The advancements in generative LLMs are closely intertwined with the development of hardware capabilities. Various hardware platforms exhibit distinct hardware characteristics, which can help improve LLM inference performance. Therefore, this paper comprehensively surveys efficient generative LLM inference on different hardware platforms. First, we provide an overview of the algorithm architecture of mainstream generative LLMs and delve into the inference process. Then, we summarize different optimization methods for different platforms such as CPU, GPU, FPGA, ASIC, and PIM/NDP, and provide inference results for generative LLMs. Furthermore, we perform a qualitative and quantitative comparison of inference performance with batch sizes 1 and 8 on different hardware platforms by considering hardware power consumption, absolute inference speed (tokens/s), and energy efficiency (tokens/J). We compare the performance of the same optimization methods across different hardware platforms, the performance across different hardware platforms, and the performance of different methods on the same hardware platform. This provides a systematic and comprehensive summary of existing inference acceleration work by integrating software optimization methods and hardware platforms, which can point to the future trends and potential developments of generative LLMs and hardware technology for edge-side scenarios., Comment: 43 pages, 15 figures
Published: 2024

31. Efficiently Identifying Watermarked Segments in Mixed-Source Texts

Author: Zhao, Xuandong, Liao, Chenwen, Wang, Yu-Xiang, and Li, Lei
Subjects: Computer Science - Computation and Language
Abstract: Text watermarks in large language models (LLMs) are increasingly used to detect synthetic text, mitigating misuse cases like fake news and academic dishonesty. While existing watermarking detection techniques primarily focus on classifying entire documents as watermarked or not, they often neglect the common scenario of identifying individual watermark segments within longer, mixed-source documents. Drawing inspiration from plagiarism detection systems, we propose two novel methods for partial watermark detection. First, we develop a geometry cover detection framework aimed at determining whether there is a watermark segment in long text. Second, we introduce an adaptive online learning algorithm to pinpoint the precise location of watermark segments within the text. Evaluated on three popular watermarking techniques (KGW-Watermark, Unigram-Watermark, and Gumbel-Watermark), our approach achieves high accuracy, significantly outperforming baseline methods. Moreover, our framework is adaptable to other watermarking techniques, offering new insights for precise watermark detection.
Published: 2024

32. Beyond the Phase Ordering Problem: Finding the Globally Optimal Code w.r.t. Optimization Phases

Author: Wang, Yu, Chen, Hongyu, and Wang, Ke
Subjects: Computer Science - Programming Languages
Abstract: In this paper, we propose a new concept called semantically equivalence w.r.t. optimization phases (\sep), which defines the set of programs a compiler considers semantically equivalent to the input using a set of optimization phases. We show both theoretically and empirically that solving the phase ordering problem does not necessarily result in the most efficient code among all programs that a compiler deems semantically equivalent to the input, hereinafter referred to as the global optimal code w.r.t. optimization phases. To find the global optimal code w.r.t. optimization phases, we present a conceptual framework, leveraging the reverse of existing optimization phases. In theory, we prove that the framework is capable of finding the global optimal code for any program. We realize this framework into a technique, called iterative bi-directional optimization (\tool), which performs both the normal and reverse optimizations to increase and decrease the efficiency of the generated code, respectively. We evaluate \tool on C/C++ files randomly extracted from highly mature and influential programs (e.g., Linux kernel, OpenSSL, Z3). Results show that \tool frequently generates more efficient code -- measured by either code size or runtime performance -- than exhaustive search, which is the solution to the phase ordering problem. We also find that by simply incorporating \tool's reverse optimization phases, the effectiveness of the optimization of state-of-the-art compilers (e.g., GCC/LLVM) can be significantly improved.
Published: 2024

33. Complexity factor for a static self-gravitating sphere in Rastall-Rainbow gravity

Author: Ye, Zhou-Li, Wang, Yu, Yang, Rui-Xin, and Liu, Dao-Jun
Subjects: General Relativity and Quantum Cosmology
Abstract: We generalized Herrera's definition of complexity factor for static spherically symmetric fluid distributions to Rastall-Rainbow theory of gravity. For this purpose, an energy-dependent equation of motion is employed in accordance with the principle of gravity's rainbow. It is found that the complexity factor appears in the orthogonal splitting of the Riemann curvature tensor, and measures the deviation of the value of the active gravitational mass from the simplest system under the combined corrections of Rastall and rainbow. In the low-energy limit, all the results we have obtained reduce to the counterparts of general relativity when the non-conserved parameter is taken to be one. We also demonstrate how to build an anisotropic or isotropic star model using complexity approach. In particular, the vanishing complexity factor condition in Rastall-Rainbow gravity is exactly the same as that derived in general relativity. This fact may imply a deeper geometric foundation for the complexity factor., Comment: 10 two-column pages, 1 figure; accepted by Physics of the Dark Universe
Published: 2024

34. Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Author: Teng, Yao, Shi, Han, Liu, Xian, Ning, Xuefei, Dai, Guohao, Wang, Yu, Li, Zhenguo, and Liu, Xihui
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The current large auto-regressive models can generate high-quality, high-resolution images, but these models require hundreds or even thousands of steps of next-token prediction during inference, resulting in substantial time consumption. In existing studies, Jacobi decoding, an iterative parallel decoding algorithm, has been used to accelerate the auto-regressive generation and can be executed without training. However, the Jacobi decoding relies on a deterministic criterion to determine the convergence of iterations. Thus, it works for greedy decoding but is incompatible with sampling-based decoding which is crucial for visual quality and diversity in the current auto-regressive text-to-image generation. In this paper, we propose a training-free probabilistic parallel decoding algorithm, Speculative Jacobi Decoding (SJD), to accelerate auto-regressive text-to-image generation. By introducing a probabilistic convergence criterion, our SJD accelerates the inference of auto-regressive text-to-image generation while maintaining the randomness in sampling-based token decoding and allowing the model to generate diverse images. Specifically, SJD facilitates the model to predict multiple tokens at each step and accepts tokens based on the probabilistic criterion, enabling the model to generate images with fewer steps than the conventional next-token-prediction paradigm. We also investigate the token initialization strategies that leverage the spatial locality of visual data to further improve the acceleration ratio under specific scenarios. We conduct experiments for our proposed SJD on multiple auto-regressive text-to-image generation models, showing the effectiveness of model acceleration without sacrificing the visual quality.
Published: 2024

35. Topological one-way Weyl fiber

Author: Lin, Hao, Wang, Yu, Ji, Zitao, Zheng, Yidong, Chen, Jianfeng, and Li, Zhi-Yuan
Subjects: Physics - Optics, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: Topological photonics enables unprecedented photon manipulation by realizing various topological states, such as corner states, edge states, and surface states. However, achieving a topological fiber state has remained elusive. Here, we demonstrate a topological fiber state in a Weyl gyromagnetic photonic crystal fiber. By applying an in-plane magnetic bias to a gyromagnetic photonic crystal fiber with broken parity-inversion symmetry, we create an asymmetrical Weyl bandgap that supports one-way fiber states associated with type-II Weyl points. Dispersion and topological invariant calculations reveal the transition from Weyl surface states to one-way Weyl fiber states. Electromagnetic field simulations confirm the existence of one-way Weyl fiber states and their robust transport in the presence of a metallic obstacle along the transport path. Our findings offer an intriguing pathway for exploring novel topological states and guiding the design of topological fibers., Comment: 19 pages, 7 figures
Published: 2024

36. Atmospheric Pressure Ammonia Synthesis on AuRu Catalysts Enabled by Plasmon-Controlled Hydrogenation and Nitrogen-species Desorption

Author: Yuan, Lin, Bourgeois, Briley B., Begin, Elijah, Zhang, Yirui, Dai, Alan X., Cheng, Zhihua, McKeown-Green, Amy S., Xue, Zhichen, Cui, Yi, Xu, Kun, Wang, Yu, Jones, Matthew R., Majumdar, Arun, Bao, Junwei Lucas, and Dionne, Jennifer A.
Subjects: Physics - Chemical Physics
Abstract: Ammonia is a key component of fertilizer and a potential clean fuel and hydrogen carrier. The Haber-Bosch process for ammonia synthesis consumes more than half of industrial hydrogen and contributes up to ~3% of global greenhouse gas emissions. Light-driven reactions via surface plasmon resonances offer a less energy-intensive pathway for ammonia production by altering reaction intermediates. Here, we report gold-ruthenium plasmonic bimetallic alloys for ammonia synthesis at room temperature and pressure, driven by visible light. We use colloidal synthesis to create AuRu$_x$ alloys (x=0.1, 0.2, 0.3) and disperse these nanoparticles on MgO supports for gas-phase ammonia synthesis. We observe a ~60 $\mu$mol/g/h reactivity and ~0.12% external quantum efficiency on an AuRu$_{0.2}$ sample under 100 mW/cm$^2$ visible light. In-situ diffuse reflective infrared Fourier transform spectroscopic measurements show that hydrogenation of nitrogen adsorbates is accelerated under light compared to thermocatalysis. Combining wavelength-dependent reactivity and spectroscopic findings with semi-classical electromagnetic modeling, we show plasmonic bimetallic alloys expedite ammonia synthesis by aiding hydrogenation of adsorbed nitrogen species via plasmon-mediated hot electrons. Quantum mechanical calculations reveal hydrogen-assisted N$_2$ splitting in the excited state is key to activating the reaction under ambient conditions. Therefore, light or H$_2$ alone cannot dissociate N$_2$ -- the key bottleneck to breaking N$_2$'s triple bond. Our findings are consistent with recent hypotheses on how nitrogenase enzymes catalyze ammonia production at mild conditions and provide insights for sustainable photochemical transformations., Comment: 21 pages, 4 figures, journal article submission soon
Published: 2024

37. Self-Updatable Large Language Models with Parameter Integration

Author: Wang, Yu, Liu, Xinshuang, Chen, Xiusi, O'Brien, Sean, Wu, Junda, and McAuley, Julian
Subjects: Computer Science - Computation and Language
Abstract: Despite significant advancements in large language models (LLMs), the rapid and frequent integration of small-scale experiences, such as interactions with surrounding objects, remains a substantial challenge. Two critical factors in assimilating these experiences are (1) Efficacy: the ability to accurately remember recent events; (2) Retention: the capacity to recall long-past experiences. Current methods either embed experiences within model parameters using continual learning, model editing, or knowledge distillation techniques, which often struggle with rapid updates and complex interactions, or rely on external storage to achieve long-term retention, thereby increasing storage requirements. In this paper, we propose SELF-PARAM (Self-Updatable Large Language Models with Parameter Integration). SELF-PARAM requires no extra parameters while ensuring near-optimal efficacy and long-term retention. Our method employs a training objective that minimizes the Kullback-Leibler (KL) divergence between the predictions of an original model (with access to contextual information) and a target model (without such access). By generating diverse question-answer pairs related to the knowledge and minimizing the KL divergence across this dataset, we update the target model to internalize the knowledge seamlessly within its parameters. Evaluations on question-answering and conversational recommendation tasks demonstrate that SELF-PARAM significantly outperforms existing methods, even when accounting for non-zero storage requirements. This advancement paves the way for more efficient and scalable integration of experiences in large language models by embedding knowledge directly into model parameters.
Published: 2024

38. Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Author: Lu, Ke-Han, Chen, Zhehuai, Fu, Szu-Wei, Yang, Chao-Han Huck, Balam, Jagadeesh, Ginsburg, Boris, Wang, Yu-Chiang Frank, and Lee, Hung-yi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language, Computer Science - Sound
Abstract: Recent end-to-end speech language models (SLMs) have expanded upon the capabilities of large language models (LLMs) by incorporating pre-trained speech models. However, these SLMs often undergo extensive speech instruction-tuning to bridge the gap between speech and text modalities. This requires significant annotation efforts and risks catastrophic forgetting of the original language capabilities. In this work, we present a simple yet effective automatic process for creating speech-text pair data that carefully injects speech paralinguistic understanding abilities into SLMs while preserving the inherent language capabilities of the text-based LLM. Our model demonstrates general capabilities for speech-related tasks without the need for speech instruction-tuning data, achieving impressive performance on Dynamic-SUPERB and AIR-Bench-Chat benchmarks. Furthermore, our model exhibits the ability to follow complex instructions derived from LLMs, such as specific output formatting and chain-of-thought reasoning. Our approach not only enhances the versatility and effectiveness of SLMs but also reduces reliance on extensive annotated datasets, paving the way for more efficient and capable speech understanding systems., Comment: Submitted to ICASSP 2025
Published: 2024

39. Multi-Designated Detector Watermarking for Language Models

Author: Huang, Zhengan, Zeng, Gongxian, Mu, Xin, Wang, Yu, and Yu, Yue
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence
Abstract: In this paper, we initiate the study of multi-designated detector watermarking (MDDW) for large language models (LLMs). This technique allows model providers to generate watermarked outputs from LLMs with two key properties: (i) only specific, possibly multiple, designated detectors can identify the watermarks, and (ii) there is no perceptible degradation in the output quality for ordinary users. We formalize the security definitions for MDDW and present a framework for constructing MDDW for any LLM using multi-designated verifier signatures (MDVS). Recognizing the significant economic value of LLM outputs, we introduce claimability as an optional security feature for MDDW, enabling model providers to assert ownership of LLM outputs within designated-detector settings. To support claimable MDDW, we propose a generic transformation converting any MDVS to a claimable MDVS. Our implementation of the MDDW scheme highlights its advanced functionalities and flexibility over existing methods, with satisfactory performance metrics.
Published: 2024

40. Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective

Author: Wang, Yu, Yin, Yuxuan, and Li, Peng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes, leading to errors in predicted pseudo labels that accumulate under a self-training paradigm. Unlike supervised settings, which benefit from a rich, static data distribution, SSL inherently lacks mechanisms to correct this self-reinforced bias, necessitating debiased interventions at each training step. Although the generation of debiased pseudo labels has been extensively studied, their effective utilization remains underexplored. Our analysis indicates that data from biased classes should have a reduced influence on parameter updates, while more attention should be given to underrepresented classes. To address these challenges, we introduce TaMatch, a unified framework for debiased training in SSL. TaMatch employs a scaling ratio derived from both a prior target distribution and the model's learning status to estimate and correct bias at each training step. This ratio adjusts the raw predictions on unlabeled data to produce debiased pseudo labels. In the utilization phase, these labels are differently weighted according to their predicted class, enhancing training equity and minimizing class bias. Additionally, TaMatch dynamically adjusts the target distribution in response to the model's learning progress, facilitating robust handling of practical scenarios where the prior distribution is unknown. Empirical evaluations show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks, highlighting the critical importance of both the debiased generation and utilization of pseudo labels in SSL., Comment: 11 pages, 4 figures
Published: 2024

41. Behavior evolution-inspired approach to walking gait reinforcement training for quadruped robots

Author: Wang, Yu, Jia, Wenchuan, Sun, Yi, and He, Dong
Subjects: Computer Science - Robotics
Abstract: Reinforcement learning is extremely competitive among gait generation techniques for quadrupedal robots, mainly because stochastic exploration in reinforcement training is beneficial for achieving an autonomous gait. Nevertheless, although incremental reinforcement learning is employed to improve training success and movement smoothness by relying on the continuity inherent in limb movements, challenges remain in adapting the gait policy to diverse terrain and external disturbances. Inspired by the association between reinforcement learning and the evolution of animal motion behavior, a self-improvement mechanism for the reference gait is introduced in this paper, so that incremental learning of actions and self-improvement of the reference action proceed together to imitate the evolution of animal motion behavior. Further, a new framework for reinforcement training of quadruped gait is proposed. In this framework, a genetic algorithm is specifically adopted to perform a global probabilistic search for the initial value of the arbitrary foot trajectory to update the reference trajectory with better fitness. Subsequently, the improved reference gait is used for incremental reinforcement learning of gait. The above process is executed repeatedly and alternately to finally train the gait policy. The analysis considering terrain, model dimensions, and locomotion condition is presented in detail based on simulation, and the results show that the framework is significantly more adaptive to terrain compared to regular incremental reinforcement learning.
Published: 2024

42. Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning

Author: Chen, Jiayu, Yu, Chao, Li, Guosheng, Tang, Wenhao, Yang, Xinyi, Xu, Botian, Yang, Huazhong, and Wang, Yu
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: Multi-UAV pursuit-evasion, where pursuers aim to capture evaders, poses a key challenge for UAV swarm intelligence. Multi-agent reinforcement learning (MARL) has demonstrated potential in modeling cooperative behaviors, but most RL-based approaches remain constrained to simplified simulations with limited dynamics or fixed scenarios. Previous attempts to deploy RL policy to real-world pursuit-evasion are largely restricted to two-dimensional scenarios, such as ground vehicles or UAVs at fixed altitudes. In this paper, we address multi-UAV pursuit-evasion by considering UAV dynamics and physical constraints. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. Additionally, we propose an adaptive environment generator within MARL training, enabling higher exploration efficiency and better policy generalization across diverse scenarios. Simulations show our method significantly outperforms all baselines in challenging scenarios, generalizing to unseen scenarios with a 100% capture rate. Finally, we derive a feasible policy via a two-stage reward refinement and deploy the policy on real quadrotors in a zero-shot manner. To our knowledge, this is the first work to derive and deploy an RL-based policy using collective thrust and body rates control commands for multi-UAV pursuit-evasion in unknown environments. The open-source code and videos are available at https://sites.google.com/view/pursuit-evasion-rl.
Published: 2024

43. Partial disruption of a planet around a white dwarf: the effect of perturbation from the remnant planet on the accretion

Author: Kurban, Abdusattar, Zhou, Xia, Wang, Na, Huang, Yong-Feng, Wang, Yu-Bin, and Nurmamat, Nurimangul
Subjects: Astrophysics - Earth and Planetary Astrophysics, Astrophysics - Solar and Stellar Astrophysics
Abstract: About 25%-50% of white dwarfs (WDs) are found to be polluted by heavy elements. It has been argued that the pollution could be caused by the tidal disruption of an approaching planet around the WD, during which a large number of clumps would be produced and would finally fall onto the WD. The reason that the planet approaches the WD is usually believed to be gravitational perturbations from another distant planet or stellar companion. However, the dynamics of the perturbation and the detailed partial disruption process are still poorly understood. In this study, we present an in-depth investigation of these issues. A triple system composed of a WD, an inner orbit planet, and an outer orbit planet is considered. The inner planet would be partially disrupted periodically in the long-term evolution. Fragments generated in the process are affected by the gravitational perturbations from the remnant planet, facilitating their falling toward the WD. The mass loss rate of the inner planet depends on both its internal structure and the orbital configuration of the planetary system., Comment: 14 pages, 6 figures, 4 tables, Accepted for publication in The Astrophysical Journal
Published: 2024
Full Text: View/download PDF

44. The sparseness of g-convex functions

Author: Wang, Yu and Ye, Ke
Subjects: Mathematics - Differential Geometry, Mathematics - Optimization and Control
Abstract: The g-convexity of functions on manifolds is a generalization of the convexity of functions on $\mathbb{R}^n$. It plays an essential role in both differential geometry and non-convex optimization theory. This paper is concerned with g-convex smooth functions on manifolds. We establish criteria for the existence of a Riemannian metric (or connection) with respect to which a given function is g-convex. Using these criteria, we obtain three sparseness results for g-convex functions: (1) The set of g-convex functions on a compact manifold is nowhere dense in the space of smooth functions. (2) Most polynomials on $\mathbb{R}^n$ that are g-convex with respect to some geodesically complete connection have at most one critical point. (3) The density of g-convex univariate (resp. quadratic, monomial, additively separable) polynomials asymptotically decreases to zero.
Published: 2024

45. Anisotropic Diffusion Probabilistic Model for Imbalanced Image Classification

Author: Kong, Jingyu, Guo, Yuan, Wang, Yu, and Duan, Yuping
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Real-world data often has a long-tailed distribution, where the scarcity of tail samples significantly limits the model's generalization ability. Denoising Diffusion Probabilistic Models (DDPM) are generative models based on stochastic differential equation theory and have demonstrated impressive performance in image classification tasks. However, existing diffusion probabilistic models do not perform satisfactorily in classifying tail classes. In this work, we propose the Anisotropic Diffusion Probabilistic Model (ADPM) for imbalanced image classification problems. We utilize the data distribution to control the diffusion speed of different class samples during the forward process, effectively improving the classification accuracy of the denoiser in the reverse process. Specifically, we provide a theoretical strategy for selecting noise levels for different categories in the diffusion process based on error analysis theory to address the imbalanced classification problem. Furthermore, we integrate global and local image priors in the forward process to enhance the model's discriminative ability in the spatial dimension, while incorporating semantic-level contextual information in the reverse process to boost the model's discriminative power and robustness. Through comparisons with state-of-the-art methods on four medical benchmark datasets, we validate the effectiveness of the proposed method in handling long-tail data. Our results confirm that the anisotropic diffusion model significantly improves the classification accuracy of rare classes while maintaining the accuracy of head classes. On the skin lesion datasets, PAD-UFES and HAM10000, the F1-scores of our method improved by 4% and 3%, respectively, compared to the original diffusion probabilistic model.
Published: 2024

46. Towards LifeSpan Cognitive Systems

Author: Wang, Yu, Han, Chi, Wu, Tongtong, He, Xiaoxin, Zhou, Wangchunshu, Sadeq, Nafis, Chen, Xiusi, He, Zexue, Wang, Wei, Haffari, Gholamreza, Ji, Heng, and McAuley, Julian
Subjects: Computer Science - Computation and Language
Abstract: Building a human-like system that continuously interacts with complex environments -- whether simulated digital worlds or human society -- presents several key challenges. Central to this is enabling continuous, high-frequency interactions, where the interactions are termed experiences. We refer to this envisioned system as the LifeSpan Cognitive System (LSCS). A critical feature of LSCS is its ability to engage in incremental and rapid updates while retaining and accurately recalling past experiences. We identify two major challenges in achieving this: (1) Abstraction and Experience Merging, and (2) Long-term Retention with Accurate Recall. These properties are essential for storing new experiences, organizing past experiences, and responding to the environment in ways that leverage relevant historical data. Unlike language models with continual learning, which typically rely on large corpora for fine-tuning and focus on improving performance within specific domains or tasks, LSCS must rapidly and incrementally update with new information from its environment at a high frequency. Existing technologies with the potential of solving the above two major challenges can be classified into four classes based on a conceptual metric called Storage Complexity, which measures the relative space required to store past experiences. Each of these four classes of technologies has its own strengths and limitations. Given that none of the existing technologies can achieve LSCS alone, we propose a novel paradigm for LSCS that integrates all four classes of technologies. The new paradigm operates through two core processes: Absorbing Experiences and Generating Responses.
Published: 2024

47. Human-Robot Cooperative Distribution Coupling for Hamiltonian-Constrained Social Navigation

Author: Wang, Weizheng, Yu, Chao, Wang, Yu, and Min, Byung-Cheol
Subjects: Computer Science - Robotics
Abstract: Navigating in human-filled public spaces is a critical challenge for deploying autonomous robots in real-world environments. This paper introduces NaviDIFF, a novel Hamiltonian-constrained socially-aware navigation framework designed to address the complexities of human-robot interaction and socially-aware path planning. NaviDIFF integrates a port-Hamiltonian framework to model dynamic physical interactions and a diffusion model to manage uncertainty in human-robot cooperation. The framework leverages a spatial-temporal transformer to capture social and temporal dependencies, enabling more accurate pedestrian strategy predictions and port-Hamiltonian dynamics construction. Additionally, reinforcement learning from human feedback is employed to fine-tune robot policies, ensuring adaptation to human preferences and social norms. Extensive experiments demonstrate that NaviDIFF outperforms state-of-the-art methods in social navigation tasks, offering improved stability, efficiency, and adaptability.
Published: 2024

48. Efficient Entanglement Routing for Satellite-Aerial-Terrestrial Quantum Networks

Author: Zhang, Yu, Gong, Yanmin, Fan, Lei, Wang, Yu, Han, Zhu, and Guo, Yuanxiong
Subjects: Quantum Physics, Computer Science - Networking and Internet Architecture
Abstract: In the era of 6G and beyond, space-aerial-terrestrial quantum networks (SATQNs) are shaping the future of the global-scale quantum Internet. This paper investigates the collaboration among satellite, aerial, and terrestrial quantum networks to efficiently transmit high-fidelity quantum entanglements over long distances. We begin with a comprehensive overview of existing satellite-, aerial-, and terrestrial-based quantum networks. Subsequently, we address the entanglement routing problem with the objective of maximizing quantum network throughput by jointly optimizing path selection and entanglement generation rates (PS-EGR). Given that the original problem is formulated as a mixed-integer linear programming (MILP) problem, which is inherently intractable, we propose a Benders' decomposition (BD)-based algorithm to solve the problem efficiently. Numerical results validate the effectiveness of the proposed PS-EGR scheme, offering valuable insights into various optimizable factors within the system. Finally, we discuss the current challenges and propose promising avenues for future research in SATQNs.
Published: 2024

49. Quantum-Assisted Joint Virtual Network Function Deployment and Maximum Flow Routing for Space Information Networks

Author: Zhang, Yu, Gong, Yanmin, Fan, Lei, Wang, Yu, Han, Zhu, and Guo, Yuanxiong
Subjects: Computer Science - Networking and Internet Architecture
Abstract: Network function virtualization (NFV)-enabled space information network (SIN) has emerged as a promising method to facilitate global coverage and seamless service. This paper proposes a novel NFV-enabled SIN to provide end-to-end communication and computation services for ground users. Based on the multi-functional time expanded graph (MF-TEG), we jointly optimize the user association, virtual network function (VNF) deployment, and flow routing strategy (U-VNF-R) to maximize the total processed data received by users. The original problem is a mixed-integer linear program (MILP) that is intractable for classical computers. Inspired by quantum computing techniques, we propose a hybrid quantum-classical Benders' decomposition (HQCBD) algorithm. Specifically, we convert the master problem of the Benders' decomposition into the quadratic unconstrained binary optimization (QUBO) model and solve it with quantum computers. To further accelerate the optimization, we also design a multi-cut strategy based on the quantum advantages in parallel computing. Numerical results demonstrate the effectiveness and efficiency of the proposed algorithm and U-VNF-R scheme.
Published: 2024
Full Text: View/download PDF

50. Enhanced Krylov Methods for Molecular Hamiltonians: Reduced Memory Cost and Complexity Scaling via Tensor Hypercontraction

Author: Wang, Yu, Luo, Maxine, and Mendl, Christian B.
Subjects: Condensed Matter - Strongly Correlated Electrons, Physics - Chemical Physics, Physics - Computational Physics, Quantum Physics
Abstract: We present a matrix product operator (MPO) construction based on the tensor hypercontraction (THC) format for ab initio molecular Hamiltonians. Such an MPO construction dramatically lowers the memory requirement and cost scaling of Krylov subspace methods. These can find low-lying eigenstates while avoiding local minima and simulate quantum time evolution with high accuracy. In our approach, the molecular Hamiltonian is represented as a sum of products of four MPOs, each with a bond dimension of only $2$. Iteratively applying the MPOs to the current quantum state in matrix product state (MPS) form, summing and re-compressing the MPS leads to a scheme with the same asymptotic memory cost as the bare MPS and reduces the computational cost scaling compared to the Krylov method based on a conventional MPO construction. We provide a detailed theoretical derivation of these statements and conduct supporting numerical experiments.
Published: 2024
