Author: "Chen, Feng" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Chen, Feng"' showing total 37,757 results

1. Dimensional Changes in the Skulls of Ancient Children with Age in Xinjiang, China

Author: Li, Haijun, Zhou, Jing, Zhao, Yujie, Chen, Feng, Zhang, Hailong, Fu, Chang, Wang, Bo, and Xiao, Xiaoyong
Published: 2024

2. The Effects of the COVID-19 Pandemic on Educational Attainment

Author: Harris, Douglas N., Chen, Feng, Martin, Rylie C., Bernhardt, Ann F., Marsicano, Christopher R., and Von Hippel, Paul T.
Published: 2024

3. Feature-Space Semantic Invariance: Enhanced OOD Detection for Open-Set Domain Generalization

Author: Wang, Haoliang, Zhao, Chen, and Chen, Feng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Open-set domain generalization addresses a real-world challenge: training a model to generalize across unseen domains (domain generalization) while also detecting samples from unknown classes not encountered during training (open-set recognition). However, most existing approaches tackle these issues separately, limiting their practical applicability. To overcome this limitation, we propose a unified framework for open-set domain generalization by introducing Feature-space Semantic Invariance (FSI). FSI maintains semantic consistency across different domains within the feature space, enabling more accurate detection of OOD instances in unseen domains. Additionally, we adopt a generative model to produce synthetic data with novel domain styles or class labels, enhancing model robustness. Initial experiments show that our method improves AUROC by 9.1% to 18.9% on ColoredMNIST, while also significantly increasing in-distribution classification accuracy., Comment: IEEE BigData 2024, Ph.D. Forum
Published: 2024

4. KMM: Key Frame Mask Mamba for Extended Motion Generation

Author: Zhang, Zeyu, Gao, Hang, Liu, Akide, Chen, Qi, Chen, Feng, Wang, Yiran, Li, Danning, and Tang, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Human motion generation is a cutting-edge area of research in generative computer vision, with promising applications in video creation, game development, and robotic manipulation. The recent Mamba architecture shows promising results in efficiently modeling long and complex sequences, yet two significant challenges remain: Firstly, directly applying Mamba to extended motion generation is ineffective, as the limited capacity of the implicit memory leads to memory decay. Secondly, Mamba struggles with multimodal fusion compared to Transformers, and lacks alignment with textual queries, often confusing directions (left or right) or omitting parts of longer text queries. To address these challenges, our paper presents three key contributions: Firstly, we introduce KMM, a novel architecture featuring Key frame Masking Modeling, designed to enhance Mamba's focus on key actions in motion segments. This approach addresses the memory decay problem and represents a pioneering method in customizing strategic frame-level masking in SSMs. Additionally, we designed a contrastive learning paradigm for addressing the multimodal fusion problem in Mamba and improving the motion-text alignment. Finally, we conducted extensive experiments on the go-to dataset, BABEL, achieving state-of-the-art performance with a reduction of more than 57% in FID and 70% parameters compared to previous state-of-the-art methods. See project website: https://steve-zeyu-zhang.github.io/KMM
Published: 2024

5. ZipCache: A DRAM/SSD Cache with Built-in Transparent Compression

Author: Xie, Rui, Ma, Linsen, Zhong, Alex, Chen, Feng, and Zhang, Tong
Subjects: Computer Science - Databases
Abstract: As a core component in modern data centers, key-value cache provides high-throughput and low-latency services for high-speed data processing. The effectiveness of a key-value cache relies on its ability to accommodate the needed data. However, expanding the cache capacity is often more difficult than commonly expected because of many practical constraints, such as server costs, cooling issues, rack space, and even human resource expenses. A potential solution is compression, which virtually extends the cache capacity by condensing data in cache. In practice, this seemingly simple idea has not gained much traction in key-value cache system design, due to several critical issues: the compression-unfriendly index structure, severe read/write amplification, wasteful decompression operations, and heavy computing cost. This paper presents a hybrid DRAM-SSD cache design to realize a systematic integration of data compression in key-value cache. By treating compression as an essential component, we have redesigned the indexing structure and data management, and leveraged the emerging computational SSD hardware for collaborative optimizations. We have developed a prototype, called ZipCache. Our experimental results show that ZipCache can achieve up to 72.4% higher throughput and 42.4% lower latency, while reducing the write amplification by up to 26.2 times.
Published: 2024

6. Fair In-Context Learning via Latent Concept Variables

Author: Bhaila, Karuna, Van, Minh-Hao, Edemacu, Kennedy, Zhao, Chen, Chen, Feng, and Wu, Xintao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: The emerging in-context learning (ICL) ability of large language models (LLMs) has prompted their use for predictive tasks in various domains with different types of data facilitated by serialization methods. However, with increasing applications in high-stakes domains, it has been shown that LLMs can inherit social bias and discrimination from their pre-training data. In this work, we investigate this inherent bias in LLMs during in-context learning with tabular data. We focus on an optimal demonstration selection approach that utilizes latent concept variables for resource-efficient task adaptation. We design data augmentation strategies that reduce correlation between predictive outcomes and sensitive variables helping to promote fairness during latent concept learning. We utilize the learned concept and select demonstrations from a training dataset to obtain fair predictions during inference while maintaining model utility. The latent concept variable is learned using a smaller internal LLM and the selected demonstrations can be used for inference with larger external LLMs. We empirically verify that the fair latent variable approach improves fairness results on tabular datasets compared to multiple heuristic demonstration selection methods., Comment: 12 pages
Published: 2024

7. MADOD: Generalizing OOD Detection to Unseen Domains via G-Invariance Meta-Learning

Author: Wang, Haoliang, Zhao, Chen, and Chen, Feng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Real-world machine learning applications often face simultaneous covariate and semantic shifts, challenging traditional domain generalization and out-of-distribution (OOD) detection methods. We introduce Meta-learned Across Domain Out-of-distribution Detection (MADOD), a novel framework designed to address both shifts concurrently. MADOD leverages meta-learning and G-invariance to enhance model generalizability and OOD detection in unseen domains. Our key innovation lies in task construction: we randomly designate in-distribution classes as pseudo-OODs within each meta-learning task, simulating OOD scenarios using existing data. This approach, combined with energy-based regularization, enables the learning of robust, domain-invariant features while calibrating decision boundaries for effective OOD detection. Operating in a test domain-agnostic setting, MADOD eliminates the need for adaptation during inference, making it suitable for scenarios where test data is unavailable. Extensive experiments on real-world and synthetic datasets demonstrate MADOD's superior performance in semantic OOD detection across unseen domains, achieving an AUPR improvement of 8.48% to 20.81%, while maintaining competitive in-distribution classification accuracy, representing a significant advancement in handling both covariate and semantic shifts., Comment: IEEE International Conference on Big Data 2024
Published: 2024

8. FEED: Fairness-Enhanced Meta-Learning for Domain Generalization

Author: Jiang, Kai, Zhao, Chen, Wang, Haoliang, and Chen, Feng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Generalizing to out-of-distribution data while being aware of model fairness is a significant and challenging problem in meta-learning. The goal of this problem is to find a set of fairness-aware invariant parameters of classifier that is trained using data drawn from a family of related training domains with distribution shift on non-sensitive features as well as different levels of dependence between model predictions and sensitive features so that the classifier can achieve good generalization performance on unknown but distinct test domains. To tackle this challenge, existing state-of-the-art methods either address the domain generalization problem but completely ignore learning with fairness or solely specify shifted domains with various fairness levels. This paper introduces an approach to fairness-aware meta-learning that significantly enhances domain generalization capabilities. Our framework, Fairness-Enhanced Meta-Learning for Domain Generalization (FEED), disentangles latent data representations into content, style, and sensitive vectors. This disentanglement facilitates the robust generalization of machine learning models across diverse domains while adhering to fairness constraints. Unlike traditional methods that focus primarily on domain invariance or sensitivity to shifts, our model integrates a fairness-aware invariance criterion directly into the meta-learning process. This integration ensures that the learned parameters uphold fairness consistently, even when domain characteristics vary widely. We validate our approach through extensive experiments across multiple benchmarks, demonstrating not only superior performance in maintaining high accuracy and fairness but also significant improvements over existing state-of-the-art methods in domain generalization tasks., Comment: IEEE International Conference on Big Data 2024
Published: 2024

9. Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models

Author: Beigi, Mohammad, Wang, Sijia, Shen, Ying, Lin, Zihao, Kulkarni, Adithya, He, Jianfeng, Chen, Feng, Jin, Ming, Cho, Jin-Hee, Zhou, Dawei, Lu, Chang-Tien, and Huang, Lifu
Subjects: Computer Science - Artificial Intelligence
Abstract: In recent years, Large Language Models (LLMs) have become fundamental to a broad spectrum of artificial intelligence applications. As the use of LLMs expands, precisely estimating the uncertainty in their predictions has become crucial. Current methods often struggle to accurately identify, measure, and address the true uncertainty, with many focusing primarily on estimating model confidence. This discrepancy is largely due to an incomplete understanding of where, when, and how uncertainties are injected into models. This paper introduces a comprehensive framework specifically designed to identify and understand the types and sources of uncertainty, aligned with the unique characteristics of LLMs. Our framework enhances the understanding of the diverse landscape of uncertainties by systematically categorizing and defining each type, establishing a solid foundation for developing targeted methods that can precisely quantify these uncertainties. We also provide a detailed introduction to key related concepts and examine the limitations of current methods in mission-critical and safety-sensitive applications. The paper concludes with a perspective on future directions aimed at enhancing the reliability and practical adoption of these methods in real-world scenarios.
Published: 2024

10. Integrated spectrally multiplexed light-matter interface at telecom band

Author: Zhang, Xueying, Zhang, Bin, Wei, Shihai, Li, Hao, Liao, Jinyu, Zhou, Tao, Deng, Guangwei, Wang, You, Song, Haizhi, You, Lixing, Fan, Boyu, Fan, Yunru, Chen, Feng, Guo, Guangcan, and Zhou, Qiang
Subjects: Quantum Physics
Abstract: The light-matter interface is an important building block for long-distance quantum networks. Towards a scalable quantum network with high-rate quantum information processing, it is necessary to develop integrated light-matter interfaces with broadband and multiplexing capacities. Here we demonstrate a light-matter interface at telecom band in an integrated system. A five-spectral-channel atomic-frequency-comb photonic memory is prepared on a laser-written Er3+:LiNbO3 chip. The bandwidth of each channel is 4 GHz with a channel spacing of 15 GHz. The signal photons from time-bin entangled photon pairs at telecom band are sent into the on-chip memory and recalled after a storage time of 152 ns. The entanglement-preserving nature of our integrated quantum interface is assessed by an input/output fidelity of >92% for all the five spectral channels. Our light-matter interfaces constitute a notable step forward toward a high-rate quantum network involving integrated devices.
Published: 2024

11. ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression

Author: He, Yefei, Chen, Feng, Liu, Jing, Shao, Wenqi, Zhou, Hong, Zhang, Kaipeng, and Zhuang, Bohan
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: The efficiency of large vision-language models (LVLMs) is constrained by the computational bottleneck of the attention mechanism during the prefill phase and the memory bottleneck of fetching the key-value (KV) cache in the decoding phase, particularly in scenarios involving high-resolution images or videos. Visual content often exhibits substantial redundancy, resulting in highly sparse attention maps within LVLMs. This sparsity can be leveraged to accelerate attention computation or compress the KV cache through various approaches. However, most studies focus on addressing only one of these bottlenecks and do not adequately support dynamic adjustment of sparsity concerning distinct layers or tasks. In this paper, we present ZipVL, an efficient inference framework designed for LVLMs that resolves both computation and memory bottlenecks through a dynamic ratio allocation strategy of important tokens. This ratio is adaptively determined based on the layer-specific distribution of attention scores, rather than fixed hyper-parameters, thereby improving efficiency for less complex tasks while maintaining high performance for more challenging ones. Then we select important tokens based on their normalized attention scores and perform attention mechanism solely on those important tokens to accelerate the prefill phase. To mitigate the memory bottleneck in the decoding phase, we employ mixed-precision quantization to the KV cache, where high-bit quantization is used for caches of important tokens, while low-bit quantization is applied to those of less importance. Our experiments demonstrate that ZipVL can accelerate the prefill phase by 2.6× and reduce GPU memory usage by 50.0%, with a minimal accuracy reduction of only 0.2% on Video-MME benchmark over LongVA-7B model, effectively enhancing the generation efficiency of LVLMs., Comment: 15 pages
Published: 2024

12. Unity is Power: Semi-Asynchronous Collaborative Training of Large-Scale Models with Structured Pruning in Resource-Limited Clients

Author: Li, Yan, Li, Mingyi, Zhang, Xiao, Xu, Guangwei, Chen, Feng, Yuan, Yuan, Zou, Yifei, Zhao, Mengying, Lu, Jianbo, and Yu, Dongxiao
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Machine Learning
Abstract: In this work, we study how to release the potential of massive heterogeneous weak computing power to collaboratively train large-scale models on dispersed datasets. In order to improve both efficiency and accuracy in resource-adaptive collaborative learning, we take the first step to consider the unstructured pruning, varying submodel architectures, knowledge loss, and straggler challenges simultaneously. We propose a novel semi-asynchronous collaborative training framework, namely Co-S²P, with data distribution-aware structured pruning and cross-block knowledge transfer mechanism to address the above concerns. Furthermore, we provide theoretical proof that Co-S²P can achieve asymptotic optimal convergence rate of $O(1/\sqrt{N^*EQ})$. Finally, we conduct extensive experiments on a real-world hardware testbed, in which 16 heterogeneous Jetson devices can be united to train large-scale models with parameters up to 0.11 billion. The experimental results demonstrate that Co-S²P improves accuracy by up to 8.8% and resource utilization by up to 1.2× compared to state-of-the-art methods, while reducing memory consumption by approximately 22% and training time by about 24% on all resource-limited devices., Comment: 24 Pages, 12 figures
Published: 2024

13. On the Evaluation of Generative Robotic Simulations

Author: Chen, Feng, Xu, Botian, Hua, Pu, Duan, Peiqi, Yang, Yanchao, Ma, Yi, and Xu, Huazhe
Subjects: Computer Science - Robotics, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Due to the difficulty of acquiring extensive real-world data, robot simulation has become crucial for parallel training and sim-to-real transfer, highlighting the importance of scalable simulated robotic tasks. Foundation models have demonstrated impressive capacities in autonomously generating feasible robotic tasks. However, this new paradigm underscores the challenge of adequately evaluating these autonomously generated tasks. To address this, we propose a comprehensive evaluation framework tailored to generative simulations. Our framework segments evaluation into three core aspects: quality, diversity, and generalization. For single-task quality, we evaluate the realism of the generated task and the completeness of the generated trajectories using large language models and vision-language models. In terms of diversity, we measure both task and data diversity through text similarity of task descriptions and world model loss trained on collected task trajectories. For task-level generalization, we assess the zero-shot generalization ability on unseen tasks of a policy trained with multiple generated tasks. Experiments conducted on three representative task generation pipelines demonstrate that the results from our framework are highly consistent with human evaluations, confirming the feasibility and validity of our approach. The findings reveal that while metrics of quality and diversity can be achieved through certain methods, no single approach excels across all metrics, suggesting a need for greater focus on balancing these different metrics. Additionally, our analysis further highlights the common challenge of low generalization capability faced by current works. Our anonymous website: https://sites.google.com/view/evaltasks., Comment: Project website: https://sites.google.com/view/evaltasks
Published: 2024

14. Hopping Transfer Optimizes Avalanche Multiplication in Molybdenum Disulfide

Author: Cai, Xiaofan, Chen, Ruichang, Gao, Xu, Yuan, Meili, Hu, Haixia, Yin, Hang, Qu, Yuanyuan, Tan, Yang, and Chen, Feng
Subjects: Physics - Applied Physics
Abstract: Recently, avalanche multiplication has been observed in TMDC-based FETs, enhancing sensor performance with high sensitivity. However, the high voltage required for operation can damage the FETs, making it crucial to reduce the breakdown voltage for effective sensing applications. Here, we demonstrate that the utilization of hopping transfer induced by high-density defects can effectively reduce the breakdown voltage in TMDC FETs. By substituting oxygen atoms for sulfur atoms in a monolayer of MoS2, we create MoS2-xOx, with x carefully adjusted within the range of 0 to 0.51. Oxygen doping reduces the bandgap of TMDCs and enhances ion collision rates. Moreover, higher levels of oxygen doping (x > 0.41) in MoS2-xOx exhibit nearest-neighbor hopping behavior, leading to a significant enhancement in electron mobility. These improvements result in a decrease in the breakdown voltage of avalanche multiplication from 26.2 V to 12.6 V. Additionally, we propose avalanche multiplication in MoS2-xOx as an efficient sensing mechanism to overcome the limitations of gas sensing. The MoS2-xOx sensors display an ultra-high response to NO2 gas in the air, with a response of 5.8 × 10³% to NO2 gas of 50 ppb at room temperature, which is nearly two orders of magnitude higher than resistance-type gas detectors based on TMDCs. This work demonstrates that hopping transfer induced by high-density oxygen defects can effectively decrease the breakdown voltage of MoS2-xOx FETs, enhancing avalanche multiplication and serving as a promising mechanism for ultrasensitive gas detection.
Published: 2024

15. Cross-Dataset Gaze Estimation by Evidential Inter-intra Fusion

Author: Wang, Shijing, Huang, Yaping, Xie, Jun, YiTian, Chen, Feng, and Wang, Zhepeng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Achieving accurate and reliable gaze predictions in complex and diverse environments remains challenging. Fortunately, it is straightforward to access diverse gaze datasets in real-world applications. We discover that training these datasets jointly can significantly improve the generalization of gaze estimation, which is overlooked in previous works. However, due to the inherent distribution shift across different datasets, simply mixing multiple datasets decreases the performance in the original domain despite gaining better generalization abilities. To address the problem of "cross-dataset gaze estimation", we propose a novel Evidential Inter-intra Fusion (EIF) framework for training a cross-dataset model that performs well across all source and unseen domains. Specifically, we build independent single-dataset branches for various datasets where the data space is partitioned into overlapping subspaces within each dataset for local regression, and further create a cross-dataset branch to integrate the generalizable features from single-dataset branches. Furthermore, evidential regressors based on the Normal and Inverse-Gamma (NIG) distribution are designed to additionally provide uncertainty estimation apart from predicting gaze. Building upon this foundation, our proposed framework achieves both intra-evidential fusion among multiple local regressors within each dataset and inter-evidential fusion among multiple branches by Mixture of Normal Inverse-Gamma (MoNIG) distribution. Experiments demonstrate that our method consistently achieves notable improvements in both source domains and unseen domains., Comment: This paper was previously submitted to ACM MM 2024
Published: 2024

16. IAFI-FCOS: Intra- and across-layer feature interaction FCOS model for lesion detection of CT images

Author: Guan, Qiu, Pan, Mengjie, Chen, Feng, Yang, Zhiqiang, Yu, Zhongwen, Zhou, Qianwei, and Hu, Haigen
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Effective lesion detection in medical images relies not only on the features of the lesion region but also on the surrounding information. However, most current methods have not fully utilized it. What is more, the multi-scale feature fusion mechanisms of most traditional detectors are unable to transmit detail information without loss, which makes it hard to detect small and boundary-ambiguous lesions in early-stage disease. To address the above issues, we propose a novel intra- and across-layer feature interaction FCOS model (IAFI-FCOS) with a multi-scale feature fusion mechanism, ICAF-FPN, which is a network structure with an intra-layer context augmentation (ICA) block and an across-layer feature weighting (AFW) block. Therefore, the traditional FCOS detector is optimized by enriching the feature representation from two perspectives. Specifically, the ICA block utilizes dilated attention to augment the context information in order to capture long-range dependencies between the lesion region and its surroundings. The AFW block utilizes a dual-axis attention mechanism and a weighting operation to obtain efficient across-layer interaction features, enhancing the representation of detailed features. Our approach has been extensively evaluated on both a private pancreatic lesion dataset and the public DeepLesion dataset, and our model achieves SOTA results on the pancreatic lesion dataset., Comment: 2024 IJCNN
Published: 2024

17. Periodic Coronal Rain Driven by Self-consistent Heating Process in a Radiative Magnetohydrodynamic Simulation

Author: Lu, Zekun, Chen, Feng, Guo, J. H., Ding, M. D., Wang, Can, Yu, Haocheng, Ni, Y. W., and Xia, Chun
Subjects: Astrophysics - Solar and Stellar Astrophysics
Abstract: The periodic coronal rain and in-phase radiative intensity pulsations have been observed in multiple wavelengths in recent years. However, due to the lack of three-dimensional coronal magnetic fields and thermodynamic data in observations, it remains challenging to quantify the coronal heating rate that drives the mass cycles. In this work, based on the MURaM code, we conduct a three-dimensional radiative magnetohydrodynamic simulation spanning from the convective zone to the corona, where the solar atmosphere is heated self-consistently through dissipation resulting from magneto-convection. For the first time, we model the periodic coronal rain in an active region. With a high spatial resolution, the simulation well resembles the observational features across different extreme ultraviolet wavelengths. These include the realistic interweaving coronal loops, periodic coronal rain and periodic intensity pulsations, with two periods of 3.0 h and 3.7 h identified within one loop system. Moreover, the simulation allows for a detailed three-dimensional depiction of coronal rain on small scales, revealing adjacent shower-like rain clumps ~500 km in width and showcasing their multi-thermal internal structures. We further reveal that these periodic variations essentially reflect the cyclic energy evolution of the coronal loop under thermal non-equilibrium state. Importantly, as the driver of the mass circulation, the self-consistent coronal heating rate is considerably complex in time and space, with hour-level variations in one order of magnitude, minute-level bursts, and varying asymmetry reaching ten times between footpoints. This provides an instructive template for the ad hoc heating function, and further enhances our understanding of the coronal heating process., Comment: 14 Pages, 7 figures, accepted for publication in ApJL
Published: 2024

18. LSM-YOLO: A Compact and Effective ROI Detector for Medical Detection

Author: Yu, Zhongwen, Guan, Qiu, Yang, Jianmin, Yang, Zhiqiang, Zhou, Qianwei, Chen, Yang, and Chen, Feng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In existing medical Region of Interest (ROI) detection, there lacks an algorithm that can simultaneously satisfy both real-time performance and accuracy, not meeting the growing demand for automatic detection in medicine. Although the basic YOLO framework ensures real-time detection due to its fast speed, it still faces challenges in maintaining precision concurrently. To alleviate the above problems, we propose a novel model named Lightweight Shunt Matching-YOLO (LSM-YOLO), with Lightweight Adaptive Extraction (LAE) and Multipath Shunt Feature Matching (MSFM). Firstly, by using LAE to refine feature extraction, the model can obtain more contextual information and high-resolution details from multiscale feature maps, thereby extracting detailed features of ROI in medical images while reducing the influence of noise. Secondly, MSFM is utilized to further refine the fusion of high-level semantic features and low-level visual features, enabling better fusion between ROI features and neighboring features, thereby improving the detection rate for better diagnostic assistance. Experimental results demonstrate that LSM-YOLO achieves 48.6% AP on a private dataset of pancreatic tumors, 65.1% AP on the BCCD blood cell detection public dataset, and 73.0% AP on the Br35h brain tumor detection public dataset. Our model achieves state-of-the-art performance with minimal parameter cost on the above three datasets. The source codes are at: https://github.com/VincentYuuuuuu/LSM-YOLO.
Published: 2024

19. Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data

Author: Yang, Tao, Shi, Yangming, Huang, Yunwen, Chen, Feng, Zheng, Yin, and Zhang, Lei
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Text-to-video (T2V) generation has gained significant attention due to its wide applications to video generation, editing, enhancement and translation, etc. However, high-quality (HQ) video synthesis is extremely challenging because of the diverse and complex motions that exist in the real world. Most existing works struggle to address this problem by collecting large-scale HQ videos, which are inaccessible to the community. In this work, we show that publicly available limited and low-quality (LQ) data are sufficient to train a HQ video generator without recaptioning or finetuning. We factorize the whole T2V generation process into two steps: generating an image conditioned on a highly descriptive caption, and synthesizing the video conditioned on the generated image and a concise caption of motion details. Specifically, we present Factorized-Dreamer, a factorized spatiotemporal framework with several critical designs for T2V generation, including an adapter to combine text and image embeddings, a pixel-aware cross attention module to capture pixel-level image information, a T5 text encoder to better understand motion description, and a PredictNet to supervise optical flows. We further present a noise schedule, which plays a key role in ensuring the quality and stability of video generation. Our model lowers the requirements in detailed captions and HQ videos, and can be directly trained on limited LQ datasets with noisy and brief captions such as WebVid-10M, largely alleviating the cost to collect large-scale HQ video-text pairs. Extensive experiments in a variety of T2V and image-to-video generation tasks demonstrate the effectiveness of our proposed Factorized-Dreamer. Our source codes are available at https://github.com/yangxy/Factorized-Dreamer/.
Published: 2024

20. Likelihood inference of the non-stationary Hawkes process with non-exponential kernel

Author: Kwan, Tsz-Kit Jeffrey, Chen, Feng, and Dunsmuir, William
Subjects: Mathematics - Statistics Theory, Primary: 60G55. Secondary: 62F12, 62M15, G.3
Abstract: The Hawkes process is a popular point process model for event sequences that exhibit temporal clustering. The intensity process of a Hawkes process consists of two components, the baseline intensity and the accumulated excitation effect due to past events, with the latter specified via an excitation kernel. The classical Hawkes process assumes a constant baseline intensity and an exponential excitation kernel. This results in an intensity process that is Markovian, a fact that has been used extensively to establish the strong consistency and asymptotic normality of maximum likelihood estimators or similar. However, these assumptions can be overly restrictive and unrealistic for modelling the many applications which require the baseline intensity to vary with time and the excitation kernel to have non-exponential decay. However, asymptotic properties of maximum likelihood inference for the parameters specifying the baseline intensity and the self-exciting decay under this setup are substantially more difficult since the resulting intensity process is non-Markovian. To overcome this challenge, we develop an approximation procedure to show the intensity process is asymptotically ergodic in a suitably defined sense. This allows for the identification of an ergodic limit to the likelihood function and its derivatives, as required for obtaining large sample inference under minimal regularity conditions., Comment: 41 pages, 3 figures
Published: 2024

21. HQOD: Harmonious Quantization for Object Detection

Author: Huang, Long, Dong, Zhiwei, Chen, Song-Lu, Zhang, Ruiyao, Ti, Shutong, Chen, Feng, and Yin, Xu-Cheng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The task inharmony problem commonly occurs in modern object detectors, leading to inconsistent qualities between classification and regression tasks. The predicted boxes with high classification scores but poor localization positions or low classification scores but accurate localization positions will worsen the performance of detectors after Non-Maximum Suppression. Furthermore, when object detectors collaborate with Quantization-Aware Training (QAT), we observe that the task inharmony problem will be further exacerbated, which is considered one of the main causes of the performance degradation of quantized detectors. To tackle this issue, we propose the Harmonious Quantization for Object Detection (HQOD) framework, which consists of two components. Firstly, we propose a task-correlated loss to encourage detectors to focus on improving samples with lower task harmony quality during QAT. Secondly, a harmonious Intersection over Union (IoU) loss is incorporated to balance the optimization of the regression branch across different IoU levels. The proposed HQOD can be easily integrated into different QAT algorithms and detectors. Remarkably, on the MS COCO dataset, our 4-bit ATSS with ResNet-50 backbone achieves a state-of-the-art mAP of 39.6%, even surpassing the full-precision one., Comment: 2024 IEEE International Conference on Multimedia and Expo (ICME), July 15 - July 19, 2024, Niagara Falls, Ontario, Canada
Published: 2024

22. InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation

Author: Zhang, Zeyu, Liu, Akide, Chen, Qi, Chen, Feng, Reid, Ian, Hartley, Richard, Zhuang, Bohan, and Tang, Hao
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Text-to-motion generation holds potential for film, gaming, and robotics, yet current methods often prioritize short motion generation, making it challenging to produce long motion sequences effectively: (1) Current methods struggle to handle long motion sequences as a single input due to prohibitively high computational cost; (2) Breaking down the generation of long motion sequences into shorter segments can result in inconsistent transitions and requires interpolation or inpainting, which lacks entire sequence modeling. To solve these challenges, we propose InfiniMotion, a method that generates continuous motion sequences of arbitrary length within an autoregressive framework. We highlight its groundbreaking capability by generating a continuous 1-hour human motion with around 80,000 frames. Specifically, we introduce the Motion Memory Transformer with Bidirectional Mamba Memory, enhancing the transformer's memory to process long motion sequences effectively without overwhelming computational resources. Notably, our method achieves over 30% improvement in FID and 6 times longer demonstration compared to previous state-of-the-art methods, showcasing significant advancements in long motion generation. See project webpage: https://steve-zeyu-zhang.github.io/InfiniMotion/
Published: 2024

23. Toward Automated Detection of Biased Social Signals from the Content of Clinical Conversations

Author: Chen, Feng, Bedmutha, Manas Satish, Chung, Ray-Yuan, Sabin, Janice, Pratt, Wanda, Wood, Brian R., Weibel, Nadir, Hartzler, Andrea L., and Cohen, Trevor
Subjects: Computer Science - Computers and Society, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Implicit bias can impede patient-provider interactions and lead to inequities in care. Raising awareness is key to reducing such bias, but its manifestations in the social dynamics of patient-provider communication are difficult to detect. In this study, we used automated speech recognition (ASR) and natural language processing (NLP) to identify social signals in patient-provider interactions. We built an automated pipeline to predict social signals from audio recordings of 782 primary care visits that achieved 90.1% average accuracy across codes, and exhibited fairness in its predictions for white and non-white patients. Applying this pipeline, we identified statistically significant differences in provider communication behavior toward white versus non-white patients. In particular, providers expressed more patient-centered behaviors towards white patients including more warmth, engagement, and attentiveness. Our study underscores the potential of automated tools in identifying subtle communication signals that may be linked with bias and impact healthcare quality and equity., Comment: Accepted by AMIA 2024 Annual Symposium
Published: 2024

24. $CP$ asymmetries in $\tau\to K_S\pi\nu_\tau$ decays

Author: Chen, Feng-Zhi and Li, Xin-Qiang
Subjects: High Energy Physics - Phenomenology
Abstract: We present here the CP asymmetries in the decay rate and angular distributions of $\tau\to K_S\pi\nu_\tau$ decays in the Standard Model (SM) and beyond (BSM). The CP asymmetries in the SM are induced by the CP violation in $K^0-\bar{K}^0$ mixing. To investigate the BSM CP-violating (CPV) effects, a model-independent analysis is performed by using the low-energy effective field theory (LEFT) framework at $\mu=2$~GeV. If one further assumes the BSM physics to stem from above the electroweak scale, the LEFT shall then be matched onto the SM effective field theory (SMEFT), the operators of which contributing to $\tau\to K_S\pi\nu_\tau$ decays will also contribute to the neutron electric dipole moment (EDM) and $D^0-\bar{D}^0$ mixing. The stringent bounds from the latter suggest that no remarkable CPV effects can be observed in either the decay rate or the angular distributions. The prospects for future measurements of these observables are also mentioned., Comment: 8 pages, 3 figures, and 1 table; proceedings to the 2024 International Workshop on Future Tau Charm Facilities (FTCF2024)
Published: 2024

25. Study of $\tau^- \to \omega \pi^- \nu_\tau$ decay in resonance chiral theory with tensor sources

Author: Chen, Feng-Zhi, Li, Xin-Qiang, Peng, Shi-Can, Yang, Ya-Dong, and Zou, Yuan-He
Subjects: High Energy Physics - Phenomenology
Abstract: In this work, we make a study of the $\tau^- \to \omega\pi^-\nu_\tau$ decay in the framework of low-energy effective field theory. The $J^{\mathcal{P}G}$ decompositions of the quark currents and the $\omega\pi$ final state show that, besides the Standard Model vector interaction, only the non-standard tensor interaction can have a non-zero contribution to the decay. To discuss its effect, a reliable calculation of the $\omega\pi$ tensor form factors is necessary. After constructing the Lagrangian of resonance chiral theory with external tensor sources, we calculate both the vector and tensor form factors with the relevant resonance couplings determined by combining the QCD short-distance constraints, the fit to the spectral function of $\tau^- \to \omega\pi^-\nu_\tau$ decay, as well as the matching between the $\mathcal{O}(p^4)$ odd-intrinsic-parity operators after integrating out the vector resonances and the $\mathcal{O}(p^6)$ operators of chiral perturbation theory. The new physics effect is then investigated in the distributions of the spectral function and the forward-backward asymmetry of $\tau^- \to \omega\pi^-\nu_\tau$ decay. We find that the spectral function is dominated by the Standard Model, and the non-standard tensor contribution is negligible. However, since the forward-backward asymmetry can be only generated with a non-zero tensor interaction, the observable is quite sensitive to this kind of new physics. A future measurement of the observable at the Belle II experiment as well as at the proposed Tera-Z and STCF facilities is, therefore, strongly called for to check the existence of such a non-standard tensor interaction., Comment: 27 pages, 4 tables, and 2 figures; minor modification, final version published in the journal
Published: 2024

26. Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

Author: He, Jianfeng, Yang, Runing, Yu, Linlin, Li, Changbin, Jia, Ruoxi, Chen, Feng, Jin, Ming, and Lu, Chang-Tien
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncertainty model metrics on diverse and potentially conflicting NLG metrics. To address this issue, we introduce a comprehensive UE-TS benchmark incorporating 31 NLG metrics across four dimensions. The benchmark evaluates the uncertainty estimation capabilities of two large language models and one pre-trained language model on three datasets, with human-annotation analysis incorporated where applicable. We also assess the performance of 14 common uncertainty estimation methods within this benchmark. Our findings emphasize the importance of considering multiple uncorrelated NLG metrics and diverse uncertainty estimation methods to ensure reliable and efficient evaluation of UE-TS techniques. Our code and data are available at https://github.com/he159ok/Benchmark-of-Uncertainty-Estimation-Methods-in-Text-Summarization., Comment: 62 pages, 41 figures, 11 tables
Published: 2024

27. GMT: Guided Mask Transformer for Leaf Instance Segmentation

Author: Chen, Feng, Tsaftaris, Sotirios A., and Giuffrida, Mario Valerio
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Leaf instance segmentation is a challenging multi-instance segmentation task, aiming to separate and delineate each leaf in an image of a plant. Accurate segmentation of each leaf is crucial for plant-related applications such as the fine-grained monitoring of plant growth and crop yield estimation. This task is challenging because of the high similarity (in shape and colour), great size variation, and heavy occlusions among leaf instances. Furthermore, the typically small size of annotated leaf datasets makes it more difficult to learn the distinctive features needed for precise segmentation. We hypothesise that the key to overcoming these challenges lies in the specific spatial patterns of leaf distribution. In this paper, we propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. These spatial priors are embedded in a set of guide functions that map leaves at different positions into a more separable embedding space. Our GMT consistently outperforms the state-of-the-art on three public plant datasets.
Published: 2024

28. PCIE_EgoHandPose Solution for EgoExo4D Hand Pose Challenge

Author: Chen, Feng, Ding, Ling, Lertniphonphan, Kanokphan, Li, Jian, Huang, Kaer, and Wang, Zhepeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This report presents our team's 'PCIE_EgoHandPose' solution for the EgoExo4D Hand Pose Challenge at CVPR2024. The main goal of the challenge is to accurately estimate hand poses, which involve 21 3D joints, using an RGB egocentric video image provided for the task. This task is particularly challenging due to the subtle movements and occlusions. To handle the complexity of the task, we propose the Hand Pose Vision Transformer (HP-ViT). The HP-ViT comprises a ViT backbone and transformer head to estimate joint positions in 3D, utilizing MPJPE and RLE loss functions. Our approach achieved the 1st position in the Hand Pose challenge with 25.51 MPJPE and 8.49 PA-MPJPE. Code is available at https://github.com/KanokphanL/PCIE_EgoHandPose
Published: 2024

29. PCIE_LAM Solution for Ego4D Looking At Me Challenge

Author: Lertniphonphan, Kanokphan, Xie, Jun, Meng, Yaqing, Wang, Shijing, Chen, Feng, and Wang, Zhepeng
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This report presents our team's 'PCIE_LAM' solution for the Ego4D Looking At Me Challenge at CVPR2024. The main goal of the challenge is to accurately determine if a person in the scene is looking at the camera wearer, based on a video where the faces of social partners have been localized. Our proposed solution, InternLSTM, consists of an InternVL image encoder and a Bi-LSTM network. The InternVL extracts spatial features, while the Bi-LSTM extracts temporal features. However, this task is highly challenging due to the distance between the person in the scene and the camera movement, which results in significant blurring in the face image. To address the complexity of the task, we implemented a Gaze Smoothing filter to eliminate noise or spikes from the output. Our approach achieved the 1st position in the looking at me challenge with 0.81 mAP and 0.93 accuracy rate. Code is available at https://github.com/KanokphanL/Ego4D_LAM_InternLSTM
Published: 2024

30. Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

Author: Kunin, Daniel, Raventós, Allan, Dominé, Clémentine, Chen, Feng, Klindt, David, Saxe, Andrew, and Ganguli, Surya
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: While the impressive performance of modern neural networks is often attributed to their capacity to efficiently extract task-relevant features from data, the mechanisms underlying this rich feature learning regime remain elusive, with much of our theoretical understanding stemming from the opposing lazy regime. In this work, we derive exact solutions to a minimal model that transitions between lazy and rich learning, precisely elucidating how unbalanced layer-specific initialization variances and learning rates determine the degree of feature learning. Our analysis reveals that they conspire to influence the learning regime through a set of conserved quantities that constrain and modify the geometry of learning trajectories in parameter and function space. We extend our analysis to more complex linear models with multiple neurons, outputs, and layers and to shallow nonlinear networks with piecewise linear activation functions. In linear networks, rapid feature learning only occurs from balanced initializations, where all layers learn at similar speeds. While in nonlinear networks, unbalanced initializations that promote faster learning in earlier layers can accelerate rich learning. Through a series of experiments, we provide evidence that this unbalanced rich regime drives feature learning in deep finite-width networks, promotes interpretability of early layers in CNNs, reduces the sample complexity of learning hierarchical data, and decreases the time to grokking in modular arithmetic. Our theory motivates further exploration of unbalanced initializations to enhance efficient feature learning., Comment: 40 pages, 12 figures, NeurIPS 2024
Published: 2024

31. Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize

Author: Zhang, Tianren, Zhao, Chujie, Chen, Guanyu, Jiang, Yizhou, and Chen, Feng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Learning representations that generalize under distribution shifts is critical for building robust machine learning models. However, despite significant efforts in recent years, algorithmic advances in this direction have been limited. In this work, we seek to understand the fundamental difficulty of out-of-distribution generalization with deep neural networks. We first empirically show that perhaps surprisingly, even allowing a neural network to explicitly fit the representations obtained from a teacher network that can generalize out-of-distribution is insufficient for the generalization of the student network. Then, by a theoretical study of two-layer ReLU networks optimized by stochastic gradient descent (SGD) under a structured feature model, we identify a fundamental yet unexplored feature learning proclivity of neural networks, feature contamination: neural networks can learn uncorrelated features together with predictive features, resulting in generalization failure under distribution shifts. Notably, this mechanism essentially differs from the prevailing narrative in the literature that attributes the generalization failure to spurious correlations. Overall, our results offer new insights into the non-linear feature learning dynamics of neural networks and highlight the necessity of considering inductive biases in out-of-distribution generalization., Comment: ICML 2024
Published: 2024

32. Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks

Author: Yu, Linlin, Yang, Bowen, Wang, Tianhao, Li, Kangshuo, and Chen, Feng
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition
Abstract: The fusion of raw features from multiple sensors on an autonomous vehicle to create a Bird's Eye View (BEV) representation is crucial for planning and control systems. There is growing interest in using deep learning models for BEV semantic segmentation. Anticipating segmentation errors and improving the explainability of DNNs is essential for autonomous driving, yet it is under-studied. This paper introduces a benchmark for predictive uncertainty quantification in BEV segmentation. The benchmark assesses various approaches across three popular datasets using two representative backbones and focuses on the effectiveness of predicted uncertainty in identifying misclassified and out-of-distribution (OOD) pixels, as well as calibration. Empirical findings highlight the challenges in uncertainty quantification. Our results find that evidential deep learning based approaches show the most promise by efficiently quantifying aleatoric and epistemic uncertainty. We propose the Uncertainty-Focal-Cross-Entropy (UFCE) loss, designed for highly imbalanced data, which consistently improves the segmentation quality and calibration. Additionally, we introduce a vacuity-scaled regularization term that enhances the model's focus on high uncertainty pixels, improving epistemic uncertainty quantification.
Published: 2024

33. Streaming Video Diffusion: Online Video Editing with Diffusion Models

Author: Chen, Feng, Yang, Zhen, Zhuang, Bohan, and Wu, Qi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: We present a novel task called online video editing, which is designed to edit streaming frames while maintaining temporal consistency. Unlike existing offline video editing assuming all frames are pre-established and accessible, online video editing is tailored to real-life applications such as live streaming and online chat, requiring (1) fast continual step inference, (2) long-term temporal modeling, and (3) zero-shot video editing capability. To solve these issues, we propose Streaming Video Diffusion (SVDiff), which incorporates the compact spatial-aware temporal recurrence into off-the-shelf Stable Diffusion and is trained with the segment-level scheme on large-scale long videos. This simple yet effective setup allows us to obtain a single model that is capable of executing a broad range of videos and editing each streaming frame with temporal coherence. Our experiments indicate that our model can edit long, high-quality videos with remarkable results, achieving a real-time inference speed of 15.2 FPS at a resolution of 512x512.
Published: 2024

34. Networked Integrated Sensing and Communications for 6G Wireless Systems

Author: Li, Jiapeng, Shao, Xiaodan, Chen, Feng, Wan, Shaohua, Liu, Chang, Wei, Zhiqiang, and Ng, Derrick Wing Kwan
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: Integrated sensing and communication (ISAC) is envisioned as a key pillar for enabling the upcoming sixth generation (6G) communication systems, requiring not only reliable communication functionalities but also highly accurate environmental sensing capabilities. In this paper, we design a novel networked ISAC framework to explore the collaboration among multiple users for environmental sensing. Specifically, multiple users can serve as powerful sensors, capturing back scattered signals from a target at various angles to facilitate reliable computational imaging. Centralized sensing approaches are extremely sensitive to the capability of the leader node because it requires the leader node to process the signals sent by all the users. To this end, we propose a two-step distributed cooperative sensing algorithm that allows low-dimensional intermediate estimate exchange among neighboring users, thus eliminating the reliance on the centralized leader node and improving the robustness of sensing. This way, multiple users can cooperatively sense a target by exploiting the block-wise environment sparsity and the interference cancellation technique. Furthermore, we analyze the mean square error of the proposed distributed algorithm as a networked sensing performance metric and propose a beamforming design for the proposed network ISAC scheme to maximize the networked sensing accuracy and communication performance subject to a transmit power constraint. Simulation results validate the effectiveness of the proposed algorithm compared with the state-of-the-art algorithms., Comment: Received by IEEE Internet of Things Journal
Published: 2024

35. Robust non-Abelian even-denominator fractional Chern insulator in twisted bilayer MoTe$_2$

Author: Chen, Feng, Luo, Wei-Wei, Zhu, Wei, and Sheng, D. N.
Subjects: Condensed Matter - Strongly Correlated Electrons
Abstract: A recent experiment observes a series of quantum spin Hall effects in transition metal dichalcogenide moiré MoTe$_2$ [K. Kang et al., Nature 628, 522-526 (2024)]. Among them, the filling $\nu=3$ state points to a time-reversal pair of edge states resembling those of the even-denominator fractional Chern insulators (FCIs). Inspired by this discovery, we investigate whether a robust incompressible quantum Hall liquid can be stabilized in the half-filled Chern band of twisted MoTe$_2$ bilayers. We use the continuum model with parameters relevant to twisted MoTe$_2$ bilayers and obtain three consecutive nearly flat Chern bands with the same Chern number. Crucially, when the second moiré miniband is half-filled, signatures of non-Abelian states are found via exact diagonalization calculations, including the stable six-fold ground state degeneracy which grows more robust for larger lattice sizes and is consistent with an even-denominator FCI state. We further perform flux insertion simulations to reveal a 1/2 quantized many-body Chern number as direct evidence of topological order. Furthermore, the ground state density structure factors show no sharp peak, indicating no charge density wave order. These evidences signal the potential of realizing the non-Abelian state at zero magnetic field in twisted bilayer MoTe$_2$ at the fractional hole filling 3/2., Comment: a complete undated version with errors corrected
Published: 2024

36. Winning the Social Media Influence Battle: Uncertainty-Aware Opinions to Understand and Spread True Information via Competitive Influence Maximization

Author: Zhang, Qi, Kaplan, Lance M., Jøsang, Audun, Jeong, Dong Hyun, Chen, Feng, and Cho, Jin-Hee
Subjects: Computer Science - Social and Information Networks
Abstract: Competitive Influence Maximization (CIM) involves entities competing to maximize influence in online social networks (OSNs). Current Deep Reinforcement Learning (DRL) methods in CIM rely on simplistic binary opinion models (i.e., an opinion is represented by either 0 or 1) and often overlook the complexity of users' behavioral characteristics and their prior knowledge. We propose a novel DRL-based framework that enhances CIM analysis by integrating Subjective Logic (SL) to accommodate uncertain opinions, users' behaviors, and their preferences. This approach targets the mitigation of false information by effectively propagating true information. By modeling two competitive agents, one spreading true information and the other spreading false information, we capture the strategic interplay essential to CIM. Our framework utilizes an uncertainty-based opinion model (UOM) to assess the impact on information quality in OSNs, emphasizing the importance of user behavior alongside network topology in selecting influential seed nodes. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods, achieving faster and more influential results (i.e., outperforming over 20%) under realistic network conditions. Moreover, our method shows robust performance in partially observable networks, effectively doubling the performance when users are predisposed to disbelieve true information., Comment: 8 pages, 3 figures, submitted to ASONAM 2024
Published: 2024

37. Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models

Author: Komanduri, Aneesh, Zhao, Chen, Chen, Feng, and Wu, Xintao
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Methodology
Abstract: Diffusion probabilistic models (DPMs) have become the state-of-the-art in high-quality image generation. However, DPMs have an arbitrary noisy latent space with no interpretable or controllable semantics. Although there has been significant research effort to improve image sample quality, there is little work on representation-controlled generation using diffusion models. Specifically, causal modeling and controllable counterfactual generation using DPMs is an underexplored area. In this work, we propose CausalDiffAE, a diffusion-based causal representation learning framework to enable counterfactual generation according to a specified causal model. Our key idea is to use an encoder to extract high-level semantically meaningful causal variables from high-dimensional data and model stochastic variation using reverse diffusion. We propose a causal encoding mechanism that maps high-dimensional data to causally related latent factors and parameterize the causal mechanisms among latent factors using neural networks. To enforce the disentanglement of causal variables, we formulate a variational objective and leverage auxiliary label information in a prior to regularize the latent space. We propose a DDIM-based counterfactual generation procedure subject to do-interventions. Finally, to address the limited label supervision scenario, we also study the application of CausalDiffAE when a part of the training data is unlabeled, which also enables granular control over the strength of interventions in generating counterfactuals during inference. We empirically show that CausalDiffAE learns a disentangled latent space and is capable of generating high-quality counterfactual images., Comment: Accepted to the 27th European Conference on Artificial Intelligence (ECAI 2024)
Published: 2024

38. Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty

Author: Li, Changbin, Li, Kangshuo, Ou, Yuzhe, Kaplan, Lance M., Jøsang, Audun, Cho, Jin-Hee, Jeong, Dong Hyun, and Chen, Feng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Deep neural networks (DNNs) have been shown to perform well on exclusive, multi-class classification tasks. However, when different classes have similar visual features, it becomes challenging for human annotators to differentiate them. This scenario necessitates the use of composite class labels. In this paper, we propose a novel framework called Hyper-Evidential Neural Network (HENN) that explicitly models predictive uncertainty due to composite class labels in training data in the context of the belief theory called Subjective Logic (SL). By placing a grouped Dirichlet distribution on the class probabilities, we treat predictions of a neural network as parameters of hyper-subjective opinions and learn the network that collects both single and composite evidence leading to these hyper-opinions by a deterministic DNN from data. We introduce a new uncertainty type called vagueness originally designed for hyper-opinions in SL to quantify composite classification uncertainty for DNNs. Our results demonstrate that HENN outperforms its state-of-the-art counterparts based on four image datasets. The code and datasets are available at: https://github.com/Hugo101/HyperEvidentialNN., Comment: In Proceedings of The Twelfth International Conference on Learning Representations, ICLR 2024
Published: 2024

39. A model for heating the super-hot corona in solar active regions

Author: Lu, Zekun, Chen, Feng, Ding, M. D., Wang, Can, Dai, Yu, and Cheng, Xin
Subjects: Astrophysics - Solar and Stellar Astrophysics
Abstract: What physical mechanisms heat the outer solar or stellar atmosphere to million-Kelvin temperatures is a fundamental but long-standing open question. In particular, the solar corona in active region cores contains an even hotter component reaching ten million Kelvin, manifesting as persistent coronal loops in extreme ultraviolet and soft X-ray images, which imposes a more stringent energy budget. Here, we present a self-consistent coronal heating model using a state-of-the-art three-dimensional radiative magnetohydrodynamics simulation. We find that the continuous magnetic flux emergence in active regions keeps driving magnetic reconnections that release energy impulsively but, on time average, persistently. As a result, numerous sub-structures are heated to ten million Kelvin and then evolve independently, which collectively form long-lived and stable coronal loops as in observations. This provides a heating model explaining the origin of the super-hot coronal plasma and the persistence of hot coronal loops in emerging active regions., Comment: 34 pages, 14 figures
Published: 2024
Full Text: View/download PDF

40. Data-independent acquisition-based blood proteomics unveils predictive biomarkers for neonatal necrotizing enterocolitis

Author: Chen, Feng, Tan, Kezhe, Lv, Zhibao, Chen, Faling, Xu, Weijue, Gong, Xiaohui, Lu, Li, Sun, Hailiang, Fu, Qinqin, and Zhuang, Wenjun
Published: 2024
Full Text: View/download PDF

41. Sennoside A represses the malignant phenotype and tumor immune microenvironment of non-small cell lung cancer cells by inhibiting the TRAF6/NF-κB pathway

Author: Xia, Wenchao, Shen, Yimeng, Chen, Feng, Liu, Xin, Cao, Yuqi, and Shi, Zhenliang
Published: 2024
Full Text: View/download PDF

42. hnRNPA2B1 deacetylation by SIRT6 restrains local transcription and safeguards genome stability

Author: Chen, Feng, Xu, Wenchao, Tang, Ming, Tian, Yuan, Shu, Yuxin, He, Xingkai, Zhou, Linmin, Liu, Qi, Zhu, Qian, Lu, Xiaopeng, Zhang, Jun, and Zhu, Wei-Guo
Published: 2024
Full Text: View/download PDF

43. Disentangling User Cognitive Intent with Causal Reasoning for Knowledge-Enhanced Recommendation

Author: Xu, Hongcai, Bao, Junpeng, Lin, Qika, Hou, Lifang, and Chen, Feng
Published: 2024
Full Text: View/download PDF

44. Precipitation reconstructions in the northern and southern Qilian Mountains based on tree rings of Picea crassifolia

Author: Niu, Junqiang, Zhao, Xiaoen, Chen, Feng, Chen, Youping, and Yue, Weipeng
Published: 2024
Full Text: View/download PDF

45. Circular RNA circ_0002984 Facilitates the Proliferation and Migration of Ox-LDL-Induced Vascular Smooth Muscle Cells via the Let-7a-5p/KLF5 Pathway

Author: Chen, Feng, Jiang, Ruilai, and Yu, Xiufeng
Published: 2024
Full Text: View/download PDF

46. Adversarial learning with optimism for bias reduction in machine learning

Author: Cheng, Yu-Chen, Chen, Po-An, Chen, Feng-Chi, and Cheng, Ya-Wen
Published: 2024
Full Text: View/download PDF

47. Tumour evolution and microenvironment interactions in 2D and 3D space

Author: Mo, Chia-Kuei, Liu, Jingxian, Chen, Siqi, Storrs, Erik, Targino da Costa, Andre Luiz N., Houston, Andrew, Wendl, Michael C., Jayasinghe, Reyka G., Iglesia, Michael D., Ma, Cong, Herndon, John M., Southard-Smith, Austin N., Liu, Xinhao, Mudd, Jacqueline, Karpova, Alla, Shinkle, Andrew, Goedegebuure, S. Peter, Abdelzaher, Abdurrahman Taha Mousa Ali, Bo, Peng, Fulghum, Lauren, Livingston, Samantha, Balaban, Metin, Hill, Angela, Ippolito, Joseph E., Thorsson, Vesteinn, Held, Jason M., Hagemann, Ian S., Kim, Eric H., Bayguinov, Peter O., Kim, Albert H., Mullen, Mary M., Shoghi, Kooresh I., Ju, Tao, Reimers, Melissa A., Weimholt, Cody, Kang, Liang-I, Puram, Sidharth V., Veis, Deborah J., Pachynski, Russell, Fuh, Katherine C., Chheda, Milan G., Gillanders, William E., Fields, Ryan C., Raphael, Benjamin J., Chen, Feng, and Ding, Li
Published: 2024
Full Text: View/download PDF

48. Differential chromatin accessibility and transcriptional dynamics define breast cancer subtypes and their lineages

Author: Iglesia, Michael D., Jayasinghe, Reyka G., Chen, Siqi, Terekhanova, Nadezhda V., Herndon, John M., Storrs, Erik, Karpova, Alla, Zhou, Daniel Cui, Naser Al Deen, Nataly, Shinkle, Andrew T., Lu, Rita Jui-Hsien, Caravan, Wagma, Houston, Andrew, Zhao, Yanyan, Sato, Kazuhito, Lal, Preet, Street, Cherease, Martins Rodrigues, Fernanda, Southard-Smith, Austin N., Targino da Costa, André Luiz N., Zhu, Houxiang, Mo, Chia-Kuei, Crowson, Lisa, Fulton, Robert S., Wyczalkowski, Matthew A., Fronick, Catrina C., Fulton, Lucinda A., Sun, Hua, Davies, Sherri R., Appelbaum, Elizabeth L., Chasnoff, Sara E., Carmody, Madelyn, Brooks, Candace, Liu, Ruiyang, Wendl, Michael C., Oh, Clara, Bender, Diane, Cruchaga, Carlos, Harari, Oscar, Bredemeyer, Andrea, Lavine, Kory, Bose, Ron, Margenthaler, Julie, Held, Jason M., Achilefu, Samuel, Ademuyiwa, Foluso, Aft, Rebecca, Ma, Cynthia, Colditz, Graham A., Ju, Tao, Oh, Stephen T., Fitzpatrick, James, Hwang, E. Shelley, Shoghi, Kooresh I., Chheda, Milan G., Veis, Deborah J., Chen, Feng, Fields, Ryan C., Gillanders, William E., and Ding, Li
Published: 2024
Full Text: View/download PDF

49. Large-scale structural covariance networks changes relate to executive function deficit in betel quid-dependent chewers

Author: Guo, Yihao, Liu, Tao, Xu, Xiaoling, Li, Tiansheng, Xiong, Xiaoli, Chen, Huijuan, Huang, Weiyuan, Zhang, Xianchang, and Chen, Feng
Published: 2024
Full Text: View/download PDF

50. Molecular mechanism of mechanical pressure induced changes in the microenvironment of intervertebral disc degeneration

Author: Liu, Fei, Chao, Song, Yang, Lei, Chen, Chaoqi, Huang, Wutao, Chen, Feng, and Xu, Zhiwei
Published: 2024
Full Text: View/download PDF
