Author: "Zeng, Wenjia" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Zeng, Wenjia"' showing total 21 results

1. AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework

Author: Jia, Yuhang, Chen, Yang, Zhao, Jinghua, Zhao, Shiwan, Zeng, Wenjia, Chen, Yong, and Qin, Yong
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Diffusion-based text-to-audio (TTA) generation has made substantial progress, leveraging latent diffusion model (LDM) to produce high-quality, diverse and instruction-relevant audios. However, beyond generation, the task of audio editing remains equally important but has received comparatively little attention. Audio editing tasks face two primary challenges: executing precise edits and preserving the unedited sections. While workflows based on LDMs have effectively addressed these challenges in the field of image processing, similar approaches have been scarcely applied to audio editing. In this paper, we introduce AudioEditor, a training-free audio editing framework built on the pretrained diffusion-based TTA model. AudioEditor incorporates Null-text Inversion and EOT-suppression methods, enabling the model to preserve original audio features while executing accurate edits. Comprehensive objective and subjective experiments validate the effectiveness of AudioEditor in delivering high-quality audio edits. Code and demo can be found at https://github.com/NKU-HLT/AudioEditor.
Published: 2024

2. M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper

Author: Zhou, Jiaming, Zhao, Shiwan, He, Jiabei, Wang, Hui, Zeng, Wenjia, Chen, Yong, Sun, Haoqin, Kong, Aobo, and Qin, Yong
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: State-of-the-art models like OpenAI's Whisper exhibit strong performance in multilingual automatic speech recognition (ASR), but they still face challenges in accurately recognizing diverse subdialects. In this paper, we propose M2R-whisper, a novel multi-stage and multi-scale retrieval augmentation approach designed to enhance ASR performance in low-resource settings. Building on the principles of in-context learning (ICL) and retrieval-augmented techniques, our method employs sentence-level ICL in the pre-processing stage to harness contextual information, while integrating token-level k-Nearest Neighbors (kNN) retrieval as a post-processing step to further refine the final output distribution. By synergistically combining sentence-level and token-level retrieval strategies, M2R-whisper effectively mitigates various types of recognition errors. Experiments conducted on Mandarin and subdialect datasets, including AISHELL-1 and KeSpeech, demonstrate substantial improvements in ASR accuracy, all achieved without any parameter updates.
Published: 2024

3. Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework

Author: Sun, Haoqin, Zhao, Shiwan, Li, Shaokai, Kong, Xiangyu, Wang, Xuechen, Kong, Aobo, Zhou, Jiaming, Chen, Yong, Zeng, Wenjia, and Qin, Yong
Subjects: Computer Science - Multimedia, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Multimodal emotion recognition systems rely heavily on the full availability of modalities, suffering significant performance declines when modal data is incomplete. To tackle this issue, we present the Cross-Modal Alignment, Reconstruction, and Refinement (CM-ARR) framework, an innovative approach that sequentially engages in cross-modal alignment, reconstruction, and refinement phases to handle missing modalities and enhance emotion recognition. This framework utilizes unsupervised distribution-based contrastive learning to align heterogeneous modal distributions, reducing discrepancies and modeling semantic uncertainty effectively. The reconstruction phase applies normalizing flow models to transform these aligned distributions and recover missing modalities. The refinement phase employs supervised point-based contrastive learning to disrupt semantic correlations and accentuate emotional traits, thereby enriching the affective content of the reconstructed representations. Extensive experiments on the IEMOCAP and MSP-IMPROV datasets confirm the superior performance of CM-ARR under conditions of both missing and complete modalities. Notably, averaged across six scenarios of missing modalities, CM-ARR achieves absolute improvements of 2.11% in WAR and 2.12% in UAR on the IEMOCAP dataset, and 1.71% and 1.96% in WAR and UAR, respectively, on the MSP-IMPROV dataset.
Published: 2024

4. Fine-grained Disentangled Representation Learning for Multimodal Emotion Recognition

Author: Sun, Haoqin, Zhao, Shiwan, Wang, Xuechen, Zeng, Wenjia, Chen, Yong, and Qin, Yong
Subjects: Computer Science - Sound, Computer Science - Multimedia, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Multimodal emotion recognition (MMER) is an active research field that aims to accurately recognize human emotions by fusing multiple perceptual modalities. However, inherent heterogeneity across modalities introduces distribution gaps and information redundancy, posing significant challenges for MMER. In this paper, we propose a novel fine-grained disentangled representation learning (FDRL) framework to address these challenges. Specifically, we design modality-shared and modality-private encoders to project each modality into modality-shared and modality-private subspaces, respectively. In the shared subspace, we introduce a fine-grained alignment component to learn modality-shared representations, thus capturing modal consistency. Subsequently, we tailor a fine-grained disparity component to constrain the private subspaces, thereby learning modality-private representations and enhancing their diversity. Lastly, we introduce a fine-grained predictor component to ensure that the labels of the output representations from the encoders remain unchanged. Experimental results on the IEMOCAP dataset show that FDRL outperforms the state-of-the-art methods, achieving 78.34% and 79.44% on WAR and UAR, respectively.
Comment: Accepted by ICASSP 2024
Published: 2023

5. kNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels

Author: Zhou, Jiaming, Zhao, Shiwan, Liu, Yaqi, Zeng, Wenjia, Chen, Yong, and Qin, Yong
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: The success of retrieval-augmented language models in various natural language processing (NLP) tasks has been constrained in automatic speech recognition (ASR) applications due to challenges in constructing fine-grained audio-text datastores. This paper presents kNN-CTC, a novel approach that overcomes these challenges by leveraging Connectionist Temporal Classification (CTC) pseudo labels to establish frame-level audio-text key-value pairs, circumventing the need for precise ground truth alignments. We further introduce a skip-blank strategy, which strategically ignores CTC blank frames, to reduce datastore size. kNN-CTC incorporates a k-nearest neighbors retrieval mechanism into pre-trained CTC ASR systems, achieving significant improvements in performance. By incorporating a k-nearest neighbors retrieval mechanism into pre-trained CTC ASR systems and leveraging a fine-grained, pruned datastore, kNN-CTC consistently achieves substantial improvements in performance under various experimental settings. Our code is available at https://github.com/NKU-HLT/KNN-CTC.
Comment: Accepted by ICASSP 2024
Published: 2023

6. Joint optimisation of drone routing and battery wear for sustainable supply chain development: a mixed-integer programming model based on blockchain-enabled fleet sharing

Author: Xia, Yang, Zeng, Wenjia, Xing, Xinjie, Zhan, Yuanzhu, Tan, Kim Hua, and Kumar, Ajay
Published: 2023
Full Text: View/download PDF

7. Correction to: Joint optimisation of drone routing and battery wear for sustainable supply chain development: a mixed-integer programming model based on blockchain-enabled fleet sharing

Author: Xia, Yang, Zeng, Wenjia, Xing, Xinjie, Zhan, Yuanzhu, Tan, Kim Hua, and Kumar, Ajay
Published: 2024
Full Text: View/download PDF

8. A branch-and-price-and-cut algorithm for the vehicle routing problem with load-dependent drones

Author: Xia, Yang, Zeng, Wenjia, Zhang, Canrong, and Yang, Hai
Published: 2023
Full Text: View/download PDF

9. Optimality-guaranteed algorithms on the dynamic shared-taxi problem

Author: Hua, Shijia, Zeng, Wenjia, Liu, Xinglu, and Qi, Mingyao
Published: 2022
Full Text: View/download PDF

10. KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels

Author: Zhou, Jiaming, primary, Zhao, Shiwan, additional, Liu, Yaqi, additional, Zeng, Wenjia, additional, Chen, Yong, additional, and Qin, Yong, additional
Published: 2024
Full Text: View/download PDF

11. Fine-Grained Disentangled Representation Learning For Multimodal Emotion Recognition

Author: Sun, Haoqin, primary, Zhao, Shiwan, additional, Wang, Xuechen, additional, Zeng, Wenjia, additional, Chen, Yong, additional, and Qin, Yong, additional
Published: 2024
Full Text: View/download PDF

12. Analysis of Influential Factors on the WTP of Roof Agriculture’s Non-use Value

Author: Yin, Qi, Zeng, Wenjia, Liu, Xiaoqian, Wu, Yuzhe, editor, Zheng, Sheng, editor, Luo, Jiaojiao, editor, Wang, Wei, editor, Mo, Zhibin, editor, and Shan, Liping, editor
Published: 2017
Full Text: View/download PDF

13. Analysis of Influential Factors on the WTP of Roof Agriculture’s Non-use Value

Author: Yin, Qi, primary, Zeng, Wenjia, additional, and Liu, Xiaoqian, additional
Published: 2016
Full Text: View/download PDF

14. Correction to: Joint optimisation of drone routing and battery wear for sustainable supply chain development: a mixed-integer programming model based on blockchain-enabled fleet sharing

Author: Xia, Yang, primary, Zeng, Wenjia, additional, Xing, Xinjie, additional, Zhan, Yuanzhu, additional, Tan, Kim Hua, additional, and Kumar, Ajay, additional
Published: 2021
Full Text: View/download PDF

15. Truck Departure Optimization from Distribution Center to Parcel Locker with Stochastic Demand Arrival

Author: Zeng, Wenjia, primary, Xia, Yang, additional, and Qi, Mingyao, additional
Published: 2021
Full Text: View/download PDF

16. Joint optimisation of drone routing and battery wear for sustainable supply chain development: a mixed-integer programming model based on blockchain-enabled fleet sharing

Author: Xia, Yang, primary, Zeng, Wenjia, additional, Xing, Xinjie, additional, Zhan, Yuanzhu, additional, Tan, Kim Hua, additional, and Kumar, Ajay, additional
Published: 2021
Full Text: View/download PDF

17. Method of Multispectral Image Denoising Based on Whole and Sub-Sparsity

Author: Zeng, Wenjia, primary and Zhang, Xinggan, additional
Published: 2021
Full Text: View/download PDF

18. Method for multispectral images denoising based on tensor-singular value decomposition

Author: Zeng, Wenjia, primary, Zhang, Xianggan, primary, and Bai, Yechao, primary
Published: 2017
Full Text: View/download PDF

19. Health Assessment of Urban Land Ecosystem: A Case Study in Chengdu

Author: Yin, Qi, primary, Chen, Wenkuan, primary, Zeng, Wenjia, primary, and Zhou, Ting, primary
Published: 2015
Full Text: View/download PDF

20. Chloride is required for receptor-mediated divalent cation entry in mesangial cells

Author: Kremer, Sidney G., primary, Zeng, Wenjia, additional, Hurst, Roger, additional, Ning, Terri, additional, Whiteside, Catharine, additional, and Skorecki, Karl L., additional
Published: 1995
Full Text: View/download PDF

21. Multiple signaling pathways for Cl- dependent depolarization of mesangial cells: role of Ca2+, PKC, and G proteins.

Author: KREMER, SIDNEY G., ZENG, WENJIA, SRIDHARA, SAMPATH, and SKORECKI, KARL L.
Published: 1992
