Author: "Yicheng Hsu" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Yicheng Hsu" returned 7 results

1. Deep beamforming for speech enhancement and speaker localization with an array response-aware loss function

Author: Hsinyu Chang, Yicheng Hsu, and Mingsian R. Bai
Subjects: multichannel speech enhancement, speaker localization, loss function, deep learning, beamforming, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Recent research advances in deep neural network (DNN)-based beamformers have shown great promise for speech enhancement under adverse acoustic conditions. Different network architectures and input features have been explored in estimating beamforming weights. In this paper, we propose a deep beamformer based on an efficient convolutional recurrent network (CRN) trained with a novel ARray RespOnse-aWare (ARROW) loss function. The ARROW loss exploits the array responses of the target and interferer by using the ground truth relative transfer functions (RTFs). The DNN-based beamforming system, trained with ARROW loss through supervised learning, is able to perform speech enhancement and speaker localization jointly. Experimental results have shown that the proposed deep beamformer, trained with the linearly weighted scale-invariant source-to-noise ratio (SI-SNR) and ARROW loss functions, achieves superior performance in speech enhancement and speaker localization compared to two baselines.
Published: 2024
Full Text: View/download PDF

2. Learning-based robust speaker counting and separation with the aid of spatial coherence

Author: Yicheng Hsu and Mingsian R. Bai
Subjects: Multichannel blind source separation, Speaker counting and separation, Spatial coherence, Neural network, Acoustics. Sound, QC221-246, Electronic computers. Computer science, QA75.5-76.95
Abstract: A three-stage approach is proposed for speaker counting and speech separation in noisy and reverberant environments. In the spatial feature extraction stage, a spatial coherence matrix (SCM) is computed using whitened relative transfer functions (wRTFs) across time frames. The global activity functions of each speaker are estimated from a simplex constructed using the eigenvectors of the SCM, while the local coherence functions are computed from the coherence between the wRTFs of a time-frequency bin and the global activity function-weighted RTF of the target speaker. In the speaker counting stage, we use the eigenvalues of the SCM and the maximum similarity of the interframe global activity distributions between two speakers as the input features to the speaker counting network (SCnet). In the speaker separation stage, a global and local activity-driven network (GLADnet) is used to extract each independent speaker signal, which is particularly useful for highly overlapping speech signals. Experimental results obtained from real meeting recordings show that the proposed system achieves superior speaker counting and speaker separation performance compared to previous publications, without prior knowledge of the array configuration.
Published: 2023
Full Text: View/download PDF

3. Noninvasive Delineation of Glioma Infiltration with Combined 7T Chemical Exchange Saturation Transfer Imaging and MR Spectroscopy: A Diagnostic Accuracy Study

Author: Yifan Yuan, Yang Yu, Yu Guo, Yinghua Chu, Jun Chang, Yicheng Hsu, Patrick Alexander Liebig, Ji Xiong, Wenwen Yu, Danyang Feng, Baofeng Yang, Liang Chen, He Wang, Qi Yue, and Ying Mao
Subjects: CEST, MRS, FET-PET, ultra-high field MRI, glioma, Microbiology, QR1-502
Abstract: For precise delineation of glioma extent, amino acid PET is superior to conventional MR imaging. Since metabolic MR sequences such as chemical exchange saturation transfer (CEST) imaging and MR spectroscopy (MRS) were developed, we aimed to evaluate the diagnostic accuracy of combined CEST and MRS to predict glioma infiltration. Eighteen glioma patients of different tumor grades were enrolled in this study; 18F-fluoroethyltyrosine (FET)-PET, amide proton transfer (APT) CEST at 7 Tesla (T), MRS and conventional MR at 3T were conducted preoperatively. The modalities and their associations were evaluated using Pearson correlation analysis, patient-wise and voxel-wise. Both CEST (R = 0.736, p < 0.001) and MRS (R = 0.495, p = 0.037) correlated with FET-PET, while the correlation between CEST and MRS was weaker. In subgroup analysis, APT values were significantly higher in the high-grade glioma group (3.923 ± 1.239) and the IDH wildtype group (3.932 ± 1.264) than in the low-grade glioma group (3.317 ± 0.868, p < 0.001) or the IDH mutant group (3.358 ± 0.847, p < 0.001). Using high FET uptake as the standard, the CEST/MRS combination (AUC, 95% CI: 0.910, 0.907–0.913) predicted tumor infiltration better than CEST (0.812, 0.808–0.815) or MRS (0.888, 0.885–0.891) alone, consistent with contrast-enhancing and T2-hyperintense areas. Probability maps of tumor presence constructed from the CEST/MRS combination were preliminarily verified by multi-region biopsies. The combination of 7T CEST/MRS might serve as a promising non-radioactive alternative to delineate glioma infiltration, thus reshaping the guidance for tumor resection and irradiation.
Published: 2022
Full Text: View/download PDF

4. Model-Matching Principle Applied to the Design of an Array-Based All-Neural Binaural Rendering System for Audio Telepresence.

Author: Yicheng Hsu, Chenghung Ma, and Mingsian R. Bai
Published: 2023
Full Text: View/download PDF

5. Acoustic Echo Suppression Using a Learning-Based Multi-Frame Minimum Variance Distortionless Response (MFMVDR) Filter.

Author: Yuefeng Tsai, Yicheng Hsu, and Mingsian R. Bai
Published: 2022
Full Text: View/download PDF

6. Learning-Based Personal Speech Enhancement for Teleconferencing by Exploiting Spatial-Spectral Features.

Author: Yicheng Hsu, Yonghan Lee, and Mingsian R. Bai
Published: 2022
Full Text: View/download PDF

7. Model-matching Principle Applied to the Design of an Array-based All-neural Binaural Rendering System for Audio Telepresence

Author: Yicheng Hsu, Chenghung Ma, and Mingsian R. Bai
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Telepresence aims to create an immersive but virtual experience of the audio and visual scene at the far end for users at the near end. In this contribution, we propose an array-based binaural rendering system that converts the array microphone signals into the head-related transfer function (HRTF) filtered output signals for headphone rendering. The proposed approach is formulated in light of a model-matching principle (MMP) and is capable of delivering a more immersive experience than the conventional localization-beamforming-HRTF filtering (LBH) approach. The MMP-based rendering system can be realized via multichannel inverse filtering (MIF) and multichannel deep filtering (MDF). In this study, we adopted the MDF approach and used the LBH as well as MIF as the baselines. The all-neural system jointly captures the spatial information (spatial rendering), preserves ambient sound (enhancement), and reduces noise (enhancement) before generating binaural outputs. Objective and subjective tests are employed to compare the proposed telepresence system with two baselines. Accepted by ICASSP 2023.
Published: 2022
