Author: "Yang, Peiji" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Yang, Peiji"' showing total 21 results

1. Optimizing Neural Speech Codec for Low-Bitrate Compression via Multi-Scale Encoding

Author: Yang, Peiji, Wang, Fengping, Zhong, Yicheng, Wei, Huawei, and Wang, Zhisheng
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Neural speech codecs have demonstrated their ability to compress high-quality speech and audio by converting them into discrete token representations. Most existing methods utilize Residual Vector Quantization (RVQ) to encode speech into multiple layers of discrete codes with uniform time scales. However, this strategy overlooks the differences in information density across various speech features, leading to redundant encoding of sparse information, which limits the performance of these methods at low bitrate. This paper proposes MsCodec, a novel multi-scale neural speech codec that encodes speech into multiple layers of discrete codes, each corresponding to a different time scale. This encourages the model to decouple speech features according to their diverse information densities, consequently enhancing the performance of speech compression. Furthermore, we incorporate mutual information loss to augment the diversity among speech codes across different layers. Experimental results indicate that our proposed method significantly improves codec performance at low bitrate.
Published: 2024

2. Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models

Author: Li, Weiqin, Yang, Peiji, Zhong, Yicheng, Zhou, Yixuan, Wang, Zhisheng, Wu, Zhiyong, Wu, Xixin, and Meng, Helen
Subjects: Computer Science - Sound, Computer Science - Computation and Language, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Spontaneous style speech synthesis, which aims to generate human-like speech, often encounters challenges due to the scarcity of high-quality data and limitations in model capabilities. Recent language model-based TTS systems can be trained on large, diverse, and low-quality speech datasets, resulting in highly natural synthesized speech. However, they are limited by the difficulty of simulating various spontaneous behaviors and capturing prosody variations in spontaneous speech. In this paper, we propose a novel spontaneous speech synthesis system based on language models. We systematically categorize and uniformly model diverse spontaneous behaviors. Moreover, fine-grained prosody modeling is introduced to enhance the model's ability to capture subtle prosody variations in spontaneous speech. Experimental results show that our proposed method significantly outperforms the baseline methods in terms of prosody naturalness and spontaneous behavior naturalness.
Comment: Accepted by INTERSPEECH 2024
Published: 2024

3. ExpCLIP: Bridging Text and Facial Expressions via Semantic Alignment

Author: Zhong, Yicheng, Wei, Huawei, Yang, Peiji, and Wang, Zhisheng
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: The objective of stylized speech-driven facial animation is to create animations that encapsulate specific emotional expressions. Existing methods often depend on pre-established emotional labels or facial expression templates, which may limit the necessary flexibility for accurately conveying user intent. In this research, we introduce a technique that enables the control of arbitrary styles by leveraging natural language as emotion prompts. This technique presents benefits in terms of both flexibility and user-friendliness. To realize this objective, we initially construct a Text-Expression Alignment Dataset (TEAD), wherein each facial expression is paired with several prompt-like descriptions. We propose an innovative automatic annotation method, supported by Large Language Models (LLMs), to expedite the dataset construction, thereby eliminating the substantial expense of manual annotation. Following this, we utilize TEAD to train a CLIP-based model, termed ExpCLIP, which encodes text and facial expressions into semantically aligned style embeddings. The embeddings are subsequently integrated into the facial animation generator to yield expressive and controllable facial animations. Given the limited diversity of facial emotions in existing speech-driven facial animation training data, we further introduce an effective Expression Prompt Augmentation (EPA) mechanism to enable the animation generator to support unprecedented richness in style control. Comprehensive experiments illustrate that our method accomplishes expressive facial animation generation and offers enhanced flexibility in effectively conveying the desired style.
Published: 2023

4. AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents

Author: Zhang, Yongmao, Wang, Zhichao, Yang, Peiji, Sun, Hongshen, Wang, Zhisheng, and Xie, Lei
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Learning accent from crowd-sourced data is a feasible way to achieve a target speaker TTS system that can synthesize accent speech. To this end, there are two challenging problems to be solved. First, direct use of the poor acoustic quality crowd-sourced data and the target speaker data in accent transfer will apparently lead to synthetic speech with degraded quality. To mitigate this problem, we take a bottleneck feature (BN) based TTS approach, in which TTS is decomposed into a Text-to-BN (T2BN) module to learn accent and a BN-to-Mel (BN2Mel) module to learn speaker timbre, where the neural network based BN feature serves as an intermediate representation that is robust to noise interference. Second, directly training T2BN using the crowd-sourced data in the two-stage system will produce accent speech of the target speaker with poor prosody. This is because the crowd-sourced recordings are contributed by ordinary, non-professional speakers. To tackle this problem, we update the two-stage approach to a novel three-stage approach, where T2BN and BN2Mel are trained using the high-quality target speaker data and a new BN-to-BN (BN2BN) module is plugged in between the two modules to perform accent transfer. To train the BN2BN module, the parallel unaccented and accented BN features are obtained by a proposed data augmentation procedure. Finally, the proposed three-stage approach manages to produce accent speech for the target speaker with good prosody, as the prosody pattern is inherited from the professional target speaker and accent transfer is achieved by the BN2BN module at the same time. The proposed approach, named AccentSpeech, is validated in a Mandarin TTS accent transfer task.
Comment: Accepted by ISCSLP2022
Published: 2022

5. Wind disturbance-based tomato seedlings growth control

Author: Yang, Peiji, Hao, Jie, Li, Zhiguo, Tchuenbou-Magaia, Fideline, and Ni, Jiheng
Published: 2024
Full Text: View/download PDF

6. Machine learning for polyphenol-based materials

Author: Jiang, Shengxi, Yang, Peiji, Zheng, Yujia, Lu, Xiong, and Xie, Chaoming
Published: 2024
Full Text: View/download PDF

7. Cross-lingual Text Classification with Heterogeneous Graph Neural Network

Author: Wang, Ziyun, Liu, Xuan, Yang, Peiji, Liu, Shixing, and Wang, Zhisheng
Subjects: Computer Science - Computation and Language
Abstract: Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages, which is very useful for low-resource languages. Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks, but rarely consider factors beyond semantic similarity, causing performance degradation between some language pairs. In this paper we propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification using graph convolutional networks (GCN). In particular, we construct a heterogeneous graph by treating documents and words as nodes, and linking nodes with different relations, which include part-of-speech roles, semantic similarity, and document translations. Extensive experiments show that our graph-based method significantly outperforms state-of-the-art models on all tasks, and also achieves consistent performance gain over baselines in low-resource settings where external tools like translators are unavailable.
Comment: Accepted by ACL 2021 (short paper)
Published: 2021

8. Biomechanical response of the above-ground organs in tomato seedling at different age levels under wind-flow disturbance

Author: Liu, Zhengguang, Yang, Peiji, Fadiji, Tobi, Li, Zhiguo, and Ni, Jiheng
Published: 2023
Full Text: View/download PDF

9. Study on the Evolutionary Characteristics of Post-Fire Forest Recovery Using Unmanned Aerial Vehicle Imagery and Deep Learning: A Case Study of Jinyun Mountain in Chongqing, China.

Author: Zhu, Deli and Yang, Peiji
Abstract: Forest fires pose a significant threat to forest ecosystems, with severe impacts on both the environment and human society. Understanding the post-fire recovery processes of forests is crucial for developing strategies for species diversity conservation and ecological restoration and preventing further damage. The present study proposes applying the EAswin-Mask2former model based on semantic segmentation in deep learning using visible light band data to better monitor the evolution of burn areas in forests after fires. This model is an improvement of the classical semantic segmentation model Mask2former and can better adapt to the complex environment of burned forest areas. This model employs Swin-Transformer as the backbone for feature extraction, which is particularly advantageous for processing high-resolution images. It also includes the Contextual Transformer (CoT) Block to better capture contextual information and incorporates the Efficient Multi-Scale Attention (EMA) Block into the Efficiently Adaptive (EA) Block to enhance the model's ability to learn key features and long-range dependencies. The experimental results demonstrate that the EAswin-Mask2former model can achieve a mean Intersection-over-Union (mIoU) of 76.35% in segmenting complex forest burn areas across different seasons, representing improvements of 3.26 and 0.58 percentage points over the Mask2former models using ResNet and Swin-Transformer backbones, respectively. Moreover, this method surpasses the performance of the DeepLabV3+ and Segformer models by 4.04 and 1.75 percentage points, respectively. Ultimately, the proposed model offers excellent segmentation performance for both forest and burn areas and can effectively track the evolution of burned forests when combined with unmanned aerial vehicle (UAV) remote sensing images. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. Experimental research on the tilting pad bearing under the high temperature of inlet oil

Author: Yang, Peiji, Yuan, Qi, and Chen, Runlin
Published: 2018
Full Text: View/download PDF

11. Model and Algorithm for a Rotor-Bearing System Considering Journal Misalignment.

Author: Zhao, Zhiming, Ma, Junjie, Liu, Qi, and Yang, Peiji
Subjects: FLUID pressure, JOURNAL bearings, REYNOLDS equations, ALGORITHMS, ANGLES, ORBITS (Astronomy)
Abstract: Disturbances caused by the misalignment and axial motion of the journal affect the characteristics of the rotor-bearing system. This paper aims to propose an algorithm for the theoretical analysis of a rotor-bearing system that considers these disturbances. A theoretical model for a journal bearing considering disturbances is given. The dynamic equations for a rigid rotor-bearing system are introduced. A detailed algorithm that can simultaneously solve the rotor-dynamic equations and the Reynolds equation is proposed. Static performance characteristics, such as the bearing attitude angle and the fluid film pressure, are given, and dynamic characteristics, such as the nonlinear dynamic responses and the axial orbits of a rigid rotor-bearing system, are presented. The hydrodynamic effect of the bearing is enhanced by the axial disturbance. Disturbances in the circumferential and radial directions lead to variations in the fluid film thickness distribution in the axial direction and the offset of the fluid film pressure distribution in the axial direction. When these disturbances work together, the variation trend is more obvious and affects the capacity and dynamic characteristics of the bearing. When the L/D value of the bearing increases, the clearance between the journal and the bearing decreases rapidly. When the value reaches a certain limit, contact and collision might occur. The theoretical analysis method and the algorithm proposed for a rotor-bearing system considering several disturbances could enhance the design level for a bearing and rotor-bearing system. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents

Author: Zhang, Yongmao, primary, Wang, Zhichao, additional, Yang, Peiji, additional, Sun, Hongshen, additional, Wang, Zhisheng, additional, and Xie, Lei, additional
Published: 2022
Full Text: View/download PDF

13. Design of Walking and Dressing Aid Device: A Static Analysis Approach

Author: Hu, Xing, primary, Liu, Yikun, additional, Zhong, Dingjun, additional, and Yang, Peiji, additional
Published: 2021
Full Text: View/download PDF

14. Experimental Study on Rheological Disturbance Effect of Gas-Bearing Coal Rock

Author: Wang, Bo, primary, Huang, Zikang, additional, Hu, Shiyu, additional, Lu, Changliang, additional, Yang, Peiji, additional, Wang, Ling, additional, and Huang, Wanpeng, additional
Published: 2021
Full Text: View/download PDF

15. Cross-lingual Text Classification with Heterogeneous Graph Neural Network

Author: Wang, Ziyun, primary, Liu, Xuan, additional, Yang, Peiji, additional, Liu, Shixing, additional, and Wang, Zhisheng, additional
Published: 2021
Full Text: View/download PDF

16. Analysis of the Impacts of Bearing on Vibration Characteristics of Rotor

Author: Yang, Peiji, primary, Yuan, Qi, additional, Huang, Chao, additional, Zhou, Yafeng, additional, Li, Hongliang, additional, and Zhou, Yu, additional
Published: 2017
Full Text: View/download PDF

17. Characteristics and origin of the Middle Proterozoic Dongshuichang chambersite deposit, Jixian, Tianjin, China

Author: Fan, Delian, primary, Yang, Peiji, additional, and Wang, Rao, additional
Published: 1999
Full Text: View/download PDF

18. Introduction to and classification of manganese deposits of China

Author: Fan, Delian, primary and Yang, Peiji, additional
Published: 1999
Full Text: View/download PDF

19. Characteristics of manganese ore deposits in China

Author: Ye Lianjun, Fan Delian, and Yang Peiji
Subjects: Supergene (geology), Proterozoic, Volcanogenic massive sulfide ore deposit, Geochemistry, Geology, Manganese, Sedimentary exhalative deposits, Chemistry, Geochemistry and Petrology, Carbonate rock, Economic Geology, Sedimentary rock, Oil shale
Abstract: Manganese ore deposits are widely distributed in China. Based on features of regional tectonics and sedimentary environments as well as genetic characteristics, seven important metallogenic provinces can be classified. The commercial manganese ore deposits are distributed mainly on platforms and their marginal areas, especially those of relatively mobile platforms. The main metallogenic epochs of manganese ore deposits of China are Sinian and Devonian, in contrast with the famous Proterozoic, Cretaceous and Oligocene manganese deposits elsewhere in the world. Based on their modes of origin, manganese ore deposits in China can be divided into six types. Among them sedimentary and supergene deposits are the main types accounting for commercial manganese deposits. According to the lithological association of the ore bearing series, sedimentary manganese ore deposits can be classified into: (1) mud-rock type; (2) “black shale” type; (3) carbonate rock type. In addition, eight element association types can also be identified, among which BMn, SMn, CaMgMn and PMn types are quite special for manganese ore deposits in China.
Published: 1988
Full Text: View/download PDF

20. A discussion on origin of the polymetal sulphide ores in the Devonian of Central Hunan Basin.

Author: Hou Kui, Chen Zhiming, Liu Guoliang, and Yang Peiji
Abstract: Polymetallic sulphide ores occur in the stromatopora limestone of the Qizigiao formation. The concentric ring structure of the ores is described and the origin of the lead-zinc sulphides and pyrite deposits are discussed. A model of mineralisation is suggested.

21. Characteristics of manganese ore deposits in China.

Author: Ye Lianjun, Fan Delian, and Yang Peiji
Abstract: The main metallogenic epochs of manganese ore deposits in China are Sinian and Devonian. Based on their modes of origin, manganese ore deposits in China can be divided into six types: sedimentary; volcanic-sedimentary; metamorphosed-sedimentary; hydrothermally modified sedimentary; hydrothermal; and supergene deposits. Sedimentary and supergene deposits are the main types accounting for commercial manganese deposits.
