Search keyword "Guo, Yiwei": 23 results in total

Search Constraints

Author: "Guo, Yiwei"; Publication Type: Reports

Search Results

1. Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding

2. LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

3. TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

4. vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders

5. DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation

6. On the Effectiveness of Acoustic BPE in Decoder-Only TTS

7. Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech

8. StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

9. The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge

10. VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

11. SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention

12. Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations

13. Acoustic BPE for Speech Generation with Discrete Tokens

14. Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition

15. VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching

16. DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech

17. UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

18. Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge

19. DiffVoice: Text-to-Speech with Latent Diffusion

20. EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance

21. VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature

22. Unsupervised word-level prosody tagging for controllable speech synthesis

23. GlobalWalk: Learning Global-aware Node Embeddings via Biased Sampling
