Your search keyword '"Liu, Shujie"' showing total 1,309 results

Search Results

1. Autoregressive Speech Synthesis without Vector Quantization

2. VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

3. VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

4. TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

5. CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

6. RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

7. WavLLM: Towards Robust and Adaptive Speech Large Language Model

8. Advanced Long-Content Speech Recognition With Factorized Neural Transducer

9. Boosting Large Language Model for Speech Synthesis: An Empirical Study

10. COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

12. Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction

13. WavMark: Watermarking for Audio Generation

14. SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

19. On decoder-only architecture for speech-to-text and large language model integration

20. OpenNDD: Open Set Recognition for Neurodevelopmental Disorders Detection

21. Accelerating Transducers through Adjacent Token Merging

22. Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

23. VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

24. ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

25. Code-Switching Text Generation and Injection in Mandarin-English ASR

26. Target Sound Extraction with Variable Cross-modality Clues

27. Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

28. Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

29. Game Engine Technology in Cultural Heritage Digitization Application Prospect–Taking the Digital Cave of the Mogao Caves in China as an Example

30. Research on Hydrate Formation Risk in the Wellbore of Deepwater Dual-Source Co-production

31. Numerical Simulation of Breathing Effect Induced by Drilling in Deep-Water Shallow Formations

34. Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

35. BEATs: Audio Pre-Training with Acoustic Tokenizers

36. VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

37. Exploring WavLM on Speech Enhancement

38. LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

39. LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

40. Two-Stream Network for Sign Language Recognition and Translation

42. Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation

43. SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training

44. SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

45. Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

46. The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task

47. Ultra Fast Speech Separation Model with Teacher Student Learning

48. Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

49. Speech Pre-training with Acoustic Piece

50. Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data
