Search

Your search for "Liu, Shujie" returned 1,180 results

Search Constraints

You searched for: Author "Liu, Shujie"; Publication Year Range: Last 10 years

Search Results

1. Autoregressive Speech Synthesis without Vector Quantization

2. VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

3. VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

4. TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

5. CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

6. RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

7. WavLLM: Towards Robust and Adaptive Speech Large Language Model

8. Advanced Long-Content Speech Recognition With Factorized Neural Transducer

12. Boosting Large Language Model for Speech Synthesis: An Empirical Study

13. COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

16. Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction

17. WavMark: Watermarking for Audio Generation

18. SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

19. On decoder-only architecture for speech-to-text and large language model integration

20. OpenNDD: Open Set Recognition for Neurodevelopmental Disorders Detection

21. Accelerating Transducers through Adjacent Token Merging

22. Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

23. VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

24. ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

25. Game Engine Technology in Cultural Heritage Digitization Application Prospect–Taking the Digital Cave of the Mogao Caves in China as an Example

26. Research on Hydrate Formation Risk in the Wellbore of Deepwater Dual-Source Co-production

27. Numerical Simulation of Breathing Effect Induced by Drilling in Deep-Water Shallow Formations

30. Code-Switching Text Generation and Injection in Mandarin-English ASR

31. Target Sound Extraction with Variable Cross-modality Clues

32. Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

33. Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

34. Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

35. BEATs: Audio Pre-Training with Acoustic Tokenizers

36. VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning

37. Exploring WavLM on Speech Enhancement

38. LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

39. LAMASSU: Streaming Language-Agnostic Multilingual Speech Recognition and Translation Using Neural Transducers

40. Two-Stream Network for Sign Language Recognition and Translation

41. Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation

42. SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training

43. SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

45. Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

46. The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task

47. Ultra Fast Speech Separation Model with Teacher Student Learning

48. Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

49. Speech Pre-training with Acoustic Piece
