Search

Author search for "Li, Jinyu": 2,602 results in total

Search Results

1. Target word activity detector: An approach to obtain ASR word boundaries without lexicon

2. Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation

3. Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech

4. Autoregressive Speech Synthesis without Vector Quantization

5. VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

6. Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

7. An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS

8. VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers

9. Total-Duration-Aware Duration Modeling for Text-to-Speech Systems

10. TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

11. CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

12. RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

13. WavLLM: Towards Robust and Adaptive Speech Large Language Model

14. Advanced Long-Content Speech Recognition With Factorized Neural Transducer

15. NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

17. Boosting Large Language Model for Speech Synthesis: An Empirical Study

18. Future Intelligent Data link and Unit-Level Combat System Based on Global Combat Cloud

19. COSMIC: Data Efficient Instruction-tuning For Speech In-Context Learning

22. Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

23. RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments

24. Enhanced Edge-Perceptual Guided Image Filtering

25. Improving Stability in Simultaneous Speech Translation: A Revision-Controllable Decoding Approach

26. ResidualTransformer: Residual Low-Rank Learning with Weight-Sharing for Transformer Layers

27. t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

28. DiariST: Streaming Speech Translation with Speaker Diarization

29. SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

30. Deepsea: A Meta-ocean Prototype for Undersea Exploration

31. Pre-training End-to-end ASR Models with Augmented Speech Samples Queried by Text

32. On decoder-only architecture for speech-to-text and large language model integration

33. Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments

34. Accelerating Transducers through Adjacent Token Merging

35. Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

36. Accurate and Structured Pruning for Efficient Automatic Speech Recognition

37. VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

38. PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds

48. Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

49. Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

50. Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation
