Search

Your search for author "Zheng, Siqi" returned 692 results

Search Results

1. Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

2. CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

3. FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

4. Accompanied Singing Voice Synthesis with Fully Text-controlled Melody

5. Intercity Connectivity and Innovation

6. Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers

7. Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision

8. Dense Outflowing Molecular Gas in Massive Star-forming Regions

9. ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency

10. ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec

11. AudioLCM: Text-to-Audio Generation with Latent Consistency Models

12. 3D-Speaker-Toolkit: An Open Source Toolkit for Multi-modal Speaker Verification and Diarization

13. Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models

14. An Embarrassingly Simple Approach for LLM with Strong ASR Capacity

19. Spatial distribution of NH2D in massive star-forming regions

20. Loss Masking Is Not Needed in Decoder-only Transformer for Discrete-token-based ASR

21. Tentative detection of cyanoformamide NCCONH2 in space

22. Sulphur isotopes toward Sagittarius B2 extended envelope in the Galactic Center

23. Mapping Observations of Peptide-like molecules around Sagittarius B2

24. LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

25. Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation

26. FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

27. UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language

28. Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision

29. Improving BERT with Hybrid Pooling Network and Drop Mask

30. 3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement

31. Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization

32. An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification

33. Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings

34. $^{18}$O/$^{17}$O abundance ratio toward a sample of massive star forming regions with parallax distances

35. Imaging Molecular Outflow in Massive Star-forming Regions with HNCO Lines

36. CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking

39. DopplerBAS: Binaural Audio Synthesis Addressing Doppler Effect

40. Contextual Expressive Text-to-Speech

41. Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

42. Pushing the limits of self-supervised speaker verification using regularized distillation framework

43. A Comparison of Reproducibility Guidelines and Its Implications on Undergraduate Statistical Education

44. Deep Representation Decomposition for Rate-Invariant Speaker Verification

45. PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification

46. The universality in urban commuting across and within cities

47. Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure

48. Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

49. Spatial distribution of HOCN around Sagittarius B2

50. Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios
