Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Authors :
Yang, Chih-Kai
Fu, Yu-Kuan
Li, Chen-An
Lin, Yi-Cheng
Lin, Yu-Xiang
Chen, Wei-Chih
Chung, Ho Lam
Kuan, Chun-Yi
Huang, Wei-Ping
Lu, Ke-Han
Lin, Tzu-Quan
Wang, Hsiu-Hsuan
Hu, En-Pei
Hsu, Chan-Jan
Tseng, Liang-Hsuan
Chiu, I-Hsiang
Sanga, Ulin
Chen, Xuanjun
Hsu, Po-chun
Yang, Shu-wen
Lee, Hung-yi
Publication Year :
2024

Abstract

This technical report presents our initial attempt to build a spoken large language model (LLM) for Taiwanese Mandarin, specifically tailored to enable real-time, speech-to-speech interaction in multi-turn conversations. Our end-to-end model incorporates a decoder-only transformer architecture and aims to achieve seamless interaction while preserving the conversational flow, including full-duplex capabilities allowing simultaneous speaking and listening. The paper also details the training process, including data preparation with synthesized dialogues and adjustments for real-time interaction. We also developed a platform to evaluate conversational fluency and response coherence in multi-turn dialogues. We hope the release of the report can contribute to the future development of spoken LLMs in Taiwanese Mandarin.

Comment: Work in progress

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2411.07111
Document Type :
Working Paper