Author: "Doan, Khai" / Topic: computer science - computation and language - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Doan, Khai"' showing total 3 results

Start Over Author "Doan, Khai" Topic computer science - computation and language

3 results on '"Doan, Khai"'

1. Towards Zero-Shot Text-To-Speech for Arabic Dialects

Author: Doan, Khai Duy, Waheed, Abdul, and Abdul-Mageed, Muhammad
Subjects: Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Zero-shot multi-speaker text-to-speech (ZS-TTS) systems have advanced for English, however, it still lags behind due to insufficient resources. We address this gap for Arabic, a language of more than 450 million native speakers, by first adapting a sizeable existing dataset to suit the needs of speech synthesis. Additionally, we employ a set of Arabic dialect identification models to explore the impact of pre-defined dialect labels on improving the ZS-TTS model in a multi-dialect setting. Subsequently, we fine-tune the XTTS\footnote{https://docs.coqui.ai/en/latest/models/xtts.html}\footnote{https://medium.com/machine-learns/xtts-v2-new-version-of-the-open-source-text-to-speech-model-af73914db81f}\footnote{https://medium.com/@erogol/xtts-v1-techincal-notes-eb83ff05bdc} model, an open-source architecture. We then evaluate our models on a dataset comprising 31 unseen speakers and an in-house dialectal dataset. Our automated and human evaluation results show convincing performance while capable of generating dialectal speech. Our study highlights significant potential for improvements in this emerging area of research in Arabic.
Published: 2024

2. Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability

Author: Chen, Wei-Rui, Adebara, Ife, Doan, Khai Duy, Liao, Qisheng, and Abdul-Mageed, Muhammad
Subjects: Computer Science - Computation and Language
Abstract: ChatGPT has recently emerged as a powerful NLP tool that can carry out a variety of tasks. However, the range of languages ChatGPT can handle remains largely a mystery. To uncover which languages ChatGPT `knows', we investigate its language identification (LID) abilities. For this purpose, we compile Babel-670, a benchmark comprising 670 languages representing 24 language families spoken in five continents. Languages in Babel-670 run the gamut from the very high-resource to the very low-resource. We then study ChatGPT's (both GPT-3.5 and GPT-4) ability to (i) identify language names and language codes (ii) under zero- and few-shot conditions (iii) with and without provision of a label set. When compared to smaller finetuned LID tools, we find that ChatGPT lags behind. For example, it has poor performance on African languages. We conclude that current large language models would benefit from further development before they can sufficiently serve diverse communities., Comment: Accepted to NAACL 2024 Findings
Published: 2023

3. The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Author: Zhang, Chiyu, Doan, Khai Duy, Liao, Qisheng, and Abdul-Mageed, Muhammad
Subjects: Computer Science - Computation and Language
Abstract: Instruction tuned large language models (LLMs), such as ChatGPT, demonstrate remarkable performance in a wide range of tasks. Despite numerous recent studies that examine the performance of instruction-tuned LLMs on various NLP benchmarks, there remains a lack of comprehensive investigation into their ability to understand cross-lingual sociopragmatic meaning (SM), i.e., meaning embedded within social and interactive contexts. This deficiency arises partly from SM not being adequately represented in any of the existing benchmarks. To address this gap, we present SPARROW, an extensive multilingual benchmark specifically designed for SM understanding. SPARROW comprises 169 datasets covering 13 task types across six primary categories (e.g., anti-social language detection, emotion recognition). SPARROW datasets encompass 64 different languages originating from 12 language families representing 16 writing scripts. We evaluate the performance of various multilingual pretrained language models (e.g., mT5) and instruction-tuned LLMs (e.g., BLOOMZ, ChatGPT) on SPARROW through fine-tuning, zero-shot, and/or few-shot learning. Our comprehensive analysis reveals that existing open-source instruction tuned LLMs still struggle to understand SM across various languages, performing close to a random baseline in some cases. We also find that although ChatGPT outperforms many LLMs, it still falls behind task-specific finetuned models with a gap of 12.19 SPARROW score. Our benchmark is available at: https://github.com/UBC-NLP/SPARROW, Comment: Accepted by EMNLP 2023 Main conference
Published: 2023

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

3 results on '"Doan, Khai"'

1. Towards Zero-Shot Text-To-Speech for Arabic Dialects

2. Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability

3. The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Publication Type

Database

3 results on '"Doan, Khai"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources