Author: "Valentin Pelloin" / Topic: [info.info-lg]computer science [cs]/machine learning [cs.lg] - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Valentin Pelloin"' showing total 2 results

1. Using ASR-Generated Text for Spoken Language Modeling

Author: Nicolas Hervé, Valentin Pelloin, Benoit Favre, Franck Dary, Antoine Laurent, Sylvain Meignier, Laurent Besacier, Institut National de l'Audiovisuel (INA), Laboratoire d'Informatique de l'Université du Mans (LIUM), Le Mans Université (UM), Laboratoire d'Informatique et Systèmes (LIS), Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Centre National de la Recherche Scientifique (CNRS), Naver Labs Europe [Meylan], and ANR-19-CE23-0004,AISSPER,Intelligence artificielle pour la compréhension du langage parlé contrôlée sémantiquement(2019)
Subjects: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, [INFO]Computer Science [cs], [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Abstract: This paper aims to improve spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR to 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or by training an LM from scratch. The new models (FlauBERT-Oral) are shared with the community and are evaluated not only in terms of word prediction accuracy but also on two downstream tasks: classification of TV shows and syntactic parsing of speech. Experimental results show that FlauBERT-Oral performs better than the initial FlauBERT version, demonstrating that, despite its inherently noisy nature, ASR-generated text can be useful for improving spoken language modeling.
Published: 2022

2. End2End Acoustic to Semantic Transduction

Author: Renato De Mori, Antoine Laurent, Sylvain Meignier, Antoine Caubrière, Valentin Pelloin, Nathalie Camelin, Yannick Estève, Laboratoire d'Informatique de l'Université du Mans (LIUM), Le Mans Université (UM), Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, and McGill University = Université McGill [Montréal, Canada]
Subjects: FOS: Computer and information sciences, Sound (cs.SD), Computer science, Speech recognition, Feature extraction, Word error rate, Context (language use), 02 engineering and technology, Transduction (psychology), Semantics, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], Computer Science - Sound, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Reduction (complexity), 030507 speech-language pathology & audiology, 03 medical and health sciences, [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Audio and Speech Processing (eess.AS), 0202 electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Computation and Language, [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD], 020201 artificial intelligence & image processing, Language model, 0305 other medical science, Computation and Language (cs.CL), Spoken language, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic contents. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system reaches a 13.6 concept error rate (CER) and an 18.5 concept value error rate (CVER) on the French MEDIA corpus, an absolute reduction of 2.8 points compared to the state of the art. Then, an original model is proposed for hypothesizing concepts and their values. This transduction reaches a 15.4 CER and a 21.6 CVER without any new type of context.
Comment: Accepted at IEEE ICASSP 2021
Published: 2021
