Amélioration des modèles de repli par des sacs de mots et des n-grammes à variables

Authors :: Rubino, Raphaël
Lecouteux, Benjamin
Linares, Georges
Laboratoire Informatique d'Avignon (LIA)
Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI
Laboratoire d'Informatique de Grenoble (LIG)
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole (GETALP)
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
Université Grenoble Alpes [2016-2019] (UGA [2016-2019])
LIG
Source :: [Research report] LIG. 2016
Publication Year :: 2016
Publisher :: HAL CCSD, 2016.
Abstract: (English title: Improving back-off models with bag of words and hollow-grams.) Classical n-gram models lack robustness on unseen events. The literature suggests several smoothing methods; empirically, the most effective and most widely used of these is modified Kneser-Ney. We propose to improve this back-off model by reordering the back-off candidates according to the mutual information carried by the words, and by introducing a new hollow-gram model (n-grams with variables). Our back-off model yields significant improvements over a baseline based on modified Kneser-Ney back-off: a 0.6% absolute word error rate improvement without acoustic model adaptation, and 0.4% after adaptation.
Keywords: language models, back-off models.

Subjects :: language model
back-off
low-order interpolation
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]

Details

Language :: French
Database :: OpenAIRE
Journal :: [Research report] LIG. 2016
Accession number :: edsair.dedup.wf.001..269f618c35a759106037a7d758f116ba
