Analyzing Gender Translation Errors to Identify Information Flows between the Encoder and Decoder of a NMT System

Authors :: Wisniewski, Guillaume
Zhu, Lichao
Ballier, Nicolas
Yvon, François
Laboratoire de Linguistique Formelle (LLF - UMR7110)
Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité)
Centre de Linguistique Inter-langues, de Lexicologie, de Linguistique Anglaise et de Corpus (CLILLAC-ARP (URP_3967))
Université Paris Cité (UPCité)
Traitement du Langage Parlé (TLP)
Laboratoire Interdisciplinaire des Sciences du Numérique (LISN)
Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
Sciences et Technologies des Langues (STL)
Source :: Actes de BlackboxNLP 2022, Dec 2022, Abu Dhabi, United Arab Emirates
Publication Year :: 2022
Publisher :: HAL CCSD, 2022.
Abstract: Multiple studies have shown that existing NMT systems demonstrate some kind of "gender bias". As a result, MT output appears to err more often for feminine forms and to amplify social gender misrepresentations, which is potentially harmful to users and practitioners of these technologies. This paper continues this line of investigation and reports results obtained with a new test set in strictly controlled conditions. This setting allows us to better understand the multiple inner mechanisms that cause these biases, which include the linguistic expressions of gender, the unbalanced distribution of masculine and feminine forms in the language, the modelling of morphological variation, and the training process dynamics. To counterbalance these effects, we formulate several proposals and notably show that modifying the training loss can effectively mitigate such biases.

Subjects :: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]

Details

Language :: English
Database :: OpenAIRE
Journal :: Actes de BlackboxNLP 2022, Dec 2022, Abu Dhabi, United Arab Emirates
Accession number :: edsair.od.......165..dd90365b94559c049e022785c3793eb2

