Impoverished Language Technology: The Lack of (Social) Class in NLP

Authors :
Curry, Amanda Cercas
Talat, Zeerak
Hovy, Dirk
Publication Year :
2024

Abstract

Since Labov's (1964) foundational work on the social stratification of language, linguistics has dedicated concerted efforts towards understanding the relationships between socio-demographic factors and language production and perception. Despite the large body of evidence identifying significant relationships between socio-demographic factors and language production, relatively few of these factors have been investigated in the context of NLP technology. While age and gender are well covered, Labov's initial target, socio-economic class, is largely absent. We survey the existing Natural Language Processing (NLP) literature and find that only 20 papers even mention socio-economic status. However, the majority of those papers do not engage with class beyond collecting information on annotator demographics. Given this research lacuna, we provide a definition of class that can be operationalised by NLP researchers, and argue for including socio-economic class in future language technologies.

Comment: Accepted to LREC-COLING 2024

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2403.03874
Document Type :
Working Paper