Where Did the Political News Event Happen? Primary Focus Location Extraction in Different Languages

Authors :
Bhavani Thuraisingham
Latifur Khan
Maryam Bahojb Imani
Source :
CIC
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Political news reports are published all over the world in various languages. Automatically detecting the geolocation of these reports is of great value for better understanding the associated events. Although various open-source and commercial tools exist to identify geolocation, they fail to identify locations at a granular level, such as locality or city, and they do not support most languages. Most techniques treat the problem as Named Entity Recognition (NER) and identify geolocation information only at the country level for a given text. In this paper, we consider English, Spanish and Arabic news articles from different publishers. We define the primary focus location as the actual location where the event occurred, as distinct from the other focus locations mentioned in the report. Our aim is to extract the primary focus location, regardless of language, from articles belonging to different news agencies. We propose a mechanism that uses NER to identify candidate sentences containing focus locations. We then compute sentence embeddings over words from different languages and employ a supervised classifier to predict the primary focus location. We also apply a suitable adaptation mechanism to correct sampling bias in the training data. Our method trains a classifier on bias-corrected training data from news articles published by one agency in one language, and tests the model on news articles published by another agency in a different language. Our empirical results show superior performance over baseline approaches on real-world English, Spanish and Arabic news articles.
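
The first step described in the abstract, selecting candidate sentences that mention a location, might look roughly like the sketch below. This is an illustration only, not the authors' implementation: the abstract says NER is used, whereas this sketch substitutes a tiny hypothetical gazetteer (`GAZETTEER`) and a naive sentence splitter so that it runs with the Python standard library alone. The function names and the sample article are assumptions made for this example.

```python
import re

# Hypothetical toy gazetteer standing in for a real NER model (assumption).
GAZETTEER = {"Cairo", "Madrid", "Paris", "Egypt"}

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def candidate_sentences(text, gazetteer=GAZETTEER):
    # Keep only sentences that mention at least one known location,
    # paired with the locations found in them.
    candidates = []
    for sentence in split_sentences(text):
        tokens = re.findall(r"\w+", sentence)
        found = [t for t in tokens if t in gazetteer]
        if found:
            candidates.append((sentence, found))
    return candidates

# Invented sample text, for illustration only.
article = (
    "Protesters gathered in Cairo on Friday. "
    "The minister had returned from Madrid a day earlier. "
    "Officials declined to comment."
)
result = candidate_sentences(article)
for sentence, locations in result:
    print(locations, "->", sentence)
```

In the approach the abstract outlines, the candidate sentences would then be embedded and passed to a supervised classifier that decides which mentioned location is the primary focus location; those later steps are not shown here.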

Details

Database :
OpenAIRE
Journal :
2019 IEEE 5th International Conference on Collaboration and Internet Computing (CIC)
Accession number :
edsair.doi...........79cdecaa8abdc4c806a2ab463f492e08
Full Text :
https://doi.org/10.1109/cic48465.2019.00017