1. #Election2020: the first public Twitter dataset on the 2020 US Presidential election
- Author
- Emily Chen, Ashok Deb, and Emilio Ferrara
- Subjects
FOS: Computer and information sciences, Presidential election, media_common.quotation_subject, Twitter, 050801 communication & media studies, Transportation, Context (language use), ComputingMilieux_LEGALASPECTSOFCOMPUTING, Politics, 0508 media and communications, Artificial Intelligence, Political science, Voting, 050602 political science & public administration, Social media, media_common, Social and Information Networks (cs.SI), Presidential system, business.industry, 05 social sciences, Computer Science - Social and Information Networks, Public relations, Democracy, 0506 political science, Computational sociology, Social media analysis, business, Research Article
- Abstract
The integrity of democratic political discourse is essential to guaranteeing free and fair elections. With social media often dictating the tone and trends of politics-related discussion, it is of paramount importance to be able to study online chatter, especially in the run-up to important voting events such as the upcoming November 3, 2020 U.S. Presidential Election. Limited access to social media data is often the first barrier that slows progress and, ultimately, our understanding of online political discourse. To mitigate this issue and empower the Computational Social Science research community, we decided to publicly release a massive-scale, longitudinal dataset of U.S. politics- and election-related tweets. This multilingual dataset, which we have been collecting for over one year, encompasses hundreds of millions of tweets and tracks all salient U.S. political trends, actors, and events between 2019 and 2020. It predates and spans the whole period of the Republican and Democratic primaries, with real-time tracking of all presidential contenders on both sides of the aisle. After the primaries, it focuses on the presidential and vice-presidential candidates. Our dataset release is curated and documented, and will be updated on a weekly basis until the November 3, 2020 election and beyond. We hope that the academic community, computational journalists, and research practitioners alike will take advantage of our dataset to study relevant scientific and social issues, including misinformation, information manipulation, interference, and distortion of online political discourse, all of which have been prevalent in recent election events in the United States and worldwide. Our dataset is available at: https://github.com/echen102/us-pres-elections-2020
- Published
- 2021