Streaming Social Media Data Analysis for Events Extraction and Warehousing using Hadoop and Storm: Drug Abuse Case Study
- Source :
- KES
- Publication Year :
- 2019
- Publisher :
- Elsevier BV, 2019.
Abstract
- In the age of big data, enterprises’ information systems are flooded with data generated from social media, which raises the need to integrate these data into their business intelligence process for better decision making. However, these new data are streaming, voluminous, unstructured, and varied, and they bring existing data warehousing systems and integration tools to their knees, which motivated this research. In this paper, we propose a large-scale system based on distributed storage and parallel processing for social media data warehousing. Specifically, we combine Storm and Hadoop to extract structured events from social media data and integrate them into the data warehouse. We take advantage of the real-time analysis of streaming data offered by Storm and the batch processing of large data volumes offered by Hadoop, which together facilitate the analysis of streaming social media data. For conceptual representation, we propose a customized multidimensional model in which we add an intermediate table to connect the social media data warehouse with the enterprise data warehouse. We implement it using Oracle 12c and feed it with events extracted from 1,000,000 tweets using the Pentaho Data Integration tool.
- Subjects :
- Computer science
Big data
020206 networking & telecommunications
02 engineering and technology
Data science
Data warehouse
Oracle
Distributed data store
Business intelligence
0202 electrical engineering, electronic engineering, information engineering
Information system
General Earth and Planetary Sciences
Table (database)
020201 artificial intelligence & image processing
Social media
General Environmental Science
Data integration
Details
- ISSN :
- 18770509
- Volume :
- 159
- Database :
- OpenAIRE
- Journal :
- Procedia Computer Science
- Accession number :
- edsair.doi...........f4aaf2e872d838c25b3b6f579edf7e0f