1. D3.2 - First version of the framework for the collection, cleaning, integration & anonymization of big data
- Author
- Gilman, Ekaterina; Kostakos, Panos; Cortés, Marta; Mehmood, Hassan; Riekki, Jukka; Byrne, Andrew; Valta, Katerina; Tekes, Stavros; Kumar, Chandan; Sun, Jun; Staab, Steffen; Pragidis, Ioannis; Tsintzos, Panagiotis; Geronikolaou, Georgios; Filareti Tsalakanidou; Chantas, Ioannis; Papastergios, Georgios; Tzoumaka, Paraskevi; Kelessis, Apostolos; Papafilis, Petros; Christantonis Charistes; Kontos, Vasileios; Mimilidis, Manolis; Chatzis, Charalampos; Doudouliakis, Konstantinos; Yiannis Kompatsiaris; Erkazancı, Alperen; Özkan, Hafize İlhan; M. Serdar Yümlü; Gültekin, Habib; Tosunoğlu, Caner; Beeckman, Rebecca; Looveren, Ronny Van; Leroux, Philip; O'Suilleabhain, Darragh; Daly, Maire; Walsh, Elaine; O'Reilly, Anthony; Thornton, Kieran; and O'Brien, Noreen
- Subjects
Big data, social data, data collection, crawling, harmonization, environmental data, cleaning, interoperability, economic data, data sources, encryption, anonymization
- Abstract
This report delivers the first version of the CUTLER data collection and pre-processing framework and describes the architectural and software solutions developed so far. Specifically, it: i) describes the data sources used in the first version of the framework; ii) presents an architecture for collecting and storing the data in the local testbeds; iii) discusses cleaning, harmonization, interoperability, and data privacy solutions; iv) describes the software that fetches the data into the platform; and v) discusses challenges and future work.
- Published
- 2018