Optimizing the storage of massive electronic pedigrees in HDFS

Authors :
Yin Zhang
Wei Wang
Weili Han
Chang Lei
Source :
IOT
Publication Year :
2012
Publisher :
IEEE, 2012.

Abstract

By enabling trustworthy tracking of processes across the production, processing, storage, transportation, and sale phases, the electronic pedigree system has become an important technology of the Internet of Things. In such a system, a huge volume of small electronic pedigrees in XML format is generated, stored, and retrieved. Unfortunately, the storage of these massive electronic pedigrees, which take the form of small XML files, has rarely been studied. We therefore leverage Hadoop to solve the storage problem of massive electronic pedigrees by optimizing the storage of, and access to, massive numbers of small XML files in HDFS. First, all correlated small XML files of the same envelope are merged into a larger file to reduce the metadata footprint at the NameNode. Second, a prefetching mechanism and a remerging mechanism are used to improve the efficiency of accessing small XML files. Finally, we implement a prototype to evaluate the effectiveness and efficiency of the approach in comparison with the original HDFS. The results show that the optimized approach is able to reduce the memory consumption of NameNodes by up to 50%, improve storage performance by up to 91%, and accelerate access by up to 88% in Hadoop.
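
The merging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use any HDFS API: it works on in-memory bytes, and the function names, the file names, and the (offset, length) index layout are all assumptions made for illustration. It only shows the general idea that many small files of one envelope can be stored as a single larger file plus a small lookup index, so the storage layer tracks one object instead of many.

```python
import io

def merge_small_files(files):
    """Concatenate small files into one blob.

    Returns the merged bytes and a hypothetical index mapping each
    file name to its (offset, length) inside the merged blob.
    """
    blob = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))  # record where this file starts
        blob.write(data)
    return blob.getvalue(), index

def read_merged(blob, index, name):
    """Recover one small file from the merged blob via the index."""
    offset, length = index[name]
    return blob[offset:offset + length]

# Hypothetical pedigrees belonging to the same envelope.
pedigrees = {
    "envelope1/pedigree_a.xml": b"<pedigree id='a'/>",
    "envelope1/pedigree_b.xml": b"<pedigree id='b'/>",
}
merged, index = merge_small_files(pedigrees)
second = read_merged(merged, index, "envelope1/pedigree_b.xml")
```

In this sketch a single merged object replaces two small ones; the prefetching and remerging mechanisms mentioned in the abstract are not modeled here.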

Details

Database :
OpenAIRE
Journal :
2012 3rd IEEE International Conference on the Internet of Things
Accession number :
edsair.doi...........f2c9163141ab58708be7259003df0552