Optimizing small file storage process of the HDFS which based on the indexing mechanism
- Source :
- 2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA).
- Publication Year :
- 2017
- Publisher :
- IEEE, 2017.
Abstract
- As an open source implementation of GFS, the Hadoop Distributed File System (HDFS) handles large files efficiently. However, because of its master-slave architecture and the way it stores metadata, it is inefficient when dealing with massive numbers of small files: they occupy a large amount of NameNode memory, reduce access efficiency, and delay concurrent user access. To improve performance, this paper studies methods for processing small files on HDFS and, based on the file storage process, proposes a small file processing scheme built on an index mechanism. Before a file is uploaded to the HDFS cluster, its size is measured. If it is a small file, it is merged with other small files, and an index file is created to save its index information. The scheme also introduces a distributed caching strategy to further optimize I/O operations on small files and improve read speed. Experimental results show that, compared with the original HDFS and the HAR scheme, this scheme greatly improves memory efficiency and reduces the consumption of memory resources.
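The flow described in the abstract (measure the file size, merge small files, record their positions in an index, cache reads) can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the class name `SmallFileMerger`, the 1 MiB threshold, the (offset, length) index layout, and the local LRU cache standing in for the distributed cache are all assumptions made for illustration, and no HDFS API is called.

```python
import io
import json
from collections import OrderedDict

# Hypothetical cutoff (1 MiB); the abstract does not state a threshold.
SMALL_FILE_THRESHOLD = 1024 * 1024


class SmallFileMerger:
    """Packs small files into one merged blob and records (offset, length) per file."""

    def __init__(self, threshold=SMALL_FILE_THRESHOLD, cache_size=2):
        self.threshold = threshold
        self.merged = io.BytesIO()   # stands in for a merged file stored on HDFS
        self.index = {}              # name -> (offset, length); stands in for the index file
        self.cache = OrderedDict()   # simple local LRU cache, not a distributed one
        self.cache_size = cache_size

    def upload(self, name, data):
        """Return 'merged' for small files, 'direct' for files at or above the threshold."""
        if len(data) >= self.threshold:
            return "direct"          # a large file would be written to HDFS unchanged
        offset = self.merged.seek(0, io.SEEK_END)  # always append at the end of the blob
        self.merged.write(data)
        self.index[name] = (offset, len(data))
        return "merged"

    def read(self, name):
        """Serve a small file from the cache, or from the merged blob via the index."""
        if name in self.cache:
            self.cache.move_to_end(name)           # mark as most recently used
            return self.cache[name]
        offset, length = self.index[name]
        self.merged.seek(offset)
        data = self.merged.read(length)
        self.cache[name] = data
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)         # evict the least recently used entry
        return data

    def index_file(self):
        """Serialize the index as JSON, one possible on-disk form of the index file."""
        return json.dumps(self.index, sort_keys=True)
```

In this sketch the NameNode-side saving would come from storing one merged file plus one index file instead of one metadata entry per small file; a lookup costs one index access and one ranged read.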
- Subjects :
- Computer science
02 engineering and technology
computer.software_genre
File size
Upload
020401 chemical engineering
Data file
Data_FILES
0202 electrical engineering, electronic engineering, information engineering
Versioning file system
0204 chemical engineering
Distributed File System
File system fragmentation
Flash file system
Indexed file
Database
Computer file
Device file
020206 networking & telecommunications
computer.file_format
Torrent file
Memory-mapped file
File Control Block
Memory management
Self-certifying File System
Journaling file system
Operating system
computer
File storage
Details
- Database :
- OpenAIRE
- Journal :
- 2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA)
- Accession number :
- edsair.doi...........f4424e4596a96c810d1a00f754d78f79
- Full Text :
- https://doi.org/10.1109/icccbda.2017.7951882