PPFS: A Scale-out Distributed File System for Post-petascale Systems
- Source :
- HPCC/SmartCity/DSS
- Publication Year :
- 2017
- Publisher :
- Information Processing Society of Japan, 2017.
Abstract
- The convergence of high-performance computing and big data, which has become known as the field of extreme big data, is problematic in that file creation in storage systems such as distributed file systems is not optimized. In particular, large workloads cause many processes to create many files simultaneously when writing checkpoints. The need to improve file creation prompted us to design PPFS, a scale-out distributed file system for post-petascale systems. PPFS consists of PPMDS, a scale-out distributed metadata server, and PPOSS, a scalable distributed storage server for flash storage. PPMDS achieves high file creation performance by using a key-value store for metadata storage and non-blocking distributed transactions to update multiple entries simultaneously. PPOSS depends on PPOST, an object storage system that manages underlying low-level storage, such as Fusion IO ioDrive, a flash device connected through PCI Express that supports OpenNVM. The PPFS prototype attains high file creation performance through a file creation optimization, termed bulk creation, that reduces the amount of communication between PPMDS and PPOSS. Moreover, to enhance I/O performance when the client process and PPOSS run on the same node, PPOSS accesses the local storage device directly. The prototype implementation of PPFS with file creation optimization achieves 119,000 file creation operations per second when using five metadata servers and 128 client processes, which is 2.17 times the performance of IndexFS. With local access optimization, PPOSS reached its limit at a block size of 16 KiB, a 1.5-fold improvement over the unoptimized version. Furthermore, the evaluation indicates that PPFS scales in both file creation and I/O performance, as required for post-petascale systems.
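The abstract says only that bulk creation reduces communication between the metadata server and the storage server; it does not describe the mechanism. The following is a minimal sketch of one way such batching could work, assuming the metadata server obtains object IDs from the storage server in batches and hands them out locally. All class, method, and path names (`ToyObjectServer`, `ToyMetadataServer`, `count_requests`, `/ckpt/...`) are hypothetical and are not PPFS APIs.

```python
# Toy model only: names are hypothetical and do not come from PPFS.
class ToyObjectServer:
    """Stands in for a storage server; counts incoming requests."""

    def __init__(self):
        self.requests = 0
        self.next_id = 0

    def create_objects(self, count):
        # One request creates `count` objects and returns their IDs.
        self.requests += 1
        ids = list(range(self.next_id, self.next_id + count))
        self.next_id += count
        return ids


class ToyMetadataServer:
    """Stands in for a metadata server backed by a key-value store (a dict here)."""

    def __init__(self, object_server, batch_size):
        self.object_server = object_server
        self.batch_size = batch_size
        self.pool = []  # object IDs obtained in advance
        self.kv = {}    # path -> object ID

    def create_file(self, path):
        if path in self.kv:
            raise FileExistsError(path)
        if not self.pool:
            # Assumed batching scheme: refill the pool with a single request.
            self.pool = self.object_server.create_objects(self.batch_size)
        self.kv[path] = self.pool.pop(0)
        return self.kv[path]


def count_requests(num_files, batch_size):
    """Return how many storage-server requests are needed to create num_files files."""
    oss = ToyObjectServer()
    mds = ToyMetadataServer(oss, batch_size)
    for i in range(num_files):
        mds.create_file(f"/ckpt/rank{i}.dat")
    return oss.requests
```

In this toy model, `count_requests(1000, 1)` issues one storage-server request per file, while `count_requests(1000, 100)` issues one request per 100 files. The sketch illustrates only the reduction in message count, not the performance figures reported in the abstract.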
- Subjects :
- General Computer Science
Computer science
Distributed computing
Stub file
02 engineering and technology
computer.software_genre
Metadata server
File server
Server
Data file
Metadata management
Data_FILES
0202 electrical engineering, electronic engineering, information engineering
Distributed transaction
Versioning file system
SSH File Transfer Protocol
Distributed File System
Flash file system
File system fragmentation
Distributed database
Computer file
Device file
020206 networking & telecommunications
computer.file_format
Unix file types
Virtual file system
Torrent file
Metadata
Object storage
File Control Block
Self-certifying File System
Journaling file system
Scalability
Operating system
020201 artificial intelligence & image processing
computer
Details
- ISSN :
- 18826652
- Volume :
- 25
- Database :
- OpenAIRE
- Journal :
- Journal of Information Processing
- Accession number :
- edsair.doi.dedup.....a2d12a07ffbeac47ed018b94b5673b8d
- Full Text :
- https://doi.org/10.2197/ipsjjip.25.438