2,216 results for "File system"
Search Results
2. StorStack: A full-stack design for in-storage file systems
- Author
- Hu, Juncheng, Chen, Shuo, Wei, Haoyang, Wang, Guoyu, Pei, Chenju, and Che, Xilong
- Published
- 2025
- Full Text
- View/download PDF
3. Reversing File Access Control Using Disk Forensics on Low-Level Flash Memory.
- Author
- Rother, Caleb and Chen, Bo
- Subjects
- FLASH memory, SYSTEMS design, METADATA
- Abstract
In the history of access control, nearly every system designed has relied on the operating system (OS) to enforce the access control protocols. However, if the OS (and specifically root access) is compromised, there are few, if any, solutions that can get users back into their system efficiently. In this work, we propose a novel approach that allows secure and efficient rollback of file access control after an adversary compromises the OS and corrupts the access control metadata. Our key observation is that the underlying flash memory typically performs out-of-place updates. Taking advantage of this unique feature, we can extract the "stale data" specific to OS access control by performing low-level disk forensics over the raw flash memory. This allows efficient rollback of the OS access control to a state pre-dating the compromise. To justify the feasibility of the proposed approach, we have implemented it in a computing device using the EXT2/EXT3 file system and the open-source flash memory firmware OpenNFM. We also evaluated the potential impact of our design on the original system. Experimental results indicate that the performance of the affected drive is not significantly impacted. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Reversing File Access Control Using Disk Forensics on Low-Level Flash Memory
- Author
- Caleb Rother and Bo Chen
- Subjects
- access control, recovery, disk forensics, flash memory, file system, Technology (General), T1-995
- Abstract
In the history of access control, nearly every system designed has relied on the operating system (OS) to enforce the access control protocols. However, if the OS (and specifically root access) is compromised, there are few, if any, solutions that can get users back into their system efficiently. In this work, we propose a novel approach that allows secure and efficient rollback of file access control after an adversary compromises the OS and corrupts the access control metadata. Our key observation is that the underlying flash memory typically performs out-of-place updates. Taking advantage of this unique feature, we can extract the “stale data” specific to OS access control by performing low-level disk forensics over the raw flash memory. This allows efficient rollback of the OS access control to a state pre-dating the compromise. To justify the feasibility of the proposed approach, we have implemented it in a computing device using the EXT2/EXT3 file system and the open-source flash memory firmware OpenNFM. We also evaluated the potential impact of our design on the original system. Experimental results indicate that the performance of the affected drive is not significantly impacted.
- Published
- 2024
- Full Text
- View/download PDF
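Both records above hinge on the same observation: flash performs out-of-place updates, so overwritten access-control metadata survives as stale pages. The sketch below is a simplified, hypothetical model of that idea, not the paper's OpenNFM-based implementation; the `ToyFlash` class and its page layout are invented for illustration.

```python
# Toy model of out-of-place updates on flash: logical block 'acl' is
# rewritten, but stale versions survive in raw pages and can be
# recovered by a low-level scan.

class ToyFlash:
    def __init__(self):
        self.pages = []        # raw flash pages: (logical_id, version, data)
        self.mapping = {}      # logical_id -> index of the live page

    def write(self, logical_id, data):
        version = len(self.pages)
        self.pages.append((logical_id, version, data))  # out-of-place: append, never overwrite
        self.mapping[logical_id] = len(self.pages) - 1  # remap logical block to the new page

    def read(self, logical_id):
        return self.pages[self.mapping[logical_id]][2]

def recover_versions(flash, logical_id):
    """Low-level scan over raw pages: every historical version, oldest first."""
    return [data for lid, _, data in flash.pages if lid == logical_id]

flash = ToyFlash()
flash.write("acl", {"/secret": "root:rw"})       # pre-compromise ACL
flash.write("acl", {"/secret": "attacker:rw"})   # adversary corrupts the metadata
assert flash.read("acl") == {"/secret": "attacker:rw"}
# Forensic scan still finds the pre-compromise state:
print(recover_versions(flash, "acl")[0])         # {'/secret': 'root:rw'}
```

A real recovery tool must of course parse the concrete FTL's on-flash metadata and decide which stale version predates the compromise.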
5. Using the Ethereum Blockchain in the Internet of Things Network for IT Diagnostics
- Author
- U. A. Vishniakou, YiWei Xia, and Chuyue Yu
- Subjects
- internet of things, ethereum blockchain, file system, voice data, privacy, Information technology, T58.5-58.64
- Abstract
The article discusses the use of Ethereum blockchain technology in the Internet of Things (IoT) network for IT diagnostics of patients, which increases data security and user privacy. This integration proves effective for storing and managing sensitive data of patients with neurological diseases. An integrated system architecture has been developed that combines the IoT network and the InterPlanetary File System (IPFS) with the Ethereum blockchain to create a reliable data storage model. This system ensures efficient, secure and transparent data processing, optimizing the processes of data registration, authorization and verification. Using IPFS for decentralized file storage, along with the Ethereum blockchain to create tamper-proof medical records, provides increased efficiency, scalability and privacy. During the experiments, the system was created and tested, including setting up the environment, connecting an IPFS node, programming Ethereum smart contracts, sampling voice data and storing their hashes.
- Published
- 2024
- Full Text
- View/download PDF
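The division of labor described above (bulky data off-chain in IPFS, only content hashes on chain) can be sketched compactly. Everything below is a stand-in: a dict keyed by SHA-256 digests plays the content-addressed IPFS store, and an append-only list plays the Ethereum ledger; no real IPFS or smart-contract API is used.

```python
import hashlib, time

ipfs_store = {}   # stand-in for IPFS: content-addressed blobs
ledger = []       # stand-in for the Ethereum chain: append-only records

def ipfs_add(blob: bytes) -> str:
    cid = hashlib.sha256(blob).hexdigest()   # content identifier = hash of the data
    ipfs_store[cid] = blob
    return cid

def chain_record(patient_id: str, cid: str):
    # Only the small, tamper-evident hash goes on chain, never the data itself.
    ledger.append({"patient": patient_id, "cid": cid, "ts": time.time()})

voice_sample = b"...raw voice data..."
chain_record("patient-42", ipfs_add(voice_sample))

# Verification: refetch by hash and check integrity against the ledger entry.
entry = ledger[-1]
assert hashlib.sha256(ipfs_store[entry["cid"]]).hexdigest() == entry["cid"]
```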
6. Sequentialized Virtual File System: A Virtual File System Enabling Address Sequentialization for Flash-Based Solid State Drives.
- Author
- Hwang, Inhwi, Kim, Sunggon, Eom, Hyeonsang, and Son, Yongseok
- Subjects
- HARD disks, REFUSE collection, COMPUTER systems, SOLID state drives
- Abstract
Solid-state drives (SSDs) are widely adopted in mobile devices, desktop PCs, and data centers since they offer higher throughput, lower latency, and lower power consumption to modern computing systems and applications compared with hard disk drives (HDDs). However, the performance of SSDs can be degraded depending on the I/O access pattern due to the unique characteristics of SSDs. For example, random I/O operations degrade SSD performance since they reduce spatial locality and induce garbage collection (GC) overhead. In this paper, we present an address reshaping scheme in a virtual file system (VFS) called sVFS for improving performance and easing deployment. To do this, it first sequentializes a random access pattern in the VFS layer, which is an abstract layer on top of a more concrete file system. Thus, our scheme is independent of and easily deployed on any concrete file system, block layer configuration (e.g., RAID), and device. Second, we adopt a mapping table for managing sequentialized addresses, which guarantees correct read operations. Third, we support transaction processing for updating the mapping table to avoid sacrificing consistency. We implement our scheme at the VFS layer in Linux kernel 5.15.34. The evaluation results show that our scheme improves the random write throughput by up to 27%, 36%, 34%, and 2.35× using the microbenchmark and 25%, 22%, 20%, and 3.51× using the macrobenchmark compared with the existing scheme in the case of EXT4, F2FS, XFS, and BTRFS, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
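The core mechanism in the abstract above, sequentializing random writes through a mapping table that keeps reads correct, is easy to model. The sketch below is a simplified user-space analogue of the idea, not the paper's kernel VFS code; class and field names are invented.

```python
# Toy address-sequentialization layer: random logical writes become
# sequential physical writes; a mapping table keeps reads correct.

class SequentializingLayer:
    def __init__(self, size):
        self.storage = [None] * size   # stand-in for the flash device
        self.next_free = 0             # sequential write frontier
        self.table = {}                # logical address -> physical address

    def write(self, logical_addr, data):
        phys = self.next_free          # always write at the frontier (sequential)
        self.storage[phys] = data
        self.table[logical_addr] = phys
        self.next_free += 1

    def read(self, logical_addr):
        return self.storage[self.table[logical_addr]]

dev = SequentializingLayer(size=8)
for logical in (5, 1, 7):              # a random-looking write pattern...
    dev.write(logical, f"block{logical}")
assert dev.read(7) == "block7"
assert dev.storage[:3] == ["block5", "block1", "block7"]  # ...stored sequentially
```

In the real design the mapping-table updates are additionally made transactional, so a crash cannot leave the table pointing at stale physical blocks.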
7. A Lightweight File System Design for Unikernel.
- Author
- Cho, Kyungwoon and Bahn, Hyokyung
- Subjects
- ELECTRONIC file management, SYSTEMS design
- Abstract
Unikernels are specialized operating system (OS) kernels optimized for a single application or service, offering advantages such as rapid boot times, high performance, minimal memory usage, and enhanced security compared to general-purpose OS kernels. Unikernel applications must remain compatible with the runtime environment of general-purpose kernels, either through binary or source compatibility. As a result, many Unikernel projects have prioritized system call compatibility over performance enhancements. In this paper, we explore the design principles of Unikernel file systems and introduce a new file system tailored for Unikernels named ULFS (Ultra Lightweight File System). ULFS provides system call services akin to those of general-purpose OS kernels but achieves superior performance and security with significantly fewer system resources. Specifically, ULFS is developed as a lightweight file system embracing Unikernel design principles. It streamlines system calls, removes unnecessary locks, and omits permission checks for multiple users, utilizing a non-hypervisor architecture. This approach significantly reduces the memory footprint of the file system and enhances performance. Through measurement studies, we assess the performance and memory requirements of various file systems from major Unikernel projects. Our findings demonstrate that ULFS surpasses several existing Unikernel file systems, including Rumpvfs, Ramfs-u, Ramfs-q, 9pfs, and Hcfs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Efficient Data Placement in Deduplication Enabled ZenFS via CRC-Based Prediction
- Author
- Safdar Jamil, Joseph Ro, Joo-Young Hwang, and Youngjae Kim
- Subjects
- Zoned namespace SSD, data deduplication, file system, data placement, CRC32-checksum, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The Zoned Namespace (ZNS) interface shifts data management responsibility to upper-level applications, requiring them to reclaim space by issuing the zone-reset command to ZNS SSD devices, a process known as garbage collection (GC). Application-level GC can lead to performance degradation due to the high valid data copy overhead, which is further exacerbated by the larger GC units in ZNS SSDs. However, the impact of larger GC units can be mitigated if GC operations are made interruptible, allowing I/O requests to be served during zone resets or block reclamation. Moreover, the adoption of offline data deduplication as a storage optimization technique in ZNS-based file systems like ZenFS presents additional challenges. Offline deduplication must consider lifetime-based file allocation to avoid deduplicating hot data, and placing unique and duplicate data blocks together can further increase valid data copy overhead during GC. To address these issues, we propose DeZNS, an innovative data placement strategy for deduplication-enabled ZenFS. DeZNS tackles the increased valid data copy overhead during GC in offline deduplication by employing a lightweight CRC32 checksum-based method to predict potential duplicates with minimal performance impact, segregating unique and duplicate data blocks. This segregation reduces valid data migration overhead during GC, while the interruptible GC mechanism ensures that ongoing I/O requests are not delayed during zone resets, maintaining ZenFS performance. Additionally, DeZNS integrates an offline deduplication module that operates on segregated zones. Our extensive evaluation shows that DeZNS reduces valid data migration by 28% compared to baseline ZenFS and by up to 2× compared to naive offline deduplication in micro-benchmarks.
- Published
- 2024
- Full Text
- View/download PDF
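The CRC32-based duplicate prediction lends itself to a short sketch. The version below is hypothetical and heavily simplified: a block whose CRC32 has been seen before is placed in a separate zone as a probable duplicate. A real system would verify candidates byte-for-byte (CRC32 collides) and operate on ZNS zones rather than Python lists.

```python
import zlib

seen_crcs = set()
unique_zone, duplicate_zone = [], []   # stand-ins for segregated ZNS zones

def place_block(block: bytes):
    crc = zlib.crc32(block)
    if crc in seen_crcs:
        duplicate_zone.append(block)   # probable duplicate: offline dedup handles it later
    else:
        seen_crcs.add(crc)
        unique_zone.append(block)      # first sighting: keep with unique data

for blk in (b"A" * 4096, b"B" * 4096, b"A" * 4096):
    place_block(blk)

print(len(unique_zone), len(duplicate_zone))   # 2 1
```

Segregating probable duplicates this way means GC over the unique zone migrates less live data, which is the effect the paper measures.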
9. Leveraging Local LLMs for Secure In-System Task Automation With Prompt-Based Agent Classification
- Author
- Suthir Sriram, C. H. Karthikeya, K. P. Kishore Kumar, Nivethitha Vijayaraj, and Thangavel Murugan
- Subjects
- File system, few-shot prompting, LangChain, LLM, prompt engineering, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Recent progress in the field of artificial intelligence has led to the creation of powerful large language models (LLMs). While these models show promise in improving personal computing experiences, concerns surrounding data privacy and security have hindered their integration with sensitive personal information. In this study, a new framework is proposed to merge LLMs with personal file systems, enabling intelligent data interaction while maintaining strict privacy safeguards. The methodology organizes tasks using LLM agents, which apply designated tags to the tasks before sending them to specific LLM modules. Every module has its own function, including file search, document summarization, code interpretation, and general tasks, so that all processing happens locally on the user’s device. Findings indicate high accuracy across agents: the classification agent achieved an accuracy of 86%, and document summarization reached a BERTScore of 0.9243. A key point of this framework is that it splits the LLM system into modules, which enables future development by integrating new task-specific modules as required. The findings suggest that integrating local LLMs can significantly improve interactions with file systems without compromising data privacy.
- Published
- 2024
- Full Text
- View/download PDF
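The tag-and-dispatch architecture described above reduces to a small routing loop. In the sketch below, a trivial keyword matcher stands in for the paper's prompt-based classification agent, and plain functions stand in for the local LLM modules; all names are invented for illustration.

```python
# Toy agent router: classify a request, then dispatch it to a
# task-specific local module. A real system would use a local LLM
# with few-shot prompts as the classifier; keywords stand in here.

def classify(task: str) -> str:
    text = task.lower()
    if "find" in text or "search" in text:
        return "file_search"
    if "summar" in text:
        return "summarize"
    if "code" in text:
        return "code_interpret"
    return "general"

modules = {
    "file_search":    lambda t: f"[file_search] scanning local index for: {t}",
    "summarize":      lambda t: f"[summarize] producing local summary of: {t}",
    "code_interpret": lambda t: f"[code] explaining locally: {t}",
    "general":        lambda t: f"[general] handling locally: {t}",
}

for task in ("find my 2023 tax PDF", "summarize report.docx"):
    tag = classify(task)        # 1) the agent tags the task
    print(modules[tag](task))   # 2) the task is routed to the matching local module
```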
10. Forensic Detection of Timestamp Manipulation for Digital Forensic Investigation
- Author
- Junghoon Oh, Sangjin Lee, and Hyunuk Hwang
- Subjects
- File system, forensics, anti-forensic countermeasures, timestamp manipulation, forensic detection, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
File system forensics is one of the most important areas of digital forensic investigations. To date, various file system forensic methods have been studied, of which anti-forensic countermeasures include deleted file recovery, metadata recovery, and metadata manipulation detection. In particular, detecting the manipulation of timestamps, which are important file metadata, is one of the key techniques in digital forensic investigations. Existing detection methods for file timestamp manipulation in the New Technology File System (NTFS) have been studied based on various file system and operating system artifacts. This paper compares and analyzes the features and limitations of the existing detection methods and confirms that the NTFS journal-based detection method is the most effective way to detect timestamp manipulation. However, previous NTFS journal-based detection methods have limitations such as incorrectly identifying normal events as manipulation or detecting manipulation only in limited cases. Therefore, we propose a new detection algorithm that overcomes these limitations. The proposed detection algorithm was implemented as a tool and verified through performance comparison experiments with existing detection methods. The experimental results showed that the proposed algorithm significantly improves performance, detecting timestamp manipulations missed by previous methods and correctly identifying normal events that existing methods misidentified. Finally, we introduce a case in which the existing detection methods and the proposed detection algorithm are applied to malware that performs file timestamp manipulation in real-world advanced persistent threat attacks; the results confirm the superiority of the proposed detection algorithm.
- Published
- 2024
- Full Text
- View/download PDF
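The paper's algorithm works from the NTFS journal, which is too involved to reproduce here, but two classic timestomp heuristics that such work builds on are easy to show. The sketch below is illustrative only: many timestamp-forging tools write whole-second values (zero sub-second precision), and a $STANDARD_INFORMATION creation time earlier than the $FILE_NAME creation time is suspicious because $FILE_NAME timestamps are rarely updated by user-mode tools.

```python
from dataclasses import dataclass

@dataclass
class MftTimes:
    si_created: float   # $STANDARD_INFORMATION creation time (epoch seconds)
    fn_created: float   # $FILE_NAME creation time (epoch seconds)

def timestomp_signals(t: MftTimes) -> list:
    """Classic heuristics only; the journal-based method in the paper
    is strictly stronger and also catches normal-looking forgeries."""
    signals = []
    if t.si_created == int(t.si_created):   # exact whole second
        signals.append("zero sub-second precision in $SI (tool-set timestamp?)")
    if t.si_created < t.fn_created:
        signals.append("$SI creation predates $FN creation (backdated?)")
    return signals

# A file 'backdated' to 2001-01-01 with a whole-second timestamp:
print(timestomp_signals(MftTimes(si_created=978307200.0, fn_created=1700000000.123)))
```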
11. Sequentialized Virtual File System: A Virtual File System Enabling Address Sequentialization for Flash-Based Solid State Drives
- Author
- Inhwi Hwang, Sunggon Kim, Hyeonsang Eom, and Yongseok Son
- Subjects
- operating system, file system, virtual file system, solid-state drive, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Solid-state drives (SSDs) are widely adopted in mobile devices, desktop PCs, and data centers since they offer higher throughput, lower latency, and lower power consumption to modern computing systems and applications compared with hard disk drives (HDDs). However, the performance of SSDs can be degraded depending on the I/O access pattern due to the unique characteristics of SSDs. For example, random I/O operations degrade SSD performance since they reduce spatial locality and induce garbage collection (GC) overhead. In this paper, we present an address reshaping scheme in a virtual file system (VFS) called sVFS for improving performance and easing deployment. To do this, it first sequentializes a random access pattern in the VFS layer, which is an abstract layer on top of a more concrete file system. Thus, our scheme is independent of and easily deployed on any concrete file system, block layer configuration (e.g., RAID), and device. Second, we adopt a mapping table for managing sequentialized addresses, which guarantees correct read operations. Third, we support transaction processing for updating the mapping table to avoid sacrificing consistency. We implement our scheme at the VFS layer in Linux kernel 5.15.34. The evaluation results show that our scheme improves the random write throughput by up to 27%, 36%, 34%, and 2.35× using the microbenchmark and 25%, 22%, 20%, and 3.51× using the macrobenchmark compared with the existing scheme in the case of EXT4, F2FS, XFS, and BTRFS, respectively.
- Published
- 2024
- Full Text
- View/download PDF
12. A Lightweight File System Design for Unikernel
- Author
- Kyungwoon Cho and Hyokyung Bahn
- Subjects
- Unikernel, file system, lightweight, non-hypervisor, ULFS, lock-free file system, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Unikernels are specialized operating system (OS) kernels optimized for a single application or service, offering advantages such as rapid boot times, high performance, minimal memory usage, and enhanced security compared to general-purpose OS kernels. Unikernel applications must remain compatible with the runtime environment of general-purpose kernels, either through binary or source compatibility. As a result, many Unikernel projects have prioritized system call compatibility over performance enhancements. In this paper, we explore the design principles of Unikernel file systems and introduce a new file system tailored for Unikernels named ULFS (Ultra Lightweight File System). ULFS provides system call services akin to those of general-purpose OS kernels but achieves superior performance and security with significantly fewer system resources. Specifically, ULFS is developed as a lightweight file system embracing Unikernel design principles. It streamlines system calls, removes unnecessary locks, and omits permission checks for multiple users, utilizing a non-hypervisor architecture. This approach significantly reduces the memory footprint of the file system and enhances performance. Through measurement studies, we assess the performance and memory requirements of various file systems from major Unikernel projects. Our findings demonstrate that ULFS surpasses several existing Unikernel file systems, including Rumpvfs, Ramfs-u, Ramfs-q, 9pfs, and Hcfs.
- Published
- 2024
- Full Text
- View/download PDF
13. An efficient wear-leveling-aware multi-grained allocator for persistent memory file systems.
- Author
- Yu, Zhiwang, Zhang, Runyu, Yang, Chaoshu, Nie, Shun, and Liu, Duo
- Published
- 2023
- Full Text
- View/download PDF
14. NICFS: a file system based on persistent memory and SmartNIC.
- Author
- Yang, Yitian and Lu, Youyou
- Published
- 2023
- Full Text
- View/download PDF
15. Dynamic Multimedia Encryption Using a Parallel File System Based on Multi-Core Processors.
- Author
-
Khashan, Osama A., Khafajah, Nour M., Alomoush, Waleed, Alshinwan, Mohammad, Alamri, Sultan, Atawneh, Samer, and Alsmadi, Mutasem K.
- Subjects
- *
DATA encryption , *MULTICORE processors , *CRYPTOGRAPHY , *USER-centered system design , *COMPUTER architecture - Abstract
Securing multimedia data on disk drives is a major concern because of their rapidly increasing volumes over time, as well as the prevalence of security and privacy problems. Existing cryptographic schemes have high computational costs and slow response speeds. They also suffer from limited flexibility and usability on the user side, owing to continuous routine interactions. Dynamic encryption file systems can mitigate the negative effects of conventional encryption applications by automatically handling all encryption operations with minimal user input and a higher security level. However, most state-of-the-art cryptographic file systems do not provide the desired performance because their architectural design does not consider the unique features of multimedia data or the vulnerabilities related to key management and multi-user file sharing. The recent move towards multi-core processor architectures offers an effective way to reduce the computational cost and maximize performance. In this paper, we developed a parallel FUSE-based encryption file system called ParallelFS for storing multimedia files on a disk. The developed file system exploits the parallelism of multi-core processors and implements a hybrid encryption method combining symmetric and asymmetric ciphers. Usability is significantly enhanced by performing encryption, decryption, and key management in a manner that is fully dynamic and transparent to users. Experiments show that ParallelFS improves the reading and writing performance of multimedia files by approximately 35% and 22%, respectively, over schemes using normal sequential encryption processing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
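The chunked, multi-core design maps naturally onto a worker pool. The sketch below is a simplified, hypothetical model of the idea, not ParallelFS itself: file data is split into fixed-size chunks that are enciphered concurrently across processes. The XOR keystream is a deliberately insecure stand-in for AES (to keep the example dependency-free), and in the hybrid scheme the per-file key would itself be wrapped with an asymmetric cipher.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

CHUNK = 64 * 1024  # encrypt in 64 KiB chunks, one task per chunk

def toy_cipher(args):
    """NOT secure: SHA-256-derived XOR keystream standing in for AES."""
    index, key, chunk = args
    stream, counter = b"", 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(
            key + index.to_bytes(8, "big") + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(chunk, stream))

def parallel_encrypt(data: bytes, key: bytes) -> bytes:
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    with ProcessPoolExecutor() as pool:   # one worker per core, as in a multi-core design
        enc = pool.map(toy_cipher, [(i, key, c) for i, c in enumerate(chunks)])
    return b"".join(enc)

if __name__ == "__main__":
    key = b"per-file-key"                      # a hybrid scheme would RSA-wrap this key
    data = b"x" * (CHUNK * 4)
    ct = parallel_encrypt(data, key)
    assert parallel_encrypt(ct, key) == data   # the XOR cipher is its own inverse
```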
16. Forensic Recovery of File System Metadata for Digital Forensic Investigation
- Author
- Junghoon Oh, Sangjin Lee, and Hyunuk Hwang
- Subjects
- File system, forensics, metadata, forensic recovery, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
File system forensics is one of the most important elements in digital forensic investigations. To date, various file system forensic methods, such as analysis of tree structure and the recovery of deleted file data, have been studied. Among these methods, the recovery of file system metadata is a key technique that makes digital forensic investigations possible by recovering metadata when it cannot be obtained in a regular manner because the file system structure is damaged due to an accident, disaster, or cyber terrorism. Previous studies mainly focused on recovering record or entry data, the basic units of metadata, using carving techniques based on fixed values or range-predictable values at the beginning of the data. However, no studies have addressed metadata without such fixed or range-predictable values. $LogFile, a metadata file of the New Technology File System (NTFS), one of the most widely used file systems at present, contains very important metadata recording the history of all file system operations during a specific period. However, since the record, the basic unit of $LogFile, has no fixed or range-predictable value at its start position, no studies on record-level recovery have been conducted, and only recovery by file and by page has been possible. If the file header or page header of $LogFile is damaged, existing recovery methods cannot properly recover the metadata; in such cases, a record-level recovery method is required. In this context, we investigated the mechanisms of record storage through a detailed analysis of the $LogFile structure and propose a recovery method for records without fixed values. Our proposed method was implemented as a tool and verified through comparative experiments with existing forensic tools that recover $LogFile data. The experimental results showed that the proposed recovery method was able to recover all the data that existing tools are unable to recover in situations where the $LogFile data were damaged. The implemented tools are released free of charge as a contribution to the digital forensic community. Finally, we explain the important role $LogFile played in solving real-world cases and confirm the importance of recovering $LogFile data in situations where file systems may be damaged due to accidents and disasters.
- Published
- 2022
- Full Text
- View/download PDF
17. CFFS: A Persistent Memory File System for Contiguous File Allocation With Fine-Grained Metadata
- Author
- Jen-Kuang Liu and Sheng-De Wang
- Subjects
- Non-volatile memory (NVM), persistent memory, operating system, file system, memory management, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Extensive research on persistent memory (PM)-aware file systems has led to the development of numerous methods for improving read/write throughput. In particular, accessing or modifying file contents in a manner similar to memory operations through mmap is a common approach. We designed a file system, CFFS (Contiguous File Allocation with Fine-Grained Metadata File System), to rapidly allocate PM pages to upper-layer applications for mmap and to alleviate the page fault overheads caused by mmap. We optimized the physical contiguity of files in PM to reduce file fragmentation and increase fragment alignment, with the goal of reducing software overhead. To achieve this goal, we implemented greedy-based buddy systems and implicit preallocation with a not-most-recently-used (NMRU) policy, based on our overall page allocation strategy of considering not only the spatial but also the temporal locality of file access patterns. Furthermore, for efficient and atomic metadata operations, we fully leveraged the byte-addressable property of PM to design fine-grained metadata. CFFS adopts persistent doubly linked lists for directory operations to identify and recover from inconsistencies caused by system failures, without using traditional log mechanisms. In experiments, CFFS showed superior page allocation performance to EXT4-DAX and NOVA under different PM fragmentation levels. Our allocation algorithm also reduced the cost of page faults for frequently appended files. Finally, CFFS’s lightweight directory operations performed excellently when creating and deleting files in various quantities. In summary, the main contribution of this paper is an efficient page allocation algorithm that improves the performance of subsequent mmaps, based on the strategy of considering both the spatial and the temporal locality of file access patterns in PM file systems. Another contribution is the fine-grained, log-free method for atomic directory operations.
- Published
- 2022
- Full Text
- View/download PDF
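The greedy buddy systems mentioned above are a classic allocator family that is worth seeing concretely. Below is a minimal power-of-two buddy allocator; it is a generic textbook sketch, not CFFS's allocator, which layers implicit preallocation and an NMRU policy on top.

```python
# Minimal power-of-two buddy allocator: free blocks are kept per order;
# allocation splits larger blocks, and freeing coalesces with the buddy.

class BuddyAllocator:
    def __init__(self, total_pages):          # total_pages must be a power of two
        self.max_order = total_pages.bit_length() - 1
        self.free = {o: set() for o in range(self.max_order + 1)}
        self.free[self.max_order].add(0)      # one big free block at address 0

    def alloc(self, order):
        o = order
        while o <= self.max_order and not self.free[o]:
            o += 1                            # find the smallest big-enough block
        if o > self.max_order:
            raise MemoryError("no contiguous block")
        addr = self.free[o].pop()
        while o > order:                      # split down, returning halves to free lists
            o -= 1
            self.free[o].add(addr + (1 << o))
        return addr

    def free_block(self, addr, order):
        while order < self.max_order:
            buddy = addr ^ (1 << order)       # buddy address differs in exactly one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)    # coalesce with the free buddy
            addr = min(addr, buddy)
            order += 1
        self.free[order].add(addr)

b = BuddyAllocator(16)
a0 = b.alloc(2)          # 4 contiguous pages at address 0
b.free_block(a0, 2)      # coalesces all the way back to one 16-page block
assert b.free[b.max_order] == {0}
```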
18. Fast and Low Overhead Metadata Operations for NVM-Based File System Using Slotted Paging.
- Author
- Lin, Fangzhu, Xiao, Chunhua, Liu, Weichen, Wu, Lin, Shi, Chen, and Ning, Kun
- Subjects
- ELECTRONIC file management, METADATA, NONVOLATILE memory, DYNAMIC random access memory, EDIBLE fats & oils, JOURNAL writing, ELECTRIC breakdown
- Abstract
Existing nonvolatile memory (NVM)-based file systems can fully leverage the characteristics of NVM to obtain better performance than traditional disk-based file systems, and they have the potential to efficiently manage metadata and perform fast metadata operations. However, most NVM-based file systems mainly focus on managing file metadata (inodes) while paying little attention to directory metadata (dentries), which also has a noticeable impact on file system performance. Besides, the traditional journaling technique that guarantees metadata consistency may not yield satisfactory performance on NVM-based file systems. To solve these problems, in this article we propose a fast and low-overhead metadata operation mechanism, called FLOMO. It first adopts a novel slotted-paging structure in NVM to reorganize dentries for efficient dentry operations, and it utilizes a red–black tree in DRAM to accelerate dentry lookup and the search step of dentry deletion. Moreover, FLOMO presents a selective journaling scheme for metadata updates, which partially logs the changes related to dentries in the proposed slotted page, thereby mitigating redundant journaling overhead. To verify FLOMO, we implement it in a typical NVM-based file system, the persistent memory file system (PMFS). Experimental results show that FLOMO accelerates the metadata operations in PMFS by 34.4%~59%, and notably reduces the journaling overhead for metadata, shortening the latency by 59% on average. For real-world applications, FLOMO has higher throughput than PMFS, PMFS without journaling, and NOVA, achieving up to 2.1×, 1.1×, and 1.3× performance improvement, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
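Slotted paging, which FLOMO adapts for dentries, is a classic database page layout. The sketch below is a generic, simplified slotted page rather than FLOMO's NVM layout: a slot directory holds (offset, length) entries while variable-size records grow from the other end, so a record keeps a stable slot id and can be deleted without shifting its neighbors. (A real page would also reserve space for the directory itself.)

```python
# Generic slotted page: a slot directory of (offset, length) entries
# plus a heap of variable-size records (here, toy dentries).

class SlottedPage:
    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []          # slot id -> (offset, length), or None once deleted
        self.heap_top = size     # records grow downward from the end of the page

    def insert(self, record: bytes) -> int:
        if self.heap_top - len(record) < 0:
            raise ValueError("page full")
        self.heap_top -= len(record)
        self.data[self.heap_top:self.heap_top + len(record)] = record
        self.slots.append((self.heap_top, len(record)))
        return len(self.slots) - 1            # stable slot id

    def get(self, slot_id: int) -> bytes:
        off, length = self.slots[slot_id]
        return bytes(self.data[off:off + length])

    def delete(self, slot_id: int):
        self.slots[slot_id] = None            # tombstone; space reclaimed by compaction

page = SlottedPage()
sid = page.insert(b"inode=42,name=report.txt")   # a toy dentry record
assert page.get(sid) == b"inode=42,name=report.txt"
page.delete(sid)
```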
19. FUSE based file system for efficient storage and retrieval of fragmented multimedia files.
- Author
- Bhat, Wasim Ahmad
- Subjects
- CLOUD storage, STORAGE, DOWNLOADING
- Abstract
Multimedia content providers (such as cyberlockers) are overwhelmed by huge volumes and large sizes of multimedia files. Cyberlockers provide Quality-of-Service (QoS) downloading to end users by splitting files into fragments. However, merging these fragments on the fly for streaming harms the Quality-of-Experience (QoE) of users and strains computational capacity. As a solution, this paper proposes a FUSE-based file system, namely fumy, for efficient storage of large multimedia files as small-sized fragments (for downloading) and for virtually unifying them for efficient retrieval (for streaming). We discuss the virtual unification, namespace manipulation, and operation redirection employed by fumy to achieve its goal. We also discuss the implementation details of fumy as a FUSE file system. We evaluated the performance of fumy on extended file systems in various operational and benchmarking configurations. Our results suggest that fumy adds minimal performance overhead to extended file systems when the fragment size is 512 MiB or higher, imposes the least performance tax on the ext4 file system, and even enhances the performance of extended file systems for specific workloads and fragment sizes. We recommend using the ext4 file system with a fragment size of 512 MiB for different workloads to achieve optimal performance with fumy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
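Virtual unification, the heart of fumy, comes down to offset arithmetic over an ordered fragment list. The sketch below is a hypothetical user-space analogue of that redirection, not fumy's FUSE implementation: a read at a logical offset is steered into whichever fragments contain it.

```python
import io

# Toy virtual unification: present N fragments as one logical file.

class UnifiedFile:
    def __init__(self, fragments):            # ordered list of file-like objects
        self.fragments = fragments
        self.sizes = [f.seek(0, io.SEEK_END) for f in fragments]

    def read(self, offset, length):
        out = b""
        for frag, size in zip(self.fragments, self.sizes):
            if offset >= size:                 # requested range starts past this fragment
                offset -= size
                continue
            frag.seek(offset)                  # redirect the read into this fragment
            out += frag.read(min(length - len(out), size - offset))
            offset = 0
            if len(out) == length:
                break
        return out

frags = [io.BytesIO(b"AAAA"), io.BytesIO(b"BBBB"), io.BytesIO(b"CCCC")]
u = UnifiedFile(frags)
assert u.read(2, 6) == b"AABBBB"               # one read spans fragments transparently
```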
20. A performance comparison of the EXT4 and NTFS file systems [Porównanie wydajności systemu plików EXT4 i NTFS].
- Author
- Sterniczuk, Bartosz Piotr
- Published
- 2022
- Full Text
- View/download PDF
21. File system capabilities in the context of a separation kernel: A practical approach by extending the capability model of the S3K kernel
- Author
- Jogården, Tobias
- Abstract
The security and reliability of operating system kernels are critical. Trusting software to operate a car, a train, or an aeroplane requires high confidence in fault tolerance and resilience to attacks. The Simple Secure Separation Kernel (S3K) is one such effort currently being developed at KTH. S3K is a microkernel and uses capability-based security to achieve fine-grained resource access control. However, supported capability types are currently limited to time scheduling, memory access, interprocess communication, and monitors (modification of another process). This thesis tackles the topic of capability design for persistent storage, namely file systems. The scope is limited to basic kernel support and a file server running in userspace, which means abstracting over file systems rather than creating new ones. File systems typically use access control lists to encode permissions with user and group identifiers. However, S3K does not aim to implement a full multi-user system, and extending the capability model to include the file system is a better fit. Four implementations using a path capability approach are presented and evaluated, named FullAPI, MinAPI, MaxQuota, and CreateQuota. FullAPI has a number of file system related system calls, while MinAPI only has path capabilities and implements file system operations in userspace. MaxQuota and CreateQuota are built on top of FullAPI and tackle the problem of disk quotas. The results show that the added complexity in the kernel is small if the vendored drivers are ignored. Path capabilities are an especially good fit for MinAPI, as they let the kernel derive new capabilities at minimal cost compared to an inode approach. The MaxQuota approach performs poorly because it requires frequent disk usage checks; CreateQuota is a better approach as it only requires a check against a single capability. In conclusion, file system capabilities are an excellent abstraction for kernels and achieve better security.
- Published
- 2024
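The path-capability approach described above invites a small model. The sketch below is hypothetical and far simpler than the thesis's four designs: a capability couples a path prefix with a rights set, derivation may only narrow both, and an access check asks whether some held capability covers the requested path and right. (Real designs would match whole path components, not raw string prefixes.)

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PathCap:
    prefix: str            # subtree this capability covers, e.g. "/var/log"
    rights: frozenset      # e.g. {"read", "write", "create"}

    def derive(self, sub_prefix: str, rights: set) -> "PathCap":
        # Derivation may only narrow: a child capability cannot escape
        # the parent's subtree or gain rights the parent lacks.
        assert sub_prefix.startswith(self.prefix)
        assert rights <= self.rights
        return PathCap(sub_prefix, frozenset(rights))

def allowed(caps, path: str, right: str) -> bool:
    return any(path.startswith(c.prefix) and right in c.rights for c in caps)

root = PathCap("/", frozenset({"read", "write", "create"}))
logs = root.derive("/var/log", {"read"})       # hand a process read-only log access
assert allowed([logs], "/var/log/kern.log", "read")
assert not allowed([logs], "/var/log/kern.log", "write")
assert not allowed([logs], "/etc/passwd", "read")
```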
22. Team 5 - Infrastructure and DevOps Fall 2023
- Author
- Adeyemi Aina, Amritha Subramanian, Hung-Wei Hsu, Shalini Rama, Vasundhara Gowrishankar, and Yu-Chung Cheng
- Abstract
The project aims to revolutionize information retrieval from extensive academic repositories like theses and dissertations by developing an advanced system. Unlike conventional search engines, it focuses on handling complex academic documents. Six dedicated teams oversee different facets: Knowledge Graph, Search and Indexing, Object Detection and Topic Analysis, Language Models, Integration, and User Interaction. The Infrastructure and DevOps team is responsible for integration: it orchestrates collaborative efforts, manages database access, and ensures seamless communication among components via APIs. The team oversees container utilization in the CI/CD pipeline, maintains the container cluster, and tailors APIs to specific team needs. Building on previous contributions, the team has made notable progress in migrating to Endeavour, establishing a robust CI/CD pipeline, updating the database schema, tackling Kafka challenges, and deploying authentication services while creating accessible filesystem and database APIs for the other teams.
- Published
- 2024
23. Design and Implementation of a Distributed Versioning File System for Cloud Rendering
- Author
- Kyungwoon Cho and Hyokyung Bahn
- Subjects
- Cloud, rendering, file system, cooperative caching, versioning file system, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Rendering is widely used for visual effects in animations, video games, and movies. As the computational load of rendering workloads fluctuates greatly over time, it is attractive to use cloud infrastructures for cost-effective rendering. However, cloud rendering poses several technical challenges in handling rendering input data. In this article, we analyze the workload characteristics of popular rendering projects and make the following three observations. First, the total size of rendering input files reaches tens to hundreds of gigabytes, and uploading these large data sets to the cloud increases the startup latency of rendering significantly. Second, the consistency requirement of file systems in cloud rendering is complicated compared to that of traditional distributed file systems. Third, file accesses in rendering are highly skewed: the top 20% of files account for 60-80% of total accesses, whereas 40-70% are never used or used only once. Based on these observations, we design and implement a new file system for cloud rendering that provides version control, on-demand fetching, and distributed cooperative caching for rendering data. This minimizes the data transmission overhead caused by the large input data of rendering while satisfying rendering data consistency. Measurement studies under synthetic and real workloads show that the proposed file system performs better than the conventional uploading scheme and NFS by 55.4% and 29.5% on average, respectively.
- Published
- 2021
- Full Text
- View/download PDF
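The skewed-access observation above is precisely the case where on-demand fetching plus a small cache beats bulk uploading. The sketch below is a hypothetical model, not the paper's system: a render node fetches input files lazily from a remote store and keeps the hot minority in an LRU cache, so never-used files are never transferred.

```python
from collections import OrderedDict

remote_store = {f"asset{i}": b"data" for i in range(100)}   # stand-in for the cloud repo
transfers = 0

class RenderNodeCache:
    """On-demand fetch with LRU eviction; a single-node stand-in for
    the paper's distributed cooperative caching."""
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.cache = OrderedDict()

    def read(self, name):
        global transfers
        if name in self.cache:
            self.cache.move_to_end(name)       # cache hit: refresh recency
            return self.cache[name]
        transfers += 1                         # cache miss: fetch on demand
        data = remote_store[name]
        self.cache[name] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict the least recently used asset
        return data

node = RenderNodeCache()
for _ in range(4):                             # skewed workload: hot assets re-read often
    for name in ("asset0", "asset1"):
        node.read(name)
print(transfers)                               # 2: cold assets were never transferred
```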
24. Parallelizing Shared File I/O Operations of NVM File System for Manycore Servers
- Author
- June-Hyung Kim, Youngjae Kim, Safdar Jamil, Chang-Gyu Lee, and Sungyong Park
- Subjects
- Operating system, file system, non-volatile memory, manycore CPU, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
NOVA, a state-of-the-art non-volatile memory (NVM) file system, has limited performance due to its coarse-grained per-file lock when multiple threads perform I/Os to a shared file in a manycore environment. For instance, a writer lock blocks other threads attempting to access the same file, although they access different regions of a file. When multiple threads reading the same file share a cache line containing a reader counter, performance can be significantly degraded due to the cache consistency protocol as we increase the number of readers. This paper proposes a fine-grained segment-based range lock (SRL) that divides a file into multiple segments and manages a lock variable dynamically for each segment. Consequently, write operations can be parallelized without blocking unless there is a conflict in accessing the same range in a file. Moreover, SRL maintains a reader counter per segment that allows multiple reader threads to perform read operations without causing a performance bottleneck. We evaluated an SRL-based NOVA on an Intel Optane DC persistent memory (PM) manycore server. The benchmarking results showed that the average write throughput of the SRL-based NOVA is 3× higher than the original NOVA, and the average read throughput scales linearly, while the original NOVA does not scale.
- Published
- 2021
- Full Text
- View/download PDF
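A segment-based range lock can be approximated with one lock per fixed-size segment. The sketch below is a simplified user-space analogue of the idea behind SRL, with invented names and plain mutexes in place of SRL's dynamically managed lock variables and per-segment reader counters: writers to disjoint segments proceed in parallel, and only overlapping ranges contend.

```python
import threading
from collections import defaultdict

SEGMENT = 4096   # lock granularity: one lock per 4 KiB file segment

class SegmentRangeLock:
    def __init__(self):
        self._locks = defaultdict(threading.Lock)   # segment index -> lock
        self._guard = threading.Lock()

    def _segments(self, offset, length):
        return range(offset // SEGMENT, (offset + length - 1) // SEGMENT + 1)

    def acquire(self, offset, length):
        with self._guard:                            # look up lock objects atomically
            locks = [self._locks[s] for s in self._segments(offset, length)]
        for lock in locks:                           # fixed ascending order avoids deadlock
            lock.acquire()
        return locks

    def release(self, locks):
        for lock in reversed(locks):
            lock.release()

srl = SegmentRangeLock()
out = []

def writer(offset, data):
    held = srl.acquire(offset, len(data))            # blocks only on overlapping ranges
    out.append((offset, data))                       # stand-in for the actual file write
    srl.release(held)

threads = [threading.Thread(target=writer, args=(i * SEGMENT, b"x" * 512)) for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert len(out) == 4                                 # disjoint writers never blocked each other
```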
25. Understanding and Exploring Serverless Cloud Computing
- Author
- Schleier-Smith, Johann Markus
- Subjects
- Computer science, Amdahl's law, cloud computing, file system, POSIX, serverless computing, transactions
- Abstract
The past few years have seen a wave of enthusiasm for serverless computing, and we begin this work by analyzing the marketplace trends and underlying technical factors that have shaped the movement. We find that serverless computing addresses programming challenges in the same class as those that high-level programming languages address, suggesting that serverless computing may be viewed as high-level programming for distributed systems. We next turn our attention to one of the key shortcomings of serverless: the lack of integration between compute and state. We develop FaaSFS, a distributed file system that is compatible with POSIX applications but uses a novel consistency model with relaxed real-time ordering constraints. We call this model externally consistent sequential consistency (ECSC) and use it to scale a pre-existing single-server application to 10,000 serverless processes. We also show that under reasonable assumptions ECSC is indistinguishable from linearizability, a widely accepted strong form of consistency. Lastly, we explore whether serverless computing might lead to the demise of server hardware. By applying Amdahl's law and scaling rules for interconnect costs, we show that applications that rely on coordination protocols are particularly dependent on large servers for scalability. In contrast, those implemented with coordination-free protocols can run well on collections of small, low-cost servers or on disaggregated hardware. These approaches will likely continue to coexist, suggesting that a need for underlying server hardware will remain even as serverless abstractions thrive.
- Published
- 2022
26. Bridging Mismatched Granularity Between Embedded File Systems and Flash Memory.
- Author
- Zhang, Runyu, Liu, Duo, Shen, Zhaoyan, She, Xiongxiong, Yang, Chaoshu, Chen, Xianzhang, Tan, Yujuan, and Wang, Chengliang
- Subjects
- FLASH memory, MIDDLEWARE, NONVOLATILE memory, BRIDGES
- Abstract
The mismatch between logical and physical I/O granularity inhibits the deployment of embedded file systems. Most existing embedded file systems manage logical space with a small unit, which no longer matches the flash operation granularity. Manually enlarging the logical I/O granularity of file systems requires enormous transplanting effort. Moreover, large logical pages aggravate the write amplification problem, which leads to severe space consumption and performance collapse. This article designs a novel storage middleware, NV-middle, for legacy embedded file systems with large-capacity flash memories. Legacy embedded storage schemes can be smoothly transplanted to new platforms with different hardware read/write granularity. Moreover, legacy optimization schemes can be maximally preserved without inducing write amplification problems. We implement NV-middle with the state-of-the-art embedded file system YAFFS2. Comprehensive evaluations show that NV-middle achieves severalfold performance improvements over a manually transplanted YAFFS2 under various workloads. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
27. PhantomFS-v2: Dare You to Avoid This Trap
- Author
- Jione Choi, Hwiwon Lee, Younggi Park, Huy Kang Kim, Junghee Lee, Youngjae Kim, Gyuho Lee, Shin-Woo Shim, and Taekyu Kim
- Subjects
- Deception technology, file system, honeypot, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
It has been demonstrated that deception technologies are effective in detecting advanced persistent threats and zero-day attacks, which cannot be detected by traditional signature-based intrusion detection techniques. In particular, a file-based deception technology is promising because it is very difficult (if not impossible) to commit an attack without reading and modifying any file. It can serve as an additional security barrier because malicious file access can be detected even if an adversary succeeds in gaining access to a host. However, PhantomFS still has a problem that is common to deception technologies: once a deception technology is known to adversaries, it is unlikely to succeed in luring them. In this paper, we classify adversaries who are aware of PhantomFS according to their knowledge level and permissions on PhantomFS. We then analyze the attack surface and develop a defense strategy to limit the attack vectors, and we extend PhantomFS to realize the strategy. Specifically, we introduce multiple hidden interfaces and detection of file execution. We evaluate the security and performance overhead of the proposed technique and demonstrate through penetration testing that the extended PhantomFS is secure against intelligent adversaries. The extended PhantomFS offers higher detection accuracy with a lower false alarm rate compared to existing techniques, and the overhead is negligible in terms of response time and CPU time.
- Published
- 2020
- Full Text
- View/download PDF
28. AvaTar: Zero-Copy Archiving With New Kernel-Level Operations
- Author
- Hyunchan Park, Youngpil Kim, and Seehwan Yoo
- Subjects
- Merging and splitting, archiving and extraction, zero-copy, file system, cloud storage system, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The problem with current file archiving systems is slow processing time owing to unnecessary data copying. To address this problem, a novel archiving system with zero-copy merging and splitting operations, referred to as AvaTar, is presented herein. For these operations, instead of copying the data, the block allocation information of the files is manipulated at the kernel level. We implemented kernel-level archiving primitives in a Linux kernel, called AvaTar-FS, and a user-level archiving tool, called the AvaTar agent. Our evaluation results indicated that AvaTar required only 0.48 s to extract 1,024 files from a 4 GB archive file, which is 132 times faster than traditional GNU Tar archiving. AvaTar affords practical benefits in uploading files to a real-world cloud storage system, successfully completing the transfer of 1,024 files to Amazon Web Services cloud storage in 60.55% of the processing time required by a traditional approach.
- Published
- 2020
- Full Text
- View/download PDF
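Zero-copy merging is easiest to see at the level of allocation metadata. The sketch below is a hypothetical model of the idea behind AvaTar, not its kernel implementation: each file is an extent list naming device blocks, and "merging" concatenates those lists into a new inode without touching a single data block. (A real archiver would also emit format headers, and splitting is the inverse manipulation.)

```python
# Toy extent-based volume: merging files rewrites allocation metadata
# only; the data blocks are never read, copied, or moved.

class Volume:
    def __init__(self):
        self.blocks = {}     # block number -> data (the device)
        self.inodes = {}     # file name -> list of block numbers (extent list)
        self.next_block = 0

    def create(self, name, data, blksz=4):
        extents = []
        for i in range(0, len(data), blksz):
            self.blocks[self.next_block] = data[i:i + blksz]
            extents.append(self.next_block)
            self.next_block += 1
        self.inodes[name] = extents

    def merge_zero_copy(self, archive, members):
        # Concatenate the members' extent lists into the archive's inode.
        self.inodes[archive] = [b for m in members for b in self.inodes[m]]

    def read(self, name):
        return b"".join(self.blocks[b] for b in self.inodes[name])

v = Volume()
v.create("a.txt", b"hello ")
v.create("b.txt", b"world")
v.merge_zero_copy("ab.tar", ["a.txt", "b.txt"])
assert v.read("ab.tar") == b"hello world"     # archive built without copying data
```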
29. PhantomFS: File-Based Deception Technology for Thwarting Malicious Users
- Author
- Junghee Lee, Jione Choi, Gyuho Lee, Shin-Woo Shim, and Taekyu Kim
- Subjects
- Deception technology, file system, honeypot, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
File-based deception technologies can be used as an additional security barrier when adversaries have successfully gained access to a host while evading intrusion detection systems. Adversaries are detected if they access fake files. Though previous works have mainly focused on using user data files as decoys, the concept can be applied to system files as well, where it is expected to be effective in detecting malicious users because it is very difficult to commit an attack without accessing a single system file. However, it may suffer from excessive false alarms caused by legitimate system services such as file indexing and searching, and legitimate users may also access fake files by mistake. This paper addresses this issue by introducing a hidden interface: legitimate users and applications access files through the hidden interface, which does not show fake files. The hidden interface can also be utilized to protect sensitive files by hiding them from the regular interface. Through experiments, we demonstrate that the proposed technique incurs negligible performance overhead and is an effective countermeasure to various attack scenarios, and that it is practical in that it does not generate false alarms for legitimate applications and users.
- Published
- 2020
- Full Text
- View/download PDF
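The hidden-interface idea is compact enough to model directly. The sketch below is hypothetical, not PhantomFS's VFS hooks: the regular interface exposes decoy files and raises an alert when one is touched, while the hidden interface filters decoys out, so legitimate indexers and users never trip the alarm.

```python
# Toy file-based deception with two interfaces over one namespace.

files = {"/etc/passwd": "real", "/etc/shadow.bak": "decoy", "/home/u/notes": "real"}
alerts = []

def list_dir(hidden_iface=False):
    # The hidden view omits decoys, so indexers and users never see them.
    return [p for p, kind in files.items() if not (hidden_iface and kind == "decoy")]

def open_file(path, hidden_iface=False):
    if files[path] == "decoy":
        if hidden_iface:
            raise FileNotFoundError(path)      # decoys don't exist in the hidden view
        alerts.append(f"ALERT: decoy accessed: {path}")
    return f"contents of {path}"

open_file("/etc/shadow.bak")                   # an attacker browsing the regular interface
assert alerts == ["ALERT: decoy accessed: /etc/shadow.bak"]
assert "/etc/shadow.bak" not in list_dir(hidden_iface=True)
```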
30. Separation of Virtual Machine I/O in Cloud Systems
- Author
- Hyokyung Bahn and Jisun Kim
- Subjects
- Journaling, file system, virtualized system, virtualization, buffer cache, commit, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Virtualization is widely used in modern computer systems ranging from personal computers to cloud servers, as it provides various heterogeneous platforms at low cost. However, due to its nested software structure of host and guest machines, which are difficult to harmonize, it is challenging to manage resources efficiently in virtualized systems. In this article, we anatomize the overhead of virtualization associated with file system journaling and discover the excessively frequent commits that take place in virtualized systems. This happens because the host triggers a commit on every write request from all guest machines, which also generates unnecessary write traffic to storage. To remedy these problems, we propose the VM-separated commit and implement it on QEMU-KVM and Ext4. Specifically, we devise a data structure that manages the modified file blocks from each guest as a separate list, and we split the running transaction list into two sub-transactions based on this data structure upon a commit request from a guest. Measurement studies with the Filebench and IOzone benchmarks show that the proposed policy improves I/O throughput by 19.5% on average and up to 64.2% over existing systems. It also reduces the variability in performance.
- Published
- 2020
- Full Text
- View/download PDF
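The VM-separated commit reduces to per-guest bookkeeping of dirty blocks. The sketch below is a hypothetical policy model, far removed from the paper's QEMU-KVM/Ext4 changes: a commit request from one guest flushes only that guest's list instead of the whole shared running transaction.

```python
from collections import defaultdict

journal = []                         # committed transactions, in order
dirty = defaultdict(list)            # guest id -> modified file blocks (running txn)

def guest_write(guest, block):
    dirty[guest].append(block)       # buffered in the running transaction

def commit(guest):
    # VM-separated commit: flush only the requesting guest's blocks,
    # leaving other guests' dirty data to batch into later commits.
    blocks, dirty[guest] = dirty[guest], []
    if blocks:
        journal.append((guest, blocks))

guest_write("vm1", "b1"); guest_write("vm2", "b2"); guest_write("vm1", "b3")
commit("vm1")                        # vm1's fsync no longer drags vm2's data along
assert journal == [("vm1", ["b1", "b3"])]
assert dirty["vm2"] == ["b2"]        # vm2 keeps batching, reducing write traffic
```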
31. An Empirical Performance Evaluation of Transactional Solid-State Drives
- Author
- Yongseok Son, Heon Young Yeom, and Hyuck Han
- Subjects
- Solid-state drives, file system, distributed file system, performance, consistency, transaction, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Solid-state drives (SSDs) have accelerated the architectural evolution of storage systems thanks to several characteristics (e.g., out-of-place updates) that distinguish them from hard disk drives (HDDs). The out-of-place update of SSDs can naturally support the transaction mechanism that systems commonly use to provide crash consistency. Thus, transactional functionality has recently been implemented inside SSDs. However, this approach must be re-evaluated for enterprise storage with a standard interface to investigate its benefits in a more realistic and standard fashion. In this article, we explore the implications and challenges of transactional SSDs through different experiments. To evaluate their potential benefit, we design and implement transactional functionality in a Samsung enterprise-class, SATA-based SSD (i.e., SM843TN), called TxSSD. We modify the local file systems (i.e., ext4 and btrfs) and a distributed parallel file system (i.e., Lustre) to utilize TxSSDs. Our modified file systems with TxSSDs provide crash consistency without redundant writes. We evaluate our file systems using multiple micro and macro benchmarks, analyze the performance results, and demonstrate that TxSSDs may incur overhead for supporting transactional functionality inside the SSD.
- Published
- 2020
- Full Text
- View/download PDF
32. Precise Performance Characterization of Antivirus on the File System Operations
- Author
- Mohammed Al-Saleh and Hanan Hamdan
- Subjects
- antivirus, performance, file system, minifilter driver, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Antivirus (AV) software is an important concern for the end-user community. The AV mainly achieves security by scanning data against its database of virus signatures, while trying to strike a pleasant balance between security and usability. When to scan data is an important design decision an AV has to make. Because AVs are equipped with on-access scanners that scan files when necessary, we want a fine-grained approach that precisely explains the performance impact of AVs on different file system operations. Microsoft's minifilter driver technology helps us achieve exactly that. By deploying a minifilter driver, we show that most of the overhead of the tested AVs is imposed on the OPEN operation. Interestingly, we also show that the AV greatly improves the timing of the READ operation. Finally, the WRITE and CLEANUP operations show almost no performance differences.
- Published
- 2019
- Full Text
- View/download PDF
33. Experiences of Converging Big Data Analytics Frameworks with High Performance Computing Systems
- Author
- Cheng, Peng, Lu, Yutong, Du, Yunfei, and Chen, Zhiguang (in Yokota, Rio and Wu, Weigang, editors)
- Published
- 2018
- Full Text
- View/download PDF
34. Towards a worldwide storage infrastructure
- Author
- Quintard, Julien, Bacon, Jean, and Beresford, Alastair
- Subjects
- 004.2, File system, Peer-to-peer, Decentralised, Access control, Administration, Naming, Byzantine
- Abstract
Peer-to-peer systems have recently gained a lot of attention in the academic community, especially through the design of KBR (Key-Based Routing) algorithms and DHTs (Distributed Hash Tables). On top of these constructs were built promising applications such as video streaming, as well as storage infrastructures benefiting from the availability and resilience of such scalable network protocols. Unfortunately, rare are the storage systems designed to be scalable and fault-tolerant to Byzantine behaviour, conditions required for such systems to be deployed in an environment such as the Internet. Furthermore, although some means of access control are often provided, such file systems fail to offer end-users the flexibility required to easily manage the permissions granted to potentially hundreds or thousands of end-users. In addition, like centralised file systems, which rely on a special user (referred to as root on Unices), distributed file systems equally require some tasks to operate at the system level. The decentralised nature of these systems renders the use of a single authoritative entity for such tasks impossible, since it would implicitly grant that entity superprivileges, an unacceptable configuration for such decentralised systems. This thesis addresses both issues by providing the file system objects a completely decentralised access control and administration scheme, enabling users to express access control rules in a flexible way and to request administrative tasks without the need for a superuser. A prototype has been developed and evaluated, proving the deployment of such a decentralised file system in large-scale and untrustworthy environments feasible.
- Published
- 2012
- Full Text
- View/download PDF
35. Defuse: Decoupling Metadata and Data Processing in FUSE Framework for Performance Improvement
- Author
- Wenrui Yan, Jie Yao, and Qiang Cao
- Subjects
- FUSE, file system, file mapping, performance optimization, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
A popular user-space file system framework, FUSE, has been widely used for building various customized file systems (cFS) on top of an underlying kernel file system (kFS). A FUSE-based cFS gains adequate flexibility by implementing its specific functions in user space, but it introduces extra user-kernel mode switches into the request processing flow, because all requests are forwarded from the FUSE kernel driver to the user-space daemon, degrading overall performance. We observe that a file data request does not need to be forwarded to the user-space daemon when its file-to-file mapping between the FUSE-based cFS and the underlying kFS remains unchanged. Based on this insight, we propose a modified FUSE framework, DeFUSE, that decouples the processing flows of metadata and data requests. Metadata requests still follow the original flow to preserve flexibility, while data requests are executed directly in the DeFUSE kernel driver, which maintains the file-to-file mappings in the kernel, effectively eliminating the unnecessary mode switches. We have implemented the DeFUSE framework and ported three representative FUSE-based cFSs to it. The results show that for data-centric workloads, the throughput of DeFUSE-based cFSs increases by up to 3.5× for writes and 3.8× for reads, compared to the corresponding FUSE-based implementations. DeFUSE is available on GitHub.
- Published
- 2019
- Full Text
- View/download PDF
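The routing decision at the heart of DeFUSE (entry 35) can be sketched in a few lines: metadata requests always pay the user-kernel switch, while data requests whose file-to-file mapping is still valid are served on the kernel fast path. This is a hypothetical Python model of the decision logic only; the real DeFUSE driver is kernel code.

    # Hypothetical routing model: metadata ops and stale mappings take the
    # slow daemon path; data ops with valid mappings stay "in kernel".
    METADATA_OPS = {"lookup", "create", "rename", "unlink", "getattr"}

    class DeFuseRouter:
        def __init__(self, mapping):
            self.mapping = mapping    # cFS path -> kFS path, driver-resident
            self.stale = set()        # paths whose mapping may have changed

        def forward_to_daemon(self, op, path):      # models a mode switch
            if op in {"create", "rename", "unlink"}:
                self.stale.add(path)                # mapping may change now
            return f"daemon handled {op} {path}"

        def serve_in_kernel(self, op, kfs_path):    # models the fast path
            return f"kernel {op} on {kfs_path}"

        def handle(self, op, path):
            if op in METADATA_OPS or path in self.stale or path not in self.mapping:
                return self.forward_to_daemon(op, path)
            return self.serve_in_kernel(op, self.mapping[path])

    router = DeFuseRouter({"/a": "/backing/a"})
    assert router.handle("read", "/a").startswith("kernel")
    assert router.handle("getattr", "/a").startswith("daemon")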
36. Design of flight data recorder based on ARM microcontroller
- Author
-
Liu Kun, Xu Zhe, and Li Feifei
- Subjects
flight data recorder ,unmanned helicopter ,ARM microcontroller ,file system ,SD card ,Electronics ,TK7800-8360 - Abstract
To record the flight data of an unmanned helicopter in real time, and to facilitate subsequent analysis of the helicopter's flight state and mathematical modeling, this paper presents a data recorder designed as a combined software and hardware solution with good practicality, generality, and portability. The design is based on an ARM microcontroller combined with an SD card and a file system, and commonly used communication interfaces are added so the recorder can connect to different devices using different communication protocols. In actual running tests, the flight data was reliably stored in pre-created files, facilitating subsequent data analysis, curve plotting, and so on. The design also has practical value and potential for wider adoption.
- Published
- 2018
- Full Text
- View/download PDF
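A minimal sketch of the recorder's logging path described in entry 36: fixed-format flight-data records appended to a file on the SD card's file system. The record layout, field names, and sample values are assumptions for illustration, not the paper's actual format.

    # Illustrative logging loop: pack a fixed-size record and append it to a
    # log file, flushing promptly so little data is lost on power failure.
    import struct
    import time

    RECORD = struct.Struct("<Ifff")   # timestamp_ms, roll, pitch, yaw (assumed)

    def log_sample(f, roll: float, pitch: float, yaw: float) -> None:
        f.write(RECORD.pack(int(time.monotonic() * 1000), roll, pitch, yaw))
        f.flush()

    with open("flight.bin", "ab") as f:            # file on the SD card
        for roll, pitch, yaw in [(0.1, -0.2, 3.0), (0.2, -0.1, 3.1)]:
            log_sample(f, roll, pitch, yaw)        # stand-in sensor readings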
37. Fast, Durable, and Safe Data Management Support for Persistent Memory
- Author
-
Hoseinzadeh, Morteza
- Subjects
Computer science ,File System ,Non-Volatile Memory ,Persistent Memory ,Persistent Memory Programming ,Programming Library ,Tiering Storage System - Abstract
Emerging fast, byte-addressable Persistent Memory (PM) considerably increases storage performance compared to traditional disks and makes it possible to build complex data structures that can survive system failures. However, programming for PM is challenging, not least because it combines well-known programming challenges like locking, memory management, and pointer safety with novel PM-specific bug types. It also requires logging updates to PM to facilitate recovery after a crash. A misstep in any area can corrupt data, leak resources, or prevent successful recovery after a crash. Additionally, its high price limits its ability to scale in capacity the way traditional block devices do. This dissertation first presents Corundum, a Rust-based library with an idiomatic PM programming interface that leverages Rust's type system to statically avoid the most common PM programming bugs. Corundum lets programmers develop persistent data structures using familiar Rust constructs and have confidence that they will be free of those bugs. We have implemented Corundum and found its performance to be as good as or better than Intel's widely used PMDK library, HP's Atlas, Mnemosyne, and go-pmem. The dissertation then presents Carbide, a robust multilingual programming framework that allows developing safe data structures in Corundum and using them in C++. Carbide strictly checks the data structure implementation for PM safety and allows using the structures without restriction in C++ through an accurate persistent type checking process. Our experimental results show that Carbide not only incurs no significant slowdown, but is even faster than using Corundum alone in some of our benchmarks, owing to the flexibility it provides. Finally, the dissertation introduces Ziggurat, a tiered file system that combines PM and slow disks to create a high-performance, large-capacity storage system. Ziggurat steers incoming writes to PM, DRAM, or disk depending on application access patterns, write size, and the likelihood that the application will stall until the write completes. Our experimental results show that with a small amount of PM and a large SSD, Ziggurat achieves up to 38.9X and 46.5X throughput improvements compared with EXT4 and XFS running on an SSD alone, respectively.
- Published
- 2021
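The write-steering policy that entry 37 attributes to Ziggurat can be illustrated with a toy decision function: small synchronous writes go to PM, asynchronous writes are buffered in DRAM, and the rest go to disk. The threshold and predicates below are guesses for illustration, not the dissertation's actual policy.

    # Illustrative write-steering policy with an assumed 4 KiB cutoff.
    PM_THRESHOLD = 4096    # bytes

    def steer_write(size: int, synchronous: bool) -> str:
        if synchronous and size <= PM_THRESHOLD:
            return "PM"    # the app stalls until persistence: use fast PM
        if not synchronous:
            return "DRAM"  # buffer in DRAM and flush to disk later
        return "DISK"      # large synchronous write: disk bandwidth suffices

    assert steer_write(512, True) == "PM"
    assert steer_write(1 << 20, False) == "DRAM"
    assert steer_write(1 << 20, True) == "DISK"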
38. A Case for Application-Managed Flash.
- Author
-
Koo, Jinhyung, Chung, Chanwoo, Arvind, and Lee, Sungjin
- Subjects
- *
SOLID state drives , *FLASH memory , *RANDOM access memory , *COMPUTER architecture - Abstract
We propose a new I/O architecture for NAND flash-based SSDs, called application-managed flash (AMF), and present two case studies to show its usefulness. In a typical SSD controller, an intermediate software layer, called the flash translation layer (FTL), is employed between the NAND flash chips and the host interface. The main responsibility of an FTL is to provide interoperability with conventional HDDs, but this interoperability comes at the cost of extra hardware resources and degraded I/O performance. The proposed AMF refactors the flash storage architecture so that the SSD controller exposes append-only segments, which do not permit overwriting. This refactoring dramatically improves application performance and reduces hardware costs by allowing applications to directly manage flash storage with minimal support from the SSD controller. To understand the benefits of AMF, we study two popular applications: a log-structured file system (F2FS) and a key-value store (RocksDB). Our experiments show that the DRAM in the flash controller is reduced by 128X and that the performance of the file system and the key-value store improves by 80 and 54 percent, respectively, over conventional SSDs. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
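A sketch of the append-only segment interface that entry 38 says AMF exposes: segments may be appended to and erased as a whole, never overwritten in place, which is what lets the host manage flash directly. This is a host-side Python model for illustration, not the actual AMF interface.

    # Host-side model of an append-only segment: append and whole-segment
    # trim are offered; in-place overwrite simply does not exist.
    class Segment:
        def __init__(self, size: int):
            self.size, self.tail, self.data = size, 0, bytearray()

        def append(self, buf: bytes) -> int:
            if self.tail + len(buf) > self.size:
                raise IOError("segment full: allocate a new segment")
            off, self.tail = self.tail, self.tail + len(buf)
            self.data += buf
            return off                 # caller records (segment, offset)

        def trim(self) -> None:
            self.tail, self.data = 0, bytearray()   # erase the whole segment

    seg = Segment(size=4096)
    pos = seg.append(b"log-structured payload")
    assert pos == 0 and seg.tail == len(b"log-structured payload")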
39. Variable Block Scheme for Minimizing File I/O.
- Author
-
YoungJun Yoo, Sun Sopharath, Jin Kim, SungBong Jang, and YoungWoong Ko
- Subjects
FILES (Records) ,ROCK glaciers - Abstract
In a conventional operating system, file modification overhead is very high because whole data blocks must be rewritten to the storage system whenever we delete or insert even a few bytes on the disk. To reduce the overhead of file modification, we need to provide a byte-stream operation on the block device that can delete bytes of data without block-alignment constraints. In this paper, we describe a variable-length block scheme for modifying the contents of a file when some data is deleted. The variable-length block technique minimizes block writes by reserving a bumper area in each block so that, when some bytes are deleted, the file contents can be preserved while rewriting only the block where the modification occurred. With this approach, we can mimic byte-stream operations on a block-based disk storage system. Experimental results show that the performance of the proposed system is superior to that of a conventional file system. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
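The bumper-area idea of entry 39 can be illustrated as follows: because each block keeps slack space, deleting bytes shrinks only the affected block, and no following block has to shift. The block size and structure below are illustrative assumptions.

    # Deleting bytes inside one variable-length block: the block's bumper
    # (slack) absorbs the shrink, so only this block is rewritten.
    BLOCK_SIZE = 4096

    class VarBlock:
        def __init__(self, payload: bytes):
            assert len(payload) <= BLOCK_SIZE
            self.buf = bytearray(payload)   # bytes past len(buf) are bumper

        def delete(self, off: int, n: int) -> None:
            del self.buf[off:off + n]       # shrink in place
            # only this block goes back to disk; neighbours are untouched

    blk = VarBlock(b"hello, world")
    blk.delete(5, 7)
    assert bytes(blk.buf) == b"hello"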
40. GPFS HPSS Integration: Implementation Experience
- Author
-
Hazen, Damian
- Subjects
Mathematics and Computing ,GPFS ,HPSS ,file system - Abstract
In 2005 NERSC and IBM Global Services Federal began work to develop an integrated HSM solution using the GPFS file system and the HPSS hierarchical storage system. It was foreseen that this solution would play a key role in data management at NERSC, and fill a market niche for IBM. As with many large and complex software projects, there were a number of unforeseen difficulties encountered during implementation. As the effort progressed, it became apparent that DMAPI alone could not be used to tie two distributed, high performance systems together without serious impact on performance. This document discusses the evolution of the development effort, from one which attempted to synchronize the GPFS and HPSS name spaces relying solely on GPFS's implementation of the DMAPI specification, to one with a more traditional HSM functionality that had no synchronized namespace in HPSS, and finally to an effort, still underway, which will provide traditional HSM functionality, but requires features from the GPFS Information Lifecycle Management (ILM) to fully achieve this goal in a way which is scalable and meets the needs of sites with aggressive performance requirements. The last approach makes concessions to portability by using file system features such as ILM and snapshotting in order to achieve a scalable design.
- Published
- 2008
41. LDFS: A Low Latency In-Line Data Deduplication File System
- Author
-
Yongtao Zhou, Yuhui Deng, Laurence T. Yang, Ru Yang, and Lei Si
- Subjects
Massive data ,inline data deduplication ,file system ,low latency ,disk bottleneck ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Due to the rapid proliferation of sensors and intelligent devices, cyber-physical-social computing and networking (CPSCN) is emerging as a new computing paradigm, and massive amounts of data are generated in CPSCN environments. Traditional data deduplication cannot handle such environments because of the long latency it involves. This paper presents a low-latency in-line data deduplication file system (LDFS). LDFS decouples unique data blocks from the fingerprint index by writing the addresses of data blocks into both the corresponding file recipe and the fingerprint index, thus avoiding fingerprint-index accesses on the read path. For every unique data block, LDFS assigns a globally unique ID, so only one disk access is required to obtain the block's reference count using the global ID. To guarantee write performance, LDFS employs finer-granularity locks to optimize the block-flushing strategy of the write buffer. Experimental results demonstrate that LDFS significantly improves read and write performance on the critical path compared to the traditional deduplication file system LessFS, while achieving almost the same deduplication ratio (40.8) as LessFS.
- Published
- 2018
- Full Text
- View/download PDF
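A toy model of the LDFS write path from entry 41: the fingerprint index maps a chunk hash to a global block ID and address, the file recipe stores the address directly so reads never consult the index, and the reference count is reachable in one lookup by global ID. All structures and names are illustrative simplifications.

    # Toy write/read path: recipes hold block addresses, so reads bypass the
    # fingerprint index entirely; refcounts are keyed by a global block ID.
    import hashlib

    index = {}       # fingerprint -> (block_id, address)
    refcount = {}    # block_id -> reference count
    store, recipes = [], {}
    next_id = 0

    def write_chunk(path: str, chunk: bytes) -> None:
        global next_id
        fp = hashlib.sha1(chunk).digest()
        if fp not in index:                      # new unique block
            store.append(chunk)
            index[fp] = (next_id, len(store) - 1)
            refcount[next_id] = 0
            next_id += 1
        bid, addr = index[fp]
        refcount[bid] += 1                       # one lookup by global ID
        recipes.setdefault(path, []).append(addr)

    def read_file(path: str) -> bytes:
        return b"".join(store[a] for a in recipes[path])   # no index access

    write_chunk("a.txt", b"dup"); write_chunk("b.txt", b"dup")
    assert read_file("b.txt") == b"dup" and len(store) == 1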
42. Secure decentralized electronic health records sharing system based on blockchains
- Author
-
Farag Sallabi, Juhar Ahmed Abdella, Mohamed Adel Serhani, and Khaled Shuaib
- Subjects
File system ,Blockchain ,General Computer Science ,Computer science ,business.industry ,020206 networking & telecommunications ,Denial-of-service attack ,02 engineering and technology ,computer.software_genre ,Centralized database ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Single point of failure ,business ,Byzantine fault tolerance ,Database transaction ,computer ,Computer network - Abstract
Blockchain technology has great potential for improving the efficiency, security, and privacy of Electronic Health Records (EHR) sharing systems. However, existing solutions relying on a centralized database are susceptible to traditional security problems such as Denial of Service (DoS) attacks and a single point of failure, just like traditional database systems. In addition, past solutions exposed users to privacy linking attacks and did not tackle performance and scalability challenges. In this paper, we propose a permissioned Blockchain-based healthcare data sharing system that integrates Blockchain technology, a decentralized file system, and threshold signatures to address the aforementioned problems. The proposed system is based on the Istanbul Byzantine Fault Tolerant (IBFT) consensus algorithm and the InterPlanetary File System (IPFS). We implemented the proposed system on an enterprise Ethereum Blockchain known as Hyperledger Besu, and evaluated and compared its performance using metrics such as transaction latency, throughput, and failure rate. Experiments were conducted with varying network sizes and numbers of transactions. The experimental results indicate that the proposed system performs better than existing Blockchain-based systems. Moreover, the decentralized file system provides better security than existing centralized database systems while delivering the same level of performance.
- Published
- 2022
- Full Text
- View/download PDF
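The off-chain/on-chain split used by the system in entry 42 can be sketched with a content-addressed store standing in for IPFS and an append-only list standing in for the permissioned ledger: only the content hash goes on chain, and the hash doubles as a tamper check on retrieval. This toy model omits encryption, IBFT, and threshold signatures.

    # Toy off-chain store plus on-chain hash records; the content hash is
    # both the address and the integrity check.
    import hashlib
    import json
    import time

    off_chain = {}   # content hash -> record blob (stands in for IPFS)
    ledger = []      # append-only entries (stand in for the permissioned chain)

    def share_record(patient_id: str, blob: bytes) -> str:
        cid = hashlib.sha256(blob).hexdigest()
        off_chain[cid] = blob
        ledger.append(json.dumps({"patient": patient_id, "cid": cid,
                                  "ts": time.time()}))
        return cid

    def fetch_record(cid: str) -> bytes:
        blob = off_chain[cid]
        assert hashlib.sha256(blob).hexdigest() == cid   # tamper check
        return blob

    cid = share_record("p-001", b"encrypted-ehr-bytes")
    assert fetch_record(cid) == b"encrypted-ehr-bytes"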
43. Distributed Ledger Technology-based land transaction system with trusted nodes consensus mechanism
- Author
-
Dharmender Singh Kushwaha, Shivani Agrawal, and Amrendra Singh Yadav
- Subjects
File system ,Blockchain ,General Computer Science ,Computer science ,Node (networking) ,020206 networking & telecommunications ,02 engineering and technology ,Computer security ,computer.software_genre ,Broadcasting (networking) ,Land registration ,Scalability ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,computer ,Database transaction ,Block (data storage) - Abstract
Blockchain technology is an emerging solution for various financial applications that require a secure and tamper-proof transaction system. One such application is the land registry. Managing transactions for land registration is a tedious task: it is highly insecure and subject to forged land records, verification issues, reliance on mediators, and so on. This article presents a scalable and novel property/land registration framework based on blockchain technology. The transparent and immutable nature of blockchain technology makes the framework resistant to intrusive and fraudulent land registration activities. The framework uses the InterPlanetary File System (IPFS), a Peer-to-Peer (P2P) swarm network, which allows data to be shared and stored in an efficient, decentralized, and transparent manner. This article also proposes a Trusted Node Consensus Algorithm (TNCA) based on the miners' trust values that minimizes the computational overhead of broadcasting a new block. The proposed trusted node consensus algorithm requires less time to add a block to the blockchain: message exchanges per consensus round are 58.94% fewer than with the traditional Proof-of-Work (PoW) approach, and the time required is 26.44% less than with the existing load-balanced method.
- Published
- 2022
- Full Text
- View/download PDF
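A toy version of the trusted-node idea in entry 43: rather than broadcasting a candidate block to every miner as PoW does, only the top-k nodes by trust value validate it, which is where the reduction in message exchanges comes from. The ranking, the value of k, and the trust-update rule below are assumptions for illustration, not TNCA's actual rules.

    # Toy trusted-node consensus: only the k most-trusted nodes validate a
    # candidate block, so far fewer messages circulate than under PoW.
    def validate(block, node) -> bool:      # stand-in for block verification
        return True

    def select_validators(trust: dict, k: int = 3) -> list:
        return sorted(trust, key=trust.get, reverse=True)[:k]

    def commit_block(block, trust: dict, k: int = 3) -> bool:
        validators = select_validators(trust, k)
        votes = sum(1 for v in validators if validate(block, v))
        if votes > len(validators) // 2:    # majority of trusted nodes
            for v in validators:
                trust[v] += 1               # assumed reward for participation
            return True
        return False

    trust = {"n1": 9, "n2": 7, "n3": 5, "n4": 1}
    assert commit_block({"txs": []}, trust)
    assert select_validators(trust) == ["n1", "n2", "n3"]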
44. Training and Placement Cell Automation
- Author
-
Singh, Amit Kumar, Kaushik, Ayush, Chandana, G. V., Chitra, A., and Mala, M.
- Abstract
The objective of the project is to automate the training and placement unit of AMC Engineering College by minimizing manual work and enhancing optimization, abstraction, and security. The proposed solution is a web application that can be accessed by authorized personnel throughout the organization via secure login credentials. The application will have features that cater to both students and administration staff. Students will be able to fill out a registration form, and the information will be stored in the system for future reference; registration is a one-time exercise that students will not have to repeat. The system will enable administration staff to manage student information related to placement, including maintaining student details, generating lists of requested candidates, and searching for information posted by students. The benefits of this project are evident: it will reduce manual work and enhance efficiency, allowing the college to achieve full IT deployment. It will also enable quick access to placement-related activities, making the process more convenient for both students and staff. In summary, this project has the potential to streamline the Training and Placement unit of AMC Engineering College, making it more efficient and effective.
- Published
- 2023
45. CS5604: Team 1 ETD Collection Management
- Author
-
Jain, Tanya, Bhagat, Hirva, Lee, Wen-Yu, Thukkaraju, Ashrith Reddy, and Sethi, Raghav
- Abstract
Academic institutions the world over are known to produce hundreds of thousands of ETDs (Electronic Theses and Dissertations) every year. At the end of an academic year, we are left with large volumes of ETD data that are rarely used for further research or ever cited in future work, writings, or publications. As part of the CS5604: Information Storage and Retrieval graduate-level course at Virginia Polytechnic Institute and State University (Virginia Tech), we collectively created a search engine for a collection of more than 500,000 ETDs from academic institutions in the United States, which constitutes the class-wide project. This system enables users to ingest, pre-process, and store ETDs in a repository; apply deep learning models to perform topic modeling, text segmentation, chapter summarization, and classification, backed by a DevOps, user experience and integrations team. We are Team 1 or the “ETD Collection Management” team. During the course of the Fall 2022 semester at Virginia Tech, we were responsible for setting up the repository of ETDs, which encompasses broadly the following three components: (1) setting up a database, (2) storing digital objects in a file system, and (3) creating a knowledge graph. Our work enabled other teams to efficiently retrieve the stored ETD data, and perform appropriate pre-processing operations, and during the final few months of the semester, to apply the aforementioned deep learning models to the ETD collection we created. The key deliverable for Team 1 was to create an interactive user interface to perform CRUD operations (create, retrieve, update, and delete) in order to interact with the repository of ETDs, which is essentially an extrapolation of the work already taken up at Virginia Tech’s Digital Library Research Laboratory. Owing to the fact that the other teams had no direct access to the repository set up by us, we designed a host of Application Programming Interfaces (APIs) which are elaborated in depth in the
- Published
- 2023
46. Do Metadata-based Deleted-File-Recovery (DFR) Tools Meet NIST Guidelines?
- Author
-
Andrew Meyer and Sankardas Roy
- Subjects
deleted file recovery ,digital forensics ,metadata ,nist guidelines ,file system ,fat ,ntfs ,Technology - Abstract
Digital forensics (DF) tools are used for post-mortem investigation of cyber-crimes. The CFTT (Computer Forensics Tool Testing) Program at the National Institute of Standards and Technology (NIST) has defined expectations for a DF tool's behavior. Understanding these expectations and how DF tools work is critical for ensuring the integrity of forensic analysis results. In this paper, we consider standardization of one class of DF tools, those for Deleted File Recovery (DFR). We design a list of canonical test file system images to evaluate a DFR tool. Via extensive experiments we find that many popular DFR tools do not satisfy some of the standards, and we compile a comparative analysis of these tools, which could help the user choose the right tool. Furthermore, one of our research questions identifies the factors that make a DFR tool fail. We also provide a critique of the applicability of the standards. Our findings are likely to trigger more research on standards compliance from the research community as well as practitioners.
- Published
- 2019
- Full Text
- View/download PDF
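To make concrete what a metadata-based DFR tool of the kind evaluated in entry 46 relies on: in FAT, a deleted file's 32-byte directory entry survives with its size and start cluster intact, only the first name byte being overwritten with the 0xE5 marker. The sketch below parses those fields from a fabricated entry; the offsets follow the FAT on-disk layout.

    # Parse the recoverable metadata from a (fabricated) 32-byte FAT directory
    # entry: first name byte 0xE5 marks deletion; start cluster and size remain.
    import struct

    def parse_dir_entry(entry: bytes):
        deleted = entry[0] == 0xE5                              # deletion marker
        start_cluster = struct.unpack_from("<H", entry, 26)[0]  # low word
        size = struct.unpack_from("<I", entry, 28)[0]
        return deleted, start_cluster, size

    entry = bytearray(32)
    entry[0:11] = b"\xE5ILE    TXT"          # deleted "FILE.TXT", 8.3 name field
    struct.pack_into("<H", entry, 26, 3)     # start cluster 3
    struct.pack_into("<I", entry, 28, 1234)  # file size 1234 bytes
    assert parse_dir_entry(bytes(entry)) == (True, 3, 1234)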
47. Scientific Data Management Center for Enabling Technologies
- Author
-
Liao, Wei-keng
- Published
- 2011
- Full Text
- View/download PDF
48. Data Deduplication for Efficient Cloud Storage and Retrieval.
- Author
-
Misal, Rishikesh and Perumal, Boominathan
- Published
- 2019
49. Efficient and Consistent NVMM Cache for SSD-Based File System.
- Author
-
Chen, Youmin, Lu, Youyou, Chen, Pei, and Shu, Jiwu
- Subjects
- *
DYNAMIC random access memory , *SOLID state drives , *FILES (Records) , *RANDOM access memory - Abstract
Buffer caching is an effective approach to improve system performance and extend the lifetime of SSDs. However, the frequent synchronization operations in most real-world applications limit these advantages. This paper proposes adopting emerging non-volatile main memories (NVMMs) to relieve the above problems while achieving both efficient and consistent cache management. To this end, an adaptive fine-grained cache (AFCM) scheme is proposed, motivated by our observation that the file data in many synchronized pages is only partially updated across a wide range of workloads, implying that fine-grained cache management can save the NVMM cache space wasted by the clean parts. To reduce the cache-index overhead introduced by fine-grained cache management, AFCM employs a Hybrid Cache based on DRAM and NVMM, with which normal read and write operations are served without performance penalty. We also propose a Transactional Copy-on-Write mechanism to guarantee the crash consistency of both the NVMM cache space and the file system image. Our experimental results show that AFCM provides up to 84 percent performance improvement and 63 percent SSD write reduction on average compared to the conventional coarse-grained cache management scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
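The observation behind AFCM in entry 49, that synchronized pages are often only partially dirty, suggests persisting just the dirty byte runs of a page into the NVMM cache. The sketch below is an illustrative model of that fine-grained bookkeeping, not AFCM's actual data structures, and it ignores the DRAM side of the Hybrid Cache and the Transactional Copy-on-Write mechanism.

    # Persist only the dirty byte runs of a synchronized page into the NVMM
    # cache; clean bytes never consume cache space.
    PAGE = 4096

    class FineGrainedCache:
        def __init__(self):
            self.nvmm = {}   # (page_no, offset) -> dirty bytes

        def sync_page(self, page_no: int, old: bytes, new: bytes) -> int:
            runs, i = 0, 0
            while i < PAGE:
                if old[i] != new[i]:
                    j = i
                    while j < PAGE and old[j] != new[j]:
                        j += 1
                    self.nvmm[(page_no, i)] = new[i:j]   # one dirty run
                    runs, i = runs + 1, j
                else:
                    i += 1
            return runs

    cache = FineGrainedCache()
    old, new = bytes(PAGE), bytearray(PAGE)
    new[100:104] = b"abcd"
    cache.sync_page(0, old, bytes(new))
    assert cache.nvmm[(0, 100)] == b"abcd"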
50. Do Metadata-based Deleted-File-Recovery (DFR) Tools Meet NIST Guidelines?
- Author
-
Meyer, Andrew and Roy, Sankardas
- Subjects
METADATA ,ELECTRONIC file management ,CYBERTERRORISM ,COMPUTER crimes ,HARD disks - Abstract
Digital forensics (DF) tools are used for post-mortem investigation of cyber-crimes. The CFTT (Computer Forensics Tool Testing) Program at the National Institute of Standards and Technology (NIST) has defined expectations for a DF tool's behavior. Understanding these expectations and how DF tools work is critical for ensuring the integrity of forensic analysis results. In this paper, we consider standardization of one class of DF tools, those for Deleted File Recovery (DFR). We design a list of canonical test file system images to evaluate a DFR tool. Via extensive experiments we find that many popular DFR tools do not satisfy some of the standards, and we compile a comparative analysis of these tools, which could help the user choose the right tool. Furthermore, one of our research questions identifies the factors that make a DFR tool fail. We also provide a critique of the applicability of the standards. Our findings are likely to trigger more research on standards compliance from the research community as well as practitioners. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF