1. RAPTOR: A Five-Safes approach to a secure, cloud native and serverless genomics data repository
- Author
-
Chih Chuan Shih, Jieqi Chen, Ai Shan Lee, Nicolas Bertin, Maxime Hebrard, Chiea Chuen Khor, Zheng Li, Joanna Hui Juan Tan, Wee Yang Meah, Su Qin Peh, Shi Qi Mok, Kar Seng Sim, Jianjun Liu, Ling Wang, Eleanor Wong, Jingmei Li, Aung Tin, Ching-Yu Cheng, Chew-Kiat Heng, Jian-Min Yuan, Woon-Puay Koh, Seang Mei Saw, Yechiel Friedlander, Xueling Sim, Jin Fang Chai, Yap Seng Chong, Sonia Davila, Liuh Ling Goh, Eng Sing Lee, Tien Yin Wong, Neerja Karnani, Khai Pang Leong, Khung Keong Yeo, John C Chambers, Su Chi Lim, Rick Siow Mong Goh, Patrick Tan, and Rajkumar Dorajoo
- Subjects
History ,Polymers and Plastics ,Business and International Management ,Industrial and Manufacturing Engineering - Abstract
Genomic researchers are increasingly utilizing commercial cloud platforms (CCPs) to manage their data and analytics needs. Commercial clouds allow researchers to grow their storage and analytics capacity on demand, keeping pace with expanding project data footprints and enabling researchers to avoid large capital expenditures while paying only for IT capacity consumed by their project. Cloud computing also allows researchers to overcome common network and storage bottlenecks encountered when combining or re-analysing large datasets. However, cloud computing presents a new set of challenges. Without adequate security controls, the risk of unauthorised access may be higher for data stored on the cloud. In addition, regulators are increasingly mandating data access patterns and specific security protocols on the storage and use of genomic data to safeguard rights of the study participants. While CCPs provide tools for security and regulatory compliance, utilising these tools to build the necessary controls required for cloud solutions is not trivial as such skill sets are not commonly found in a genomics lab. The Research Assets Provisioning and Tracking Online Repository (RAPTOR) by the Genome Institute of Singapore is a cloud native genomics data repository and analytics platform focusing on security and regulatory compliance. Using a “five-safes” framework (Safe Purpose, Safe People, Safe Settings, Safe Data and Safe Output), RAPTOR provides security and governance controls to data contributors and users leveraging cloud computing for sharing and analysis of large genomic datasets without the risk of security breaches or running afoul of regulations. RAPTOR can also enable data federation with other genomic data repositories using GA4GH community-defined standards, allowing researchers to boost the statistical power of their work and overcome geographic and ancestry limitations of data sets
- Published
- 2022