RAPTOR: A Five-Safes approach to a secure, cloud native and serverless genomics data repository

bioRxiv (2022)

Abstract
Genomic researchers are increasingly utilizing commercial cloud platforms (CCPs) to manage their data and analytics needs. Commercial clouds allow researchers to grow their storage and analytics capacity on demand, keeping pace with expanding project data footprints and enabling researchers to avoid large capital expenditures while paying only for IT capacity consumed by their project. Cloud computing also allows researchers to overcome common network and storage bottlenecks encountered when combining or re-analysing large datasets. However, cloud computing presents a new set of challenges. Without adequate security controls, the risk of unauthorised access may be higher for data stored on the cloud. In addition, regulators are increasingly mandating data access patterns and specific security protocols on the storage and use of genomic data to safeguard rights of the study participants. While CCPs provide tools for security and regulatory compliance, utilising these tools to build the necessary controls required for cloud solutions is not trivial as such skill sets are not commonly found in a genomics lab. The Research Assets Provisioning and Tracking Online Repository (RAPTOR) by the Genome Institute of Singapore is a cloud native genomics data repository and analytics platform focusing on security and regulatory compliance. Using a “five-safes” framework (Safe Purpose, Safe People, Safe Settings, Safe Data and Safe Output), RAPTOR provides security and governance controls to data contributors and users leveraging cloud computing for sharing and analysis of large genomic datasets without the risk of security breaches or running afoul of regulations. 
RAPTOR can also enable data federation with other genomic data repositories using GA4GH community-defined standards, allowing researchers to boost the statistical power of their work and overcome geographic and ancestry limitations of data sets.

### Competing Interest Statement

The authors have declared no competing interest.

* AAI: Authentication and Authorisation Infrastructure
* AMI: Amazon Machine Image
* API: Application Programming Interface
* AWS: Amazon Web Services
* BAA: Business Associate Agreement
* CCP: Commercial Cloud Platform
* CREST: Council of Registered Ethical Security Testers
* CSP: Cloud Service Provider
* CVE: Common Vulnerabilities and Exposures
* DAC: Data Access Committee
* DRS: Data Repository Service
* DUO: Data Use Ontology
* EBS: Elastic Block Store
* EC2: Elastic Compute Cloud
* EMR: Elastic MapReduce
* GA4GH: Global Alliance for Genomics and Health
* GCP: Google Cloud Platform
* HIPAA: Health Insurance Portability and Accountability Act
* IAM: Identity and Access Management
* IMDA: Infocomm Media Development Authority
* KMS: Key Management Service
* MAF: Minor Allele Frequency
* MTCS: Multi-Tier Cloud Security
* POAG: Primary Open Angle Glaucoma
* QLDB: Quantum Ledger Database
* RAPTOR: Research Assets Provisioning and Tracking Online Repository
* RDP: Remote Desktop Protocol
* S3: Simple Storage Service
* SCHS: Singapore Chinese Health Study
* SSH: Secure Shell
* TCP/IP: Transmission Control Protocol/Internet Protocol
* TES: Task Execution Service
* TOPMed: Trans-Omics for Precision Medicine
* URI: Uniform Resource Identifier
* VPC: Virtual Private Cloud
* WES: Workflow Execution Service