RAPTOR: A Five-Safes approach to a secure, cloud native and serverless genomics data repository
bioRxiv (2022)
Abstract
Genomic researchers are increasingly utilising commercial cloud platforms (CCPs) to manage their data and analytics needs. Commercial clouds allow researchers to grow their storage and analytics capacity on demand, keeping pace with expanding project data footprints and enabling researchers to avoid large capital expenditures while paying only for the IT capacity consumed by their project. Cloud computing also allows researchers to overcome common network and storage bottlenecks encountered when combining or re-analysing large datasets. However, cloud computing presents a new set of challenges. Without adequate security controls, the risk of unauthorised access may be higher for data stored on the cloud. In addition, regulators are increasingly mandating data access patterns and specific security protocols for the storage and use of genomic data to safeguard the rights of study participants. While CCPs provide tools for security and regulatory compliance, utilising these tools to build the controls required for cloud solutions is not trivial, as such skill sets are not commonly found in a genomics lab. The Research Assets Provisioning and Tracking Online Repository (RAPTOR) by the Genome Institute of Singapore is a cloud native genomics data repository and analytics platform focusing on security and regulatory compliance. Using a “five-safes” framework (Safe Purpose, Safe People, Safe Settings, Safe Data and Safe Output), RAPTOR provides security and governance controls to data contributors and users leveraging cloud computing for sharing and analysis of large genomic datasets without the risk of security breaches or running afoul of regulations. RAPTOR can also enable data federation with other genomic data repositories using GA4GH community-defined standards, allowing researchers to boost the statistical power of their work and overcome geographic and ancestry limitations of datasets.
### Competing Interest Statement
The authors have declared no competing interest.
### Abbreviations
AAI
: Authentication and Authorisation Infrastructure
AMI
: Amazon Machine Image
API
: Application Programming Interface
AWS
: Amazon Web Services
BAA
: Business Associate Agreement
CCP
: Commercial Cloud Platform
CREST
: Council of Registered Ethical Security Testers
CSP
: Cloud Service Provider
CVE
: Common Vulnerabilities and Exposures
DAC
: Data Access Committee
DRS
: Data Repository Service
DUO
: Data Use Ontology
EBS
: Elastic Block Store
EC2
: Elastic Compute Cloud
EMR
: Elastic MapReduce
GA4GH
: Global Alliance for Genomics and Health
GCP
: Google Cloud Platform
HIPAA
: Health Insurance Portability and Accountability Act
IAM
: Identity and Access Management
IMDA
: Infocomm Media Development Authority
KMS
: Key Management Service
MAF
: Minor Allele Frequency
MTCS
: Multi-Tier Cloud Security
POAG
: Primary Open Angle Glaucoma
QLDB
: Quantum Ledger Database
RAPTOR
: Research Assets Provisioning and Tracking Online Repository
RDP
: Remote Desktop Protocol
S3
: Simple Storage Service
SCHS
: Singapore Chinese Health Study
SSH
: Secure Shell
TCP/IP
: Transmission Control Protocol/Internet Protocol
TES
: Task Execution Service
TOPMed
: Trans-Omics for Precision Medicine
URI
: Uniform Resource Identifier
VPC
: Virtual Private Cloud
WES
: Workflow Execution Service