Differentially Private Data Generation with Missing Data

CoRR(2023)

引用 0|浏览11
暂无评分
摘要
Despite several works that succeed in generating synthetic data with differential privacy (DP) guarantees, they are inadequate for generating high-quality synthetic data when the input data has missing values. In this work, we formalize the problems of DP synthetic data with missing values and propose three effective adaptive strategies that significantly improve the utility of the synthetic data on four real-world datasets with different types and levels of missing data and privacy requirements. We also identify the relationship between privacy impact for the complete ground truth data and incomplete data for these DP synthetic data generation algorithms. We model the missing mechanisms as a sampling process to obtain tighter upper bounds for the privacy guarantees to the ground truth data. Overall, this study contributes to a better understanding of the challenges and opportunities for using private synthetic data generation algorithms in the presence of missing data.
更多
查看译文
关键词
private data generation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要