Distributed optimal subsampling for quantile regression with massive data

Journal of Statistical Planning and Inference (2024)

Abstract
Methods for reducing subsample sizes in distributed settings have become an increasingly popular statistical topic in the big data era. Optimal subsample selection for linear and generalized linear models with massive data from distributed sources has been thoroughly investigated and widely applied. Nevertheless, few studies have developed distributed optimal subsample selection procedures for quantile regression with massive data. In such settings, the distributed optimal subsampling probabilities and the selection criteria for subset sizes need to be established simultaneously. In this work, we propose a distributed subsampling technique for quantile regression models. The estimation approach is based on a two-step algorithm for the distributed subsampling procedure. Furthermore, theoretical results, including consistency and asymptotic normality of the resulting estimators, are rigorously established under some regularity conditions. The performance of the proposed subsampling method is evaluated empirically through simulation experiments and real data applications.
Keywords
Massive data, Distributed data sources, Quantile regression, Optimal subsampling probabilities, Optimal distributed subset sizes