Investigating GPU-Accelerated Kernel Density Estimators for Join Selectivity Estimation

semanticscholar(2016)

Abstract
Kernel density estimators (KDEs) are a well-known statistical tool for estimating probability density functions from samples drawn from an unknown distribution. The estimate is obtained by centering local probability density functions, so-called kernels, on the sample points and averaging them. The technique has been applied successfully to computing base table selectivities for hyper-rectangular range queries over real-valued attributes in relational databases. It has been shown that both estimate computation and error-driven hyper-parameter optimization can be accelerated efficiently using a GPU. In this thesis, we extend this approach by using the categorical kernel, which centers a probability mass function on discrete-valued attributes. The resulting models estimate the joint relative frequency distribution over those attributes and can be used to estimate base table selectivities subject to conjunctive equality predicates. We show how, and under which assumptions, independent KDE models for two different tables can be used to efficiently compute the selectivity of joins subject to such selections. We provide an algorithm that computes the estimates and the gradients required for hyper-parameter optimization. Furthermore, we sketch how these computations can be implemented on a GPU. We show how our approach generalizes to multiple subsequent joins, highlight properties of the estimator, and provide an experimental evaluation using an artificial and a real-world dataset.
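As a rough illustration of the kind of estimator described above, the following Python sketch averages categorical kernels over a sample to estimate the selectivity of conjunctive equality predicates, then combines two such models for an equi-join. It is a minimal sketch under stated assumptions, not the thesis's algorithm: the abstract does not name the kernel, so the Aitchison-Aitken kernel is assumed here, and the function names, the per-attribute smoothing parameters `lams`, and the rule of summing the product of both models over the join domain are all hypothetical choices made for illustration.

```python
from math import prod


def aa_kernel(x, center, num_categories, lam):
    """Aitchison-Aitken categorical kernel (an assumption; not named in the abstract).

    Puts mass 1 - lam on the sample value and spreads lam evenly over the
    remaining categories, so the kernel sums to 1 over the attribute domain.
    """
    if num_categories == 1:
        return 1.0
    return 1.0 - lam if x == center else lam / (num_categories - 1)


def selectivity(sample, predicate, domain_sizes, lams):
    """Estimate the fraction of rows matching conjunctive equality predicates.

    sample:       list of tuples drawn from the table
    predicate:    dict {attribute index: required value}
    domain_sizes: number of categories per attribute
    lams:         smoothing parameter per attribute (the hyper-parameters)

    Attributes without a predicate are skipped, because their kernel
    sums to 1 over the whole domain.
    """
    total = 0.0
    for row in sample:
        total += prod(
            aa_kernel(value, row[d], domain_sizes[d], lams[d])
            for d, value in predicate.items()
        )
    return total / len(sample)


def join_selectivity(sample_r, sample_s, join_r, join_s, join_domain,
                     pred_r, pred_s, sizes_r, sizes_s, lams_r, lams_s):
    """Estimate the fraction of the cross product R x S that passes both
    selections and the equi-join R[join_r] = S[join_s].

    Hypothetical combination rule: treat the two KDE models as independent
    and sum the product of their estimates over every join-key value.
    """
    total = 0.0
    for v in join_domain:
        p_r = selectivity(sample_r, {**pred_r, join_r: v}, sizes_r, lams_r)
        p_s = selectivity(sample_s, {**pred_s, join_s: v}, sizes_s, lams_s)
        total += p_r * p_s
    return total
```

With all smoothing parameters set to 0, each kernel reduces to an indicator function and the estimate equals the relative frequency in the sample. Larger values move probability mass to categories that the sample did not contain. The loop over the join domain evaluates each key value independently of the others, which is the kind of work that could be spread across GPU threads.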