Message Passing Parallel Inplace Access Processor Stylus Server Coordinator Query Plan Stylus Server Intermediate Result Stylus Server Metadata Stylus Server

semanticscholar(2017)

引用 0|浏览9
暂无评分
摘要
RDF is one of the most commonly used knowledge representation forms. Many highly influential knowledge bases, such as Freebase and PubChemRDF, are in RDF format. An RDF data set is usually represented as a collection of subject-predicate-object triples. Despite the flexibility of RDF triples, it is challenging to serve SPARQL queries on RDF data efficiently by directly managing triples due to the following two reasons. First, heavy joins on a large number of triples are needed for query processing, resulting in a large number of data scans and large redundant intermediate results; Second, weakly-typed triple representation provides suboptimal random access – typically with logarithmic complexity. This data access challenge, unfortunately, cannot be easily met by a better query optimizer as large graph processing is extremely I/O-intensive. In this paper, we argue that strongly-typed graph representation is the key to high-performance RDF query processing. We propose Stylus – a strongly-typed store for serving massive RDF data. Stylus exploits a strongly-typed storage scheme to boost the performance of RDF query processing. The storage scheme is essentially a materialized join view on entities, it thus can eliminate a large number of unnecessary joins on triples. Moreover, it is equipped with a compact representation for intermediate results and an efficient graphdecomposition based query planner. Experimental results on both synthetic and real-life RDF data sets confirm that the proposed approach can dramatically boost the performance of SPARQL query processing. PVLDB Reference Format: Liang He, Bin Shao, Yatao Li, Huanhuan Xia, Yanghua Xiao, Enhong Chen, Liang Jeff Chen. Stylus: A Strongly-Typed Store for Serving Massive RDF Data. PVLDB, 11(2): xxxx-yyyy, 2018. DOI: 10.14778/3149193.3149200 ∗This work was done in Microsoft Research Asia. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Articles from this volume were invited to present their results at The 44th International Conference on Very Large Data Bases, August 2018, Rio de Janeiro, Brazil. Proceedings of the VLDB Endowment, Vol. 11, No. 2 Copyright 2017 VLDB Endowment 2150-8097/17/10... $ 10.00. DOI: 10.14778/3149193.3149200
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要