DISE: A Distributed in-Memory SPARQL Processing Engine over Tensor Data

2020 IEEE 14th International Conference on Semantic Computing (ICSC), 2020

Abstract
SPARQL is a W3C standard for querying data stored in the Resource Description Framework (RDF). SPARQL queries are expressed as triple patterns, and query evaluation searches for these patterns in the given RDF data. Most existing SPARQL evaluators are centralized, DBMS-inspired solutions that consume substantial resources and offer limited flexibility. To cope with the growing size of RDF data, it is important to develop scalable and efficient solutions for distributed SPARQL query evaluation. In this paper, we present DISE, an open-source implementation of a distributed in-memory SPARQL engine that can scale out to a cluster of machines. DISE represents the RDF graph as a three-way distributed tensor for querying large-scale RDF datasets. This distributed tensor representation offers opportunities for novel distributed applications. DISE translates SPARQL queries into Spark tensor operations by exploiting information about query complexity and creating a dynamic execution plan. We have tested the scalability and efficiency of DISE on different datasets. The results show that querying over this new representation is scalable, efficient, and comparable to a related approach.
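To make the tensor view of RDF concrete, the following is a minimal single-machine sketch in Python, not DISE's actual implementation (which, per the abstract, runs on Spark). It assumes that a three-way tensor over (subject, predicate, object) can be stored sparsely as the set of its non-zero coordinates, and that a triple pattern is matched by fixing the bound dimensions. The names `build_tensor`, `match`, and `decode` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Coord = Tuple[int, int, int]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def build_tensor(triples: List[Triple]) -> Tuple[Dict[str, int], Set[Coord]]:
    """Dictionary-encode RDF terms and return the non-zero coordinates
    of a sparse boolean tensor indexed by (subject, predicate, object)."""
    ids: Dict[str, int] = {}

    def encode(term: str) -> int:
        # Assign the next free integer id the first time a term is seen.
        return ids.setdefault(term, len(ids))

    coords = {(encode(s), encode(p), encode(o)) for s, p, o in triples}
    return ids, coords


def match(pattern: Pattern, ids: Dict[str, int], coords: Set[Coord]) -> Set[Coord]:
    """Return all coordinates matching a triple pattern.
    A None entry in the pattern stands for a SPARQL variable."""
    bound: List[Optional[int]] = []
    for term in pattern:
        if term is None:
            bound.append(None)
        elif term in ids:
            bound.append(ids[term])
        else:
            return set()  # a constant that never occurs cannot match
    return {
        c for c in coords
        if all(b is None or b == v for b, v in zip(bound, c))
    }


def decode(coords: Set[Coord], ids: Dict[str, int]) -> Set[Triple]:
    """Map integer coordinates back to RDF terms."""
    terms = {i: t for t, i in ids.items()}
    return {(terms[s], terms[p], terms[o]) for s, p, o in coords}


# Illustrative data; the triples are made up for this example.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
ids, coords = build_tensor(triples)

# Pattern "?s knows ?o": only the predicate dimension is bound.
knows = decode(match((None, "knows", None), ids, coords), ids)
```

In a distributed setting, the coordinate set would presumably be partitioned across workers and the filter applied in parallel; that part is outside the scope of this sketch.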
Keywords
SPARQL, Scalable, distributed, SPARK, query, RDF, SANSA, tensor