Stream Reasoning

Encyclopedia of Database Systems (2018)

Abstract
ion techniques like data filtering or summarizing are available and can be readily used. At the global level in a distributed setting, the main challenge is that global information, such as the overall routing strategy or the communication branches, is invisible to the nodes in the system. Each node knows only its parent and child nodes, and none has a clear enough picture of the system status to decide on the routing strategy or to drop input/intermediate results at bottlenecks in order to improve global performance. To overcome this, the nodes need sophisticated methods to communicate enough information so that they can recognize the need to adjust the global strategy and inform the other nodes.

Furthermore, a number of parameters need fine-grained investigation, namely (i) the system size, i.e., the number of distributed nodes; (ii) the density and topology of the communication network; and (iii) the size of the knowledge base at each local node. The key question is how far these parameters can be scaled while performance requirements are still met, and what the limits are when one parameter grows while the others are fixed.

At both the local and the global level, several reasoning tasks (inconsistency checking, query answering, etc.) and restrictions (approximate semantics, syntactic restrictions, etc.) are of interest. Investigating their computational complexity for various query languages gives hints on the theoretical scalability border. For example, for Datalog, depending on the variant, the complexity ranges from AC^0 to Σ^p_2 under well-founded and answer set semantics [?].

Benchmarking and comparison. The main stream processing/reasoning approaches have been proposed independently, and few works compare them extensively. Linear Road [?] is currently the only complete benchmarking system for DSMSs; it simulates an ad-hoc toll system for motor-vehicle expressways and uses the L-Rating as a single performance metric, i.e., the number of expressways a system can process while meeting time and correctness constraints. Other aspects of interest for comparison are still open, e.g., expressiveness, manageable query patterns, scalability under varying static data size, the number of simultaneous queries, etc. A preliminary work [?] proposed methods to cross-compare LSD processing engines on aspects such as functionality, correctness, and performance. It captures basic evaluation tasks such as projection, filtering, and joins rather than reasoning aspects, but it can serve as a starting point for a more general comparison.

The lack of extensive comparison is due to several obstacles. On the practical side, comparing stream engines is non-trivial for several reasons: (i) they are based on different semantics, which a benchmarking system must respect; (ii) evaluation can be push- or pull-based (also called data- vs. time-driven, or eager vs. periodical evaluation): the former triggers computation whenever new input arrives, while the latter runs it periodically, independent of input arrival, so outputs may be missed (resp., repeated) when the input rate is too high (resp., too low) compared to the processing rate; (iii) the engines are fragile, as any runtime deviation (e.g., a processing delay) can lead to inaccurate output, especially for queries with aggregates; (iv) some engines are black boxes. On the theoretical side, there is no formal foundation on which existing stream processing/reasoning approaches can be captured and thus compared systematically.
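To make point (ii) concrete, the following minimal Python sketch, which is engine-agnostic and not taken from any of the systems above (the window size, the 5-second period, and names such as WindowCount, push_eval, and pull_eval are illustrative assumptions), evaluates the same windowed count once eagerly on every arrival and once periodically on a fixed clock.

```python
# Minimal, engine-agnostic sketch (hypothetical names and parameters) contrasting
# push-based (eager, data-driven) and pull-based (periodic, time-driven)
# evaluation of a simple count over a sliding 10-second time window.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class WindowCount:
    """Counts stream elements whose timestamp lies inside the last `size` seconds."""
    size: float = 10.0
    items: List[float] = field(default_factory=list)  # arrival timestamps

    def add(self, ts: float) -> None:
        self.items.append(ts)

    def count(self, now: float) -> int:
        # Expire elements that fell out of the window, then count the rest.
        self.items = [t for t in self.items if now - t <= self.size]
        return len(self.items)


def push_eval(arrivals: List[float]) -> List[Tuple[float, int]]:
    """Push-based: re-evaluate the query on every arrival."""
    win, out = WindowCount(), []
    for ts in arrivals:
        win.add(ts)
        out.append((ts, win.count(ts)))
    return out


def pull_eval(arrivals: List[float], period: float, horizon: float) -> List[Tuple[float, int]]:
    """Pull-based: re-evaluate every `period` seconds, independent of arrivals."""
    win, out, i, t = WindowCount(), [], 0, period
    while t <= horizon:
        while i < len(arrivals) and arrivals[i] <= t:  # ingest what arrived so far
            win.add(arrivals[i])
            i += 1
        out.append((t, win.count(t)))
        t += period
    return out


if __name__ == "__main__":
    arrivals = [1.0, 1.2, 1.4, 1.6, 12.0]           # a fast burst, then a late element
    print("push:", push_eval(arrivals))             # one output per arrival
    print("pull:", pull_eval(arrivals, 5.0, 15.0))  # outputs only at t = 5, 10, 15
```

With these arrivals, the push variant emits a count after each of the five inputs, while the pull variant with a 5-second period reports only three readings: the initial burst is collapsed into a single value, which is then repeated unchanged at the next tick. This is exactly the missed/repeated-output effect a benchmark has to account for.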
On both the theoretical and the practical side, the main open issue is to find sufficiently generic measurements and metrics, along with methods to realize them, so as to enable a fair comparison.
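Such a generic measurement can be sketched in the spirit of Linear Road's L-Rating mentioned above. The rough Python illustration below is not the actual Linear Road harness; run_benchmark, the latency bound, and the assumption that performance degrades monotonically with load are hypothetical placeholders. It returns the largest load level (e.g., number of expressways) at which a system still meets its time and correctness constraints.

```python
# Hedged sketch of a generic "highest sustainable load" metric in the spirit of
# Linear Road's L-Rating. `run_benchmark`, the latency bound, and the monotonic
# degradation assumption are illustrative, not part of the actual benchmark.
from typing import Callable, NamedTuple


class RunResult(NamedTuple):
    max_latency_s: float  # worst-case response time observed during the run
    correct: bool         # all answers matched the expected results


def load_rating(run_benchmark: Callable[[int], RunResult],
                max_load: int,
                latency_bound_s: float) -> int:
    """Largest load level L in 1..max_load (e.g., number of expressways) for
    which the system meets the latency bound and answers correctly; 0 if none."""
    rating = 0
    for load in range(1, max_load + 1):
        result = run_benchmark(load)
        if result.correct and result.max_latency_s <= latency_bound_s:
            rating = load
        else:
            break  # assumes higher load never performs better than lower load
    return rating


if __name__ == "__main__":
    def fake(load: int) -> RunResult:
        # Toy stand-in for a real benchmark run: latency grows with load.
        return RunResult(max_latency_s=0.8 * load, correct=True)

    print(load_rating(fake, max_load=10, latency_bound_s=5.0))  # prints 6
```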
Keywords
stream