Introducing Polyglot-Based Data-Flow Awareness to Time-Series Data Stores

IEEE Access (2022)

Abstract
The rising interest in extracting value from data has led to a broad proliferation of monitoring infrastructures, most notably composed of sensors, intended to collect this "new oil". Thus, gathering data has become fundamental for a great number of applications, such as predictive maintenance techniques or anomaly detection algorithms. However, before data can be refined into insights and knowledge, it has to be efficiently stored and prepared for later retrieval. As a consequence of this sensor and IoT boom, time-series databases (TSDBs), designed to manage sensor data, have been the fastest-growing database category since 2019. Here we propose a holistic approach intended to improve the performance and efficiency of TSDBs. More precisely, we introduce and evaluate a novel polyglot-based approach that aims to tailor the data store not only to time-series data, as is done conventionally, but also to the data flow itself, from ingestion to retrieval. To evaluate the approach, we materialize it in an alternative implementation of NagareDB, a resource-efficient time-series database based on MongoDB, which is in turn the most popular NoSQL storage solution. After implementing our approach in the database, we observe a global speedup, with queries solved up to 12 times faster than with MongoDB's recently launched time-series capability, and generally better performance than InfluxDB, the most popular time-series database. Our polyglot-based, data-flow-aware solution can ingest data more than twice as fast as MongoDB, InfluxDB, and NagareDB's original implementation, while using the same disk space as InfluxDB and half of that required by MongoDB.
Keywords
Databases, Data models, Sensor phenomena and characterization, Monitoring, Time series analysis, Structured Query Language, Real-time systems, Cascading polyglot persistence, data-flow awareness, data cascade, data store, data stream, MongoDB, multi-model database, NagareDB, time-series database