Scalable Time Series Compound Infrastructure

Noura S. Alghamdi, Liang Zhang, Elke A. Rundensteiner, Mohamed Y. Eltabakh

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Citations (Scopus)

Abstract

Objects ranging from a patient's history of medical tests to an IoT device's series of sensor maintenance records leave digital traces in the form of big time series. These time series objects not only span exceedingly long time periods (sometimes years), but are also characterized by intermittent yet interrelated time series measurements punctuated by long gaps of silence. This prevalent data type, which we refer to as Time Series Compound objects (or TSC), has been largely overlooked in the literature. Unique challenges arise when managing, querying, and analyzing repositories of these big TSC objects. These include appropriate similarity semantics with time misalignment resiliency, efficient storage of excessively long and complex objects, and TSC-holistic indexing. We demonstrate that state-of-the-art time series systems, although effective at indexing and searching regular time series data, fail to support such big TSC data. In this work, we introduce the first comprehensive solution for managing TSC objects as first-class citizens. We introduce new similarity-match semantics as well as a compact misalignment-resilient representation for TSCs. Upon this foundation, we then design a TSC-aware distributed indexing infrastructure, Sloth, that supports scalable storage, indexing, and querying of TB-scale TSC datasets. Our experimental study demonstrates that for TB-scale datasets, the query response time of Sloth is up to one order of magnitude faster than that of existing systems, while Sloth's approximate kNN similarity match query results are 70% more accurate, in terms of mean average precision (mAP), than those of existing solutions.

Original language: English
Title of host publication: SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1685-1698
Number of pages: 14
ISBN (Electronic): 9781450392495
DOIs
Publication status: Published - 10 Jun 2022
Externally published: Yes
Event: 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022 - Virtual, Online, United States
Duration: 12 Jun 2022 to 17 Jun 2022

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
Country/Territory: United States
City: Virtual, Online
Period: 12/06/22 to 17/06/22

Keywords

  • KNN approximate query
  • distributed indexing
  • similarity search
  • sloth
  • time series compound
