DRS+: Load Shedding Meets Resource Auto-Scaling in Distributed Stream Processing

Kailin Tang, Zhifeng Hao, Ruichu Cai, Tom Z.J. Fu, Yin Yang, Li Wang, Marianne Winslett, Zhenjie Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Distributed stream processing is now a leading paradigm for managing massive streaming data and performing real-time analytics on such streams. Since data volume and distribution in the input streams often change over time, a dynamic resource scheduler is often employed to ensure that the system meets response time constraints while being cost effective in terms of resource usage. Currently, resource schedulers in distributed streaming systems generally assume that each input must be completely processed, and allocate resources accordingly. In practice, this assumption can often be relaxed, since many stream analytics tasks do not require exact outputs. To our knowledge, however, no existing resource scheduler takes this fact into consideration, leading to unnecessary over-provisioning of resources.

This paper presents DRS+, a novel dynamic resource scheduler that integrates load shedding into resource auto-scaling strategies. DRS+ is based on a unified model that establishes the relationship between response time, result accuracy and resource consumption, given the current workload statistics. Using this model, DRS+ computes the best resource allocation plan and load shedding strategy, and executes them through an efficient protocol that minimizes the computation and communication overhead at each operator. We have implemented DRS+ based on Apache Storm, and evaluated it using a real dataset. The results demonstrate that DRS+ achieves low resource consumption and high result utility, while satisfying real-time response constraints.

Original language: English
Title of host publication: Proceedings - 2020 IEEE 22nd International Conference on High Performance Computing and Communications, IEEE 18th International Conference on Smart City and IEEE 6th International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 292-301
Number of pages: 10
ISBN (Electronic): 9781728176499
Publication status: Published - Dec 2020
Event: 22nd IEEE International Conference on High Performance Computing and Communications, 18th IEEE International Conference on Smart City and 6th IEEE International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020 - Virtual, Fiji, Fiji
Duration: 14 Dec 2020 → 16 Dec 2020

Publication series

Name: Proceedings - 2020 IEEE 22nd International Conference on High Performance Computing and Communications, IEEE 18th International Conference on Smart City and IEEE 6th International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020

Conference

Conference: 22nd IEEE International Conference on High Performance Computing and Communications, 18th IEEE International Conference on Smart City and 6th IEEE International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020
Country/Territory: Fiji
City: Virtual, Fiji
Period: 14/12/20 → 16/12/20
