TY - GEN
T1 - Optimizing cross-platform data movement
AU - Kruse, Sebastian
AU - Kaoudi, Zoi
AU - Quiane-Ruiz, Jorge Arnulfo
AU - Chawla, Sanjay
AU - Naumann, Felix
AU - Contreras-Rojas, Bertty
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/4
Y1 - 2019/4
N2 - Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.
AB - Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.
KW - Cross-platform
KW - Data movement
KW - Polystore
KW - Query optimization
UR - http://www.scopus.com/inward/record.url?scp=85067927337&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2019.00162
DO - 10.1109/ICDE.2019.00162
M3 - Conference contribution
AN - SCOPUS:85067927337
T3 - Proceedings - International Conference on Data Engineering
SP - 1642
EP - 1645
BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
PB - IEEE Computer Society
T2 - 35th IEEE International Conference on Data Engineering, ICDE 2019
Y2 - 8 April 2019 through 11 April 2019
ER -