TY - GEN
T1 - Enhancing data migration performance via parallel data compression
AU - Lee, Jonghyun
AU - Winslett, M.
AU - Ma, Xiaosong
AU - Yu, Shengke
N1 - Publisher Copyright: © 2002 IEEE.
PY - 2002
Y1 - 2002
N2 - Scientific simulations often produce large volumes of output that are moved to another platform for visualization or storage. This long-distance migration is slow due to the data size and slow network. Compression can improve migration performance by reducing the data size, but compression is computation-intensive and so can raise costs. In this work, we show how to reduce data migration cost by incorporating compression into migration. We analyze eight scientific data sets, and propose three approaches for parallel compression of scientific data. Our results show that with reasonably fast processors and typical parallel configurations, the compression cost for large scientific data is outweighed by the performance gain obtained by migrating less data. We found that a client-side compression approach (CC) can improve I/O and migration performance by an order of magnitude. In our experiments, CC always matches or outperforms migration without compression when we overlap migration with computation, even for not very compressible dense floating point data. We also present a variant of CC that is well suited for use with implementations of two-phase I/O.
UR - http://www.scopus.com/inward/record.url?scp=84966559314&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2002.1015528
DO - 10.1109/IPDPS.2002.1015528
M3 - Conference contribution
AN - SCOPUS:84966559314
T3 - Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002
SP - 444
EP - 451
BT - Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th International Parallel and Distributed Processing Symposium, IPDPS 2002
Y2 - 15 April 2002 through 19 April 2002
ER -