TY - GEN
T1 - Skel
T2 - 7th IEEE International Conference on e-Science Workshops, eScienceW 2011
AU - Logan, Jeremy
AU - Klasky, Scott
AU - Lofstead, Jay
AU - Abbasi, Hasan
AU - Ethier, Stéphane
AU - Grout, Ray
AU - Ku, Seung Hoe
AU - Liu, Qing
AU - Ma, Xiaosong
AU - Parashar, Manish
AU - Podhorszki, Norbert
AU - Schwan, Karsten
AU - Wolf, Matthew
PY - 2011
Y1 - 2011
N2 - Massively parallel computations consist of a mixture of computation, communication, and I/O. Of these three components, implementing an effective parallel I/O solution has often been overlooked by application scientists and has typically been added to large-scale simulations only when existing serial techniques have failed. As scientists' teams scaled their codes to run on hundreds of processors, it was common to call on an I/O expert to implement a set of more scalable I/O routines. These routines were easily separated from the calculations and communication, and in many cases an I/O kernel was derived from the application that could be used for testing I/O performance independent of the application. These I/O kernels developed a life of their own and were used as a broad measure for comparing different I/O techniques. Unfortunately, as years passed and changes to computation and communication required changes to the I/O, the separate I/O kernel used for benchmarking remained static, no longer providing an accurate indicator of the I/O performance of the simulation and making I/O research less relevant for the application scientists. In this paper we describe a new approach to this problem in which I/O kernels are replaced with skeletal I/O applications that are automatically generated from an abstract set of simulation I/O parameters. We realize this abstraction by leveraging the ADIOS [1] middleware's XML I/O specification with additional runtime parameters. Skeletal applications offer all of the benefits of I/O kernels, including allowing I/O optimizations to focus on useful I/O patterns. Moreover, since they are automatically generated, it is easy to produce an updated I/O skeleton whenever the simulation's I/O changes. In this paper we analyze the performance of automatically generated I/O skeletal applications for the S3D and GTS codes. We show that these skeletal applications achieve performance comparable to that of the production applications. We wrap up the paper with a discussion of future changes to make the skeletal application better approximate the actual I/O performed in the simulation.
UR - http://www.scopus.com/inward/record.url?scp=84863078107&partnerID=8YFLogxK
U2 - 10.1109/eScienceW.2011.26
DO - 10.1109/eScienceW.2011.26
M3 - Conference contribution
AN - SCOPUS:84863078107
SN - 9780769545981
T3 - Proceedings - 7th IEEE International Conference on e-Science Workshops, eScienceW 2011
SP - 191
EP - 198
BT - Proceedings - 7th IEEE International Conference on e-Science Workshops, eScienceW 2011
Y2 - 5 December 2011 through 8 December 2011
ER -