TY - GEN
T1 - DMA-assisted, intranode communication in GPU accelerated systems
AU - Ji, Feng
AU - Aji, Ashwin M.
AU - Dinan, James
AU - Buntinas, Darius
AU - Balaji, Pavan
AU - Thakur, Rajeev
AU - Feng, Wu-chun
AU - Ma, Xiaosong
PY - 2012
Y1 - 2012
AB - Accelerator awareness has become a pressing issue in data movement models, such as MPI, because of the rapid deployment of systems that utilize accelerators. In our previous work, we developed techniques to enhance MPI with accelerator awareness, thus allowing applications to easily and efficiently communicate data between accelerator memories. In this paper, we extend this work with techniques to perform efficient data movement between accelerators within the same node using a DMA-assisted, peer-to-peer intranode communication technique that was recently introduced for NVIDIA GPUs. We present a detailed design of our new approach to intranode communication and evaluate its improvement to communication and application performance using micro-kernel benchmarks and a 2D stencil application kernel.
KW - GPU
KW - Intranode communication
KW - MPI
UR - http://www.scopus.com/inward/record.url?scp=84870460850&partnerID=8YFLogxK
U2 - 10.1109/HPCC.2012.69
DO - 10.1109/HPCC.2012.69
M3 - Conference contribution
AN - SCOPUS:84870460850
SN - 9780769547497
T3 - Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
SP - 461
EP - 468
BT - Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
T2 - 14th IEEE International Conference on High Performance Computing and Communications, HPCC-2012 - 9th IEEE International Conference on Embedded Software and Systems, ICESS-2012
Y2 - 25 June 2012 through 27 June 2012
ER -