TY - GEN
T1 - RFS
T2 - 2004 IEEE International Conference on Cluster Computing, ICCC 2004
AU - Lee, Jonghyun
AU - Ma, Xiaosong
AU - Ross, Robert
AU - Thakur, Rajeev
AU - Winslett, Marianne
PY - 2004
Y1 - 2004
N2 - Scientific applications often need to access remote file systems. However, because of slow networks and large data sizes, remote I/O can become an even more serious performance bottleneck than local I/O. In this work, we present RFS, a high-performance remote I/O facility for ROMIO, which is a well-known MPI-IO implementation. Our simple, portable, and flexible design eliminates the shortcomings of previous remote I/O efforts. In particular, RFS improves remote I/O performance by adopting active buffering with threads (ABT), which hides I/O cost by aggressively buffering the output data using available memory and performing background I/O using threads while computation is taking place. Our experimental results show that RFS with ABT can significantly reduce the visible cost of remote I/O, achieving up to 92% of the theoretical peak throughput. The computation slowdown caused by concurrent I/O activities was 0.2-6.2%, which is dwarfed by the overall performance improvement in application turnaround time.
AB - Scientific applications often need to access remote file systems. However, because of slow networks and large data sizes, remote I/O can become an even more serious performance bottleneck than local I/O. In this work, we present RFS, a high-performance remote I/O facility for ROMIO, which is a well-known MPI-IO implementation. Our simple, portable, and flexible design eliminates the shortcomings of previous remote I/O efforts. In particular, RFS improves remote I/O performance by adopting active buffering with threads (ABT), which hides I/O cost by aggressively buffering the output data using available memory and performing background I/O using threads while computation is taking place. Our experimental results show that RFS with ABT can significantly reduce the visible cost of remote I/O, achieving up to 92% of the theoretical peak throughput. The computation slowdown caused by concurrent I/O activities was 0.2-6.2%, which is dwarfed by the overall performance improvement in application turnaround time.
UR - http://www.scopus.com/inward/record.url?scp=20444440714&partnerID=8YFLogxK
U2 - 10.1109/CLUSTR.2004.1392604
DO - 10.1109/CLUSTR.2004.1392604
M3 - Conference contribution
AN - SCOPUS:20444440714
SN - 0780386949
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 71
EP - 81
BT - 2004 IEEE International Conference on Cluster Computing, ICCC 2004
Y2 - 20 September 2004 through 23 September 2004
ER -