High performance statistical computing with parallel R: Applications to biology and climate modelling

Nagiza F. Samatova*, Marcia Branstetter, Auroop R. Ganguly, Robert Hettich, Shiraj Khan, Guruprasad Kora, Jiangtian Li, Xiaosong Ma, Chongle Pan, Arie Shoshani, Srikanth Yoginath

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

9 Citations (Scopus)

Abstract

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem: the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable, high-performance statistical data analysis framework that helps scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal, we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high-performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.

Original language: English
Article number: 069
Pages (from-to): 505-509
Number of pages: 5
Journal: Journal of Physics: Conference Series
Volume: 46
Issue number: 1
Publication status: Published - 1 Oct 2006
Externally published: Yes
