TY - GEN
T1 - OceanRT
T2 - 2014 ACM SIGMOD International Conference on Management of Data, SIGMOD 2014
AU - Zhang, Shiming
AU - Yang, Yin
AU - Fan, Wei
AU - Lan, Liang
AU - Yuan, Mingxuan
PY - 2014
Y1 - 2014
N2 - We demonstrate OceanRT, a novel cloud-based infrastructure that performs online analytics in real time, over large-scale temporal data such as call logs from a telecommunication company. Apart from proprietary systems for which few details have been revealed, most existing big-data analytics systems are built on top of an offline, MapReduce-style infrastructure, which inherently limits their efficiency. In contrast, OceanRT employs a novel computing architecture consisting of interconnected Access Query Engines (AQEs), as well as a new storage scheme that ensures data locality and fast access for temporal data. Our preliminary evaluation shows that OceanRT can be up to 10× faster than Impala [10], 12× faster than Shark [5], and 200× faster than Hive [13]. The demo will show how OceanRT manages a real call log dataset (around 5TB per day) from a large mobile network operator in China. Besides presenting the processing of a few preset queries, we also allow the audience to issue ad hoc HiveQL [13] queries, watch how OceanRT answers them, and compare the speed of OceanRT with its competitors.
AB - We demonstrate OceanRT, a novel cloud-based infrastructure that performs online analytics in real time, over large-scale temporal data such as call logs from a telecommunication company. Apart from proprietary systems for which few details have been revealed, most existing big-data analytics systems are built on top of an offline, MapReduce-style infrastructure, which inherently limits their efficiency. In contrast, OceanRT employs a novel computing architecture consisting of interconnected Access Query Engines (AQEs), as well as a new storage scheme that ensures data locality and fast access for temporal data. Our preliminary evaluation shows that OceanRT can be up to 10× faster than Impala [10], 12× faster than Shark [5], and 200× faster than Hive [13]. The demo will show how OceanRT manages a real call log dataset (around 5TB per day) from a large mobile network operator in China. Besides presenting the processing of a few preset queries, we also allow the audience to issue ad hoc HiveQL [13] queries, watch how OceanRT answers them, and compare the speed of OceanRT with its competitors.
KW - Design
KW - Management
KW - Performance
UR - http://www.scopus.com/inward/record.url?scp=84904327816&partnerID=8YFLogxK
U2 - 10.1145/2588555.2594513
DO - 10.1145/2588555.2594513
M3 - Conference contribution
AN - SCOPUS:84904327816
SN - 9781450323765
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1099
EP - 1102
BT - SIGMOD 2014 - Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 22 June 2014 through 27 June 2014
ER -