Parameter Database: Data-centric Synchronization for Scalable Machine Learning

Naman Goel, Divyakant Agrawal, Sanjay Chawla, Ahmed Khalifa Elmagarmid

Research output: Book/Report; Commissioned report; peer-reviewed

Abstract

We propose a new data-centric synchronization framework for carrying out machine learning (ML) tasks in a distributed environment. Our framework exploits the iterative nature of ML algorithms and relaxes the application-agnostic bulk synchronous parallel (BSP) paradigm that has previously been used for distributed machine learning. Data-centric synchronization complements function-centric synchronization, which is based on using stale updates to increase the throughput of distributed ML computations. Experiments to validate our framework suggest that it can attain substantial improvement over BSP while guaranteeing sequential correctness of ML tasks.
Original language: English
Publication status: Published - 21 Jun 2015
