Red-RF: Reduced Random Forest for Big Data Using Priority Voting & Dynamic Data Reduction

Hussein Mohsen, Hasan Kurban, Kurt Zimmer, Mark Jenne, Mehmet M. Dalkilic

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

9 Citations (Scopus)

Abstract

Random Forests (RFs) have been used as effective ensemble models for classification. In this paper we present a new type of RF, called Red(uced) RF, that adopts a new dynamic data reduction principle and a new voting mechanism called Priority Vote Weighting (PV), which improve accuracy, execution time, and AUC values compared to Breiman's RF. Red-RF also shows that the strength of a random forest can increase without noticeably increasing the correlation between its trees. We compare the performance of Red-RF and Breiman's RF in 8 experiments involving classification problems with datasets of different sizes. Finally, we conduct 2 additional experiments on considerably larger datasets of one million points each.

Original language: English
Title of host publication: Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015
Editors: Latifur Khan, Barbara Carminati
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 118-125
Number of pages: 8
ISBN (Electronic): 9781467372787
Publication status: Published - 17 Aug 2015
Externally published: Yes
Event: 4th IEEE International Congress on Big Data, BigData Congress 2015 - New York City, United States
Duration: 27 Jun 2015 → 2 Jul 2015

Publication series

Name: Proceedings - 2015 IEEE International Congress on Big Data, BigData Congress 2015

Conference

Conference: 4th IEEE International Congress on Big Data, BigData Congress 2015
Country/Territory: United States
City: New York City
Period: 27/06/15 → 2/07/15

Keywords

  • big data
  • classification
  • random forests
  • weighted voting
