K Nearest Neighbor OveRsampling approach: An open source python package for data augmentation

Ashhadul Islam*, Samir Brahim Belhaouari, Atiq Ur Rehman, Halima Bensmail

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Data is abundant, but imbalanced datasets remain a recurring problem that hampers classifiers and reduces accuracy. This paper introduces the K Nearest Neighbor OveRsampling (KNNOR) algorithm, a novel data augmentation technique that considers the distribution of the data and takes the k nearest neighbors into account when generating artificial data points. The KNNOR algorithm has outperformed state-of-the-art augmentation algorithms, enabling classifiers to achieve much higher accuracy after artificial minority data points are injected into imbalanced datasets. The method is especially useful for health datasets, where imbalance is common, and can also be applied to lower-dimensional images.
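
The abstract describes the approach only at a high level. As a rough illustration of neighbor-based oversampling in general, the Python sketch below creates synthetic minority points by interpolating between a minority sample and one of its k nearest minority neighbors. It is a generic SMOTE-style scheme written with NumPy, not the KNNOR algorithm and not the package's API; the function name `knn_oversample`, its parameters, and the toy data are assumptions made for this example.

```python
import numpy as np

def knn_oversample(X_min, n_new, k=3, seed=0):
    """Illustrative neighbor-based oversampling (hypothetical helper, not KNNOR).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # a point cannot be its own neighbor
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # exclude self-matches

    # Indices of the k nearest neighbors of every sample, shape (n, k).
    neighbors = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)          # which sample to start from
    pick = rng.integers(0, k, size=n_new)          # which of its neighbors to use
    nbr = neighbors[base, pick]
    alpha = rng.random((n_new, 1))                 # interpolation factor in [0, 1)

    return X_min[base] + alpha * (X_min[nbr] - X_min[base])

# Toy minority class with four 2-D points (made-up data).
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = knn_oversample(X_minority, n_new=10, k=2)
print(synthetic.shape)
```

Because every synthetic point is a convex combination of two existing minority points, the generated samples stay inside the region already covered by the minority class. The published method additionally accounts for the data distribution, which this sketch does not attempt to reproduce.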

Original language: English
Article number: 100272
Journal: Software Impacts
Volume: 12
Publication status: Published - May 2022

Keywords

  • Data augmentation
  • Imbalanced data
  • Machine learning
  • Nearest neighbor
