p-ClustVal: A Novel p-Adic Approach for Enhanced Clustering of High-Dimensional scRNASeq Data (Extended Abstract)

Parichit Sharma, Sarthak Mishra, Hasan Kurban, Mehmet Dalkilic

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

This paper introduces p-ClustVal, a novel data transformation technique inspired by p-adic number theory that significantly enhances cluster discernibility in genomics data, specifically Single Cell RNA Sequencing (scRNASeq). By leveraging the p-adic valuation, p-ClustVal integrates with and augments widely used clustering algorithms and dimension reduction techniques, amplifying their effectiveness in discovering meaningful structure in data. The transformation uses a data-centric heuristic to determine optimal parameters without relying on ground truth labels, making it more user-friendly. p-ClustVal reduces overlap between clusters by employing alternate metric spaces inspired by the p-adic valuation, a significant shift from conventional methods. Our comprehensive evaluation, spanning 30 experiments and over 1200 observations, shows that p-ClustVal improves performance in 91% of cases and boosts the performance of both classical and state-of-the-art (SOTA) methods. This work contributes to data analytics and genomics by introducing a unique data transformation approach, enhancing downstream clustering algorithms, and providing empirical evidence of p-ClustVal's efficacy.
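For readers unfamiliar with the term, the following minimal Python sketch illustrates only the standard number-theoretic definition of the p-adic valuation and the absolute value it induces. It is not the p-ClustVal transformation, whose details the abstract does not give; the function names `p_adic_valuation`, `p_adic_abs`, and `p_adic_distance` are hypothetical and chosen for illustration.

```python
from fractions import Fraction


def p_adic_valuation(n: int, p: int) -> int:
    """Return the largest k such that p**k divides n (standard definition).

    Assumes n is a nonzero integer and p is a prime; this helper is
    illustrative and is not taken from the paper.
    """
    if p < 2:
        raise ValueError("p must be a prime >= 2")
    if n == 0:
        raise ValueError("the valuation of 0 is conventionally +infinity")
    n = abs(n)
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k


def p_adic_abs(n: int, p: int) -> Fraction:
    """Return the p-adic absolute value |n|_p = p**(-v_p(n)), with |0|_p = 0."""
    if n == 0:
        return Fraction(0)
    return Fraction(1, p ** p_adic_valuation(n, p))


def p_adic_distance(a: int, b: int, p: int) -> Fraction:
    """Return the p-adic distance |a - b|_p between two integers."""
    return p_adic_abs(a - b, p)


# 48 = 2**4 * 3, so its 2-adic valuation is 4 and its 3-adic valuation is 1.
print(p_adic_valuation(48, 2), p_adic_valuation(48, 3))
# 1 and 9 differ by 8 = 2**3, so they are 2-adically close (distance 1/8).
print(p_adic_distance(1, 9, 2))
```

Under this metric, two integers are close when their difference is divisible by a high power of p, which is a different notion of proximity from the usual Euclidean one.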

Original language: English
Title of host publication: 2024 IEEE 11th International Conference on Data Science and Advanced Analytics, DSAA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350364941
Publication status: Published - 10 Oct 2024
Event: 11th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2024 - San Diego, United States
Duration: 6 Oct 2024 to 10 Oct 2024

Keywords

  • Data-Centric AI
  • Single Cell RNA Sequencing
  • Unsupervised Learning
  • p-Adic Numbers
