OutPyR: Bayesian inference for RNA-Seq outlier detection

Edin Salkovic*, Mostafa M. Abbas, Samir Brahim Belhaouari, Khaoula Errafii, Halima Bensmail

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

High-throughput RNA sequencing (RNA-Seq) has recently come into use as a tool for helping diagnose rare genetic disorders, because it can reveal abnormal gene expression counts, a telltale sign of genetic pathology. Existing solutions either require a large number of samples or do not provide proper statistical significance testing. We present a Bayesian model (OutPyR) for identifying abnormal RNA-Seq gene expression counts in datasets, particularly those with a small number of samples. The model incorporates recently introduced data-augmentation techniques to efficiently and accurately infer the parameters of the underlying negative binomial process, while also assessing the uncertainty of the inference and making it possible to generate simulated data. The model's software implementation is object-oriented and thus easily extensible, and it provides parameter-trace exploration as well as fault tolerance and recovery during parameter estimation. We also develop a p-value-based outlier score that stems naturally from our model. We apply the model to real and simulated datasets for different organisms and tissues, and present comparisons with existing models. The model is implemented purely in Python, and its standalone source code is available at https://github.com/esalkovic/outpyr.
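
As a rough illustration of what a p-value-based outlier score under a negative binomial model can look like, the sketch below computes a two-sided tail probability for a single observed count. It is not OutPyR's implementation: the function names are hypothetical, the parameters `r` and `p` are assumed to be known fixed values rather than inferred by Bayesian sampling, and doubling the smaller tail is one common convention that the paper may or may not follow.

```python
import math


def nb_log_pmf(k, r, p):
    """Log-probability of k under NB(r, p), where k counts failures
    before the r-th success and p is the success probability.
    (Hypothetical helper; r may be any positive real.)"""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))


def nb_cdf(k, r, p):
    """P(X <= k), by direct summation of the pmf (fine for modest counts)."""
    if k < 0:
        return 0.0
    total = sum(math.exp(nb_log_pmf(i, r, p)) for i in range(k + 1))
    return min(1.0, total)  # guard against floating-point overshoot


def two_sided_p_value(k, r, p):
    """Two-sided tail probability of the count k under NB(r, p):
    twice the smaller of P(X <= k) and P(X >= k), capped at 1."""
    lower = nb_cdf(k, r, p)            # P(X <= k)
    upper = 1.0 - nb_cdf(k - 1, r, p)  # P(X >= k)
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    # Assumed illustrative parameters: mean r * (1 - p) / p = 95.
    r, p = 5.0, 0.05
    for count in (0, 90, 400):
        print(count, two_sided_p_value(count, r, p))
```

Counts near the mean receive a p-value close to 1, while counts far in either tail receive a small one. Because the upper tail is obtained as one minus a sum, very small upper-tail values lose precision; a production implementation would use a dedicated survival function instead.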

Original language: English
Article number: 101245
Journal: Journal of Computational Science
Volume: 47
Publication status: Published - Nov 2020

Keywords

  • Bayesian modeling
  • Outlier detection
  • RNA-Seq
