Genome sequencing data analysis for rare disease gene discovery

Umm Kulthum Ismail Umlai, Dhinoth Kumar Bangarusamy, Xavier Estivill, Puthen Veettil Jithesh*

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

8 Citations (Scopus)


Rare diseases affect a small proportion of the general population, variously defined as fewer than 200 000 individuals (US) or fewer than 1 in 2000 individuals (Europe). Although individually rare, they collectively comprise approximately 7000 different disorders, the majority of which have a genetic origin, and affect roughly 300 million people globally. Most patients and their families undergo a long and frustrating diagnostic odyssey. Advances in genomics have started to facilitate diagnosis, though progress is hindered by the difficulty of genome data analysis and interpretation. A major impediment to diagnosis is understanding the diverse approaches, tools and datasets available for variant prioritization, the most important step in narrowing millions of variants down to a few potential candidates. Here we present a review of the latest methodological developments and the spectrum of tools available for rare disease genetic variant discovery, and recommend appropriate data interpretation methods for variant prioritization. We have categorized the resources according to the steps of the variant interpretation workflow: data processing, variant calling, annotation, filtration and finally prioritization, with special emphasis on the last two steps. The methods discussed here pertain to elucidating the genetic basis of disease in individual patient cases via trio- or family-based analysis of genome data. We advocate using a combination of tools and datasets and following multiple iterative approaches to identify the potential causative variant.
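As a rough illustration of the last two workflow steps in a trio setting, the sketch below filters variants by a population allele-frequency cutoff and a simple inheritance pattern, then ranks the survivors by rarity. It is a minimal sketch under stated assumptions, not the method of the review: the `TrioVariant` class, the function names, the 0.001 cutoff and the example variants are all hypothetical, and real pipelines draw on many more annotations.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical cutoff for illustration only; not a value taken from the review.
MAX_POPULATION_AF = 0.001


@dataclass(frozen=True)
class TrioVariant:
    """One variant site; genotypes are alternate-allele counts (0, 1 or 2)."""
    variant_id: str
    gene: str
    population_af: float  # allele frequency in a reference population
    child: int
    mother: int
    father: int


def inheritance_pattern(v: TrioVariant) -> Optional[str]:
    """Return a candidate inheritance label, or None if neither pattern fits."""
    if v.child == 1 and v.mother == 0 and v.father == 0:
        return "de_novo"
    if v.child == 2 and v.mother == 1 and v.father == 1:
        return "homozygous_recessive"
    return None


def filter_and_rank(variants: List[TrioVariant]) -> List[Tuple[TrioVariant, str]]:
    """Filtration: keep rare variants that fit a pattern. Prioritization: rarest first."""
    kept = []
    for v in variants:
        pattern = inheritance_pattern(v)
        if pattern is not None and v.population_af <= MAX_POPULATION_AF:
            kept.append((v, pattern))
    return sorted(kept, key=lambda pair: pair[0].population_af)


# Made-up example data (identifiers and frequencies are placeholders).
variants = [
    TrioVariant("var1", "GENE_A", 0.0, 1, 0, 0),     # absent in parents, rare
    TrioVariant("var2", "GENE_B", 0.0005, 2, 1, 1),  # both parents carriers, rare
    TrioVariant("var3", "GENE_C", 0.2, 1, 0, 0),     # fits de novo pattern but common
    TrioVariant("var4", "GENE_D", 0.0001, 1, 1, 0),  # inherited heterozygous
]

candidates = filter_and_rank(variants)
for v, pattern in candidates:
    print(v.variant_id, v.gene, pattern)
```

With this toy input only `var1` and `var2` survive: `var3` is removed by the frequency cutoff and `var4` by the inheritance check. Ranking by allele frequency alone is a deliberate simplification; predicted variant effect and phenotype match would normally contribute as well.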

Original language: English
Article number: bbab363
Number of pages: 28
Journal: Briefings in Bioinformatics
Issue number: 1
Publication status: Published - 17 Jan 2022


  • Bioinformatics tools
  • Next-generation sequencing
  • Rare diseases
  • Trio-analysis
  • Variant analysis
  • Variant annotation
  • Variant filtration
  • Variant interpretation
  • Variant prioritization
  • Whole genome sequencing

