TY - JOUR
T1 - Genome sequencing data analysis for rare disease gene discovery
AU - Umlai, Umm Kulthum Ismail
AU - Bangarusamy, Dhinoth Kumar
AU - Estivill, Xavier
AU - Jithesh, Puthen Veettil
N1 - Publisher Copyright:
© 2021 The Author(s) 2021. Published by Oxford University Press. All rights reserved.
PY - 2022/1/17
Y1 - 2022/1/17
N2 - Rare diseases occur in a smaller proportion of the general population, which is variedly defined as less than 200 000 individuals (US) or in less than 1 in 2000 individuals (Europe). Although rare, they collectively make up to approximately 7000 different disorders, with majority having a genetic origin, and affect roughly 300 million people globally. Most of the patients and their families undergo a long and frustrating diagnostic odyssey. However, advances in the field of genomics have started to facilitate the process of diagnosis, though it is hindered by the difficulty in genome data analysis and interpretation. A major impediment in diagnosis is in the understanding of the diverse approaches, tools and datasets available for variant prioritization, the most important step in the analysis of millions of variants to select a few potential variants. Here we present a review of the latest methodological developments and spectrum of tools available for rare disease genetic variant discovery and recommend appropriate data interpretation methods for variant prioritization. We have categorized the resources based on various steps of the variant interpretation workflow, starting from data processing, variant calling, annotation, filtration and finally prioritization, with a special emphasis on the last two steps. The methods discussed here pertain to elucidating the genetic basis of disease in individual patient cases via trio- or family-based analysis of the genome data. We advocate the use of a combination of tools and datasets and to follow multiple iterative approaches to elucidate the potential causative variant.
AB - Rare diseases occur in a smaller proportion of the general population, which is variedly defined as less than 200 000 individuals (US) or in less than 1 in 2000 individuals (Europe). Although rare, they collectively make up to approximately 7000 different disorders, with majority having a genetic origin, and affect roughly 300 million people globally. Most of the patients and their families undergo a long and frustrating diagnostic odyssey. However, advances in the field of genomics have started to facilitate the process of diagnosis, though it is hindered by the difficulty in genome data analysis and interpretation. A major impediment in diagnosis is in the understanding of the diverse approaches, tools and datasets available for variant prioritization, the most important step in the analysis of millions of variants to select a few potential variants. Here we present a review of the latest methodological developments and spectrum of tools available for rare disease genetic variant discovery and recommend appropriate data interpretation methods for variant prioritization. We have categorized the resources based on various steps of the variant interpretation workflow, starting from data processing, variant calling, annotation, filtration and finally prioritization, with a special emphasis on the last two steps. The methods discussed here pertain to elucidating the genetic basis of disease in individual patient cases via trio- or family-based analysis of the genome data. We advocate the use of a combination of tools and datasets and to follow multiple iterative approaches to elucidate the potential causative variant.
KW - Bioinformatics tools
KW - Next-generation sequencing
KW - Rare diseases
KW - Trio-analysis
KW - Variant analysis
KW - Variant annotation
KW - Variant filtration
KW - Variant interpretation
KW - Variant prioritization
KW - Whole genome sequencing
UR - http://www.scopus.com/inward/record.url?scp=85123813266&partnerID=8YFLogxK
U2 - 10.1093/bib/bbab363
DO - 10.1093/bib/bbab363
M3 - Review article
C2 - 34498682
AN - SCOPUS:85123813266
SN - 1467-5463
VL - 23
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 1
M1 - bbab363
ER -