TY - JOUR
T1 - BRCA1-specific machine learning model predicts variant pathogenicity with high accuracy
AU - Khandakji, Mohannad
AU - Habish, Hind Hassan Ahmed
AU - Abdulla, Nawal Bakheet Salem
AU - Kusasi, Sitti Apsa Albani
AU - Abdou, Nema Mahmoud Ghobashy
AU - Al-Mulla, Hajer Mahmoud M.A.
AU - Al Sulaiman, Reem Jawad A.A.
AU - Jassoum, Salha M.Bu
AU - Mifsud, Borbala
N1 - Publisher Copyright: © 2023 The Authors.
PY - 2023/8
Y1 - 2023/8
N2 - Identification of novel BRCA1 variants outpaces their clinical annotation, which highlights the importance of developing accurate computational methods for risk assessment. Therefore, our aim was to develop a BRCA1-specific machine learning model to predict the pathogenicity of all types of BRCA1 variants and to apply this model and our previous BRCA2-specific model to assess BRCA variants of uncertain significance (VUS) among Qatari patients with breast cancer. We developed an XGBoost model that utilizes variant information such as position, frequency, and consequence, as well as prediction scores from numerous in silico tools. We trained and tested the model with BRCA1 variants that were reviewed and classified by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium. In addition, we tested the model’s performance on an independent set of missense variants of uncertain significance with experimentally determined functional scores. The model performed excellently in predicting the pathogenicity of ENIGMA-classified variants (accuracy: 99.9%) and in predicting the functional consequence of the independent set of missense variants (accuracy: 93.4%). Moreover, it predicted 2,115 potentially pathogenic variants among the 31,058 unreviewed BRCA1 variants in the BRCA Exchange database. Using two BRCA-specific models, we did not identify any pathogenic BRCA1 variants among those found in patients in Qatar but predicted four potentially pathogenic BRCA2 variants, which could be prioritized for functional validation.
AB - Identification of novel BRCA1 variants outpaces their clinical annotation, which highlights the importance of developing accurate computational methods for risk assessment. Therefore, our aim was to develop a BRCA1-specific machine learning model to predict the pathogenicity of all types of BRCA1 variants and to apply this model and our previous BRCA2-specific model to assess BRCA variants of uncertain significance (VUS) among Qatari patients with breast cancer. We developed an XGBoost model that utilizes variant information such as position, frequency, and consequence, as well as prediction scores from numerous in silico tools. We trained and tested the model with BRCA1 variants that were reviewed and classified by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium. In addition, we tested the model’s performance on an independent set of missense variants of uncertain significance with experimentally determined functional scores. The model performed excellently in predicting the pathogenicity of ENIGMA-classified variants (accuracy: 99.9%) and in predicting the functional consequence of the independent set of missense variants (accuracy: 93.4%). Moreover, it predicted 2,115 potentially pathogenic variants among the 31,058 unreviewed BRCA1 variants in the BRCA Exchange database. Using two BRCA-specific models, we did not identify any pathogenic BRCA1 variants among those found in patients in Qatar but predicted four potentially pathogenic BRCA2 variants, which could be prioritized for functional validation.
KW - BRCA2
KW - VUS
KW - breast cancer
KW - in silico predictions
KW - ovarian cancer
UR - http://www.scopus.com/inward/record.url?scp=85165518539&partnerID=8YFLogxK
U2 - 10.1152/physiolgenomics.00033.2023
DO - 10.1152/physiolgenomics.00033.2023
M3 - Article
C2 - 37335020
AN - SCOPUS:85165518539
SN - 1094-8341
VL - 55
SP - 315
EP - 323
JO - Physiological Genomics
JF - Physiological Genomics
IS - 8
ER -