Comparison of pathogenicity prediction tools on somatic variants.

Publication record


Publication date

October 2020

Journal

The Journal of molecular diagnostics : JMD

Authors

Identified members of the Cancéropôle Est:
Prof. HARLÉ Alexandre


All authors:
Suybeng V, Koeppel F, Harlé A, Rouleau E

Abstract

Genomic sequencing has been increasingly used over the last decade as part of the management of patients with cancer. Interpretation of somatic variants and their pathogenicity is often complex. Pathogenicity prediction tools are commonly used as part of the expert interpretation of somatic variants, but most of these tools were initially developed for germline variants. The aim of this study was to benchmark their performance on somatic variants. To achieve this, we assembled a "gold standard" list of 4,319 somatic single-nucleotide variants (SNVs), classified as oncogenic (N=2,996) or neutral (N=1,323) based either on their presence in curated databases or on their allele frequency (AF) in the general population. We annotated these variants with the most commonly used prediction tools using dbNSFP and UMD-Predictor, and computed performance metrics. Stratifying the prediction tools by Matthews correlation coefficient and area under the ROC curve identified the best-performing ones, namely CADD, Eigen/Eigen-PC, PolyPhen-2, PROVEAN, UMD-Predictor, and REVEL. Interestingly, SIFT, which is a commonly used prediction tool for somatic variants, was ranked in the second performance category. Combining tools two by two only marginally improved performance, mainly because of the occurrence of discordant predictions.
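The two ranking metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the study's pipeline: the labels, scores, and the 0.5 decision threshold below are invented toy values, and the function names are hypothetical. The sketch only shows how a Matthews correlation coefficient (MCC) and an area under the ROC curve (AUC) would be computed for a single tool, with 1 standing for oncogenic and 0 for neutral.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """MCC from binary labels and binary predictions (1 = oncogenic, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (assumption, not from the study): 4 oncogenic and 4 neutral variants.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1]

# Arbitrary illustrative threshold; real tools publish their own cut-offs.
threshold = 0.5
predictions = [1 if s >= threshold else 0 for s in scores]

mcc = matthews_corrcoef(labels, predictions)
auc = roc_auc(labels, scores)
print(f"MCC = {mcc:.3f}, AUC = {auc:.4f}")
```

MCC depends on the chosen threshold, whereas AUC summarizes the raw scores across all thresholds, which is one reason a benchmark might report both.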

Reference

J Mol Diagn. 2020 Oct 1;: