
Integrating Multiple Genomic Data to Predict Disease-Causing Nonsynonymous Single Nucleotide Variants in Exome Sequencing Studies


The detection of causative nonsynonymous single nucleotide variants (SNVs) is essential for understanding the pathogenesis of human inherited diseases. In this paper, we propose a statistical method called SPRING (Snv PRioritization via the INtegration of Genomic data) that combines six functional effect scores calculated by existing methods with five association scores derived from multiple genomic data sources to estimate the statistical significance of a nonsynonymous SNV being pathogenic for a query disease. We find that SPRING is effective in identifying disease-causing SNVs for diseases whose genetic bases are either partly known or completely unknown, across a variety of modes of inheritance. With real exome sequencing data, we demonstrate the potential of SPRING both for detecting causative SNVs in simulation studies and for identifying pathogenic de novo mutations for autism, epileptic encephalopathies and intellectual disability.
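As a rough illustration of what combining eleven per-variant scores into one significance value can look like, the sketch below uses Fisher's combined probability test. This is an assumption made for illustration only: the abstract does not state which combination rule SPRING uses, and Fisher's method in this plain form treats the scores as independent p-values, which real functional and association scores are unlikely to be. The function names and the variant labels are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method, assuming independence.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has a closed form, so no external library is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    k = len(p_values)
    # half_stat is X / 2; clamp each p-value to avoid log(0).
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    # P(chi2_{2k} > X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def rank_variants(scores_by_variant):
    """Return (variant, combined p-value) pairs, most significant first."""
    combined = {v: fisher_combine(ps) for v, ps in scores_by_variant.items()}
    return sorted(combined.items(), key=lambda item: item[1])


# Hypothetical input: 6 functional-effect + 5 association p-values per variant.
candidates = {
    "variant_A": [0.01] * 11,
    "variant_B": [0.5] * 11,
}
ranking = rank_variants(candidates)
```

Here `ranking` lists `variant_A` ahead of `variant_B`, since uniformly small p-values combine into a smaller overall p-value. A faithful implementation would also need to account for dependence between the scores and for multiple testing across all candidate variants.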


Published in: Integrating Multiple Genomic Data to Predict Disease-Causing Nonsynonymous Single Nucleotide Variants in Exome Sequencing Studies. PLoS Genet 10(3): e1004237. doi:10.1371/journal.pgen.1004237
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004237





Published in the journal: PLOS Genetics, 2014, Issue 3