
Simultaneous Discovery, Estimation and Prediction Analysis of Complex Traits Using a Bayesian Mixture Model


Most genome-wide association studies performed to date have focused on testing individual genetic markers for association with a phenotype. More recently, methods that analyse the joint effects of multiple markers on genetic variation have provided further insights into the genetic basis of complex human traits. There is also increasing interest in using genotype data to predict genetic risk of disease. Disparate analytical methods are often used for each of these tasks. We propose a flexible, novel approach that identifies susceptibility loci, infers the genetic architecture and provides polygenic risk prediction simultaneously, within a single statistical model. We illustrate the broad applicability of the approach using both simulated and real data. In an analysis of seven common diseases, we show large differences between complex traits in the proportion of genetic variation attributable to loci with different effect sizes, and in prediction accuracy. These findings are important for future studies and for understanding the complex genetic architecture of common diseases.


Published in: Simultaneous Discovery, Estimation and Prediction Analysis of Complex Traits Using a Bayesian Mixture Model. PLoS Genet 11(4): e1004969. doi:10.1371/journal.pgen.1004969
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004969




Tags
Genetics, Reproductive medicine

Article published in the journal

PLOS Genetics


2015, Issue 4