
Beyond Missing Heritability: Prediction of Complex Traits


Despite rapid advances in genomic technology, our ability to account for phenotypic variation using genetic information remains limited for many traits. This has unfortunately resulted in limited application of genetic data towards preventive and personalized medicine, one of the primary impetuses of genome-wide association studies. Recently, a large proportion of the “missing heritability” for human height was statistically explained by modeling thousands of single nucleotide polymorphisms (SNPs) concurrently. However, it is currently unclear how gains in explained genetic variance will translate to the prediction of yet-to-be observed phenotypes. Using data from the Framingham Heart Study, we explore the genomic prediction of human height in training and validation samples while varying the statistical approach used, the number of SNPs included in the model, the validation scheme, and the number of subjects used to train the model. In our training datasets, we are able to explain a large proportion of the variation in height (h² up to 0.83, R² up to 0.96). However, the proportion of variance accounted for in validation samples is much smaller (ranging from 0.15 to 0.36 depending on the degree of familial information used in the training dataset). While such R² values vastly exceed what has been previously reported using a reduced number of pre-selected markers (<0.10), given the heritability of the trait (~0.80), substantial room for improvement remains.
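The gap between training and validation fit described above can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes simulated, unrelated individuals, illustrative sample sizes, and plain ridge regression rather than the statistical approaches used in the study, and it takes R² to mean the squared correlation between observed and predicted phenotypes. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumption); the study used far more SNPs and subjects.
n_train, n_val, n_snps, h2 = 400, 200, 2000, 0.8
n = n_train + n_val

# Simulate unrelated individuals with genotypes coded 0/1/2.
freqs = rng.uniform(0.1, 0.9, size=n_snps)
X = rng.binomial(2, freqs, size=(n, n_snps)).astype(float)
sd = X.std(axis=0)
sd[sd == 0] = 1.0                      # guard against monomorphic SNPs
X = (X - X.mean(axis=0)) / sd          # standardize each SNP column

# Additive trait with variance close to 1, of which a share h2 is genetic.
beta = rng.normal(0.0, np.sqrt(h2 / n_snps), size=n_snps)
y = X @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)

X_tr, y_tr = X[:n_train], y[:n_train]
X_va, y_va = X[n_train:], y[n_train:]

# Ridge regression in dual form, convenient when SNPs outnumber subjects.
lam = n_snps * (1.0 - h2) / h2         # residual variance / per-SNP effect variance
mu = y_tr.mean()
alpha = np.linalg.solve(X_tr @ X_tr.T + lam * np.eye(n_train), y_tr - mu)
beta_hat = X_tr.T @ alpha              # estimated SNP effects


def r2(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1] ** 2)


train_r2 = r2(y_tr, mu + X_tr @ beta_hat)
val_r2 = r2(y_va, mu + X_va @ beta_hat)
print(f"training R2:   {train_r2:.2f}")
print(f"validation R2: {val_r2:.2f}")
```

In this toy setting the model fits the training phenotypes far better than it predicts the held-out ones, which is the qualitative pattern the abstract reports. The simulated values are not comparable to the figures quoted for the Framingham data.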


Published in: Beyond Missing Heritability: Prediction of Complex Traits. PLoS Genet 7(4): e1002051. doi:10.1371/journal.pgen.1002051
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1002051




Tags: Genetics, Reproductive medicine

Published in: PLOS Genetics, 2011, Issue 4