Mining the Human Phenome Using Allelic Scores That Index Biological Intermediates

Genome-wide association studies (GWAS) commonly examine the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified, it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain only small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently to use these scores to data mine GWAS. To investigate the properties of this approach, we indexed three biological intermediates for which the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low-density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the allelic scores in terms of their capacity to proxy for the intermediate of interest and the extent to which they were associated with disease. We found that allelic scores derived from known variants, and allelic scores derived from hundreds of thousands of genetic markers, explained significant portions of the variance in the biological intermediates of interest, and many of these scores showed the expected correlations with disease. Genome-wide allelic scores, however, tended to lack specificity, suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses.
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
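To make the idea of an allelic score concrete, the following is a minimal sketch in Python. It assumes that a score is a weighted sum of each individual's allele counts, with per-allele weights taken from an external GWAS, and that "variance explained" is measured as the squared correlation between score and trait. The abstract does not specify the weighting or the software used, so the function names (`allelic_score`, `variance_explained`) and the simulated data are illustrative assumptions only.

```python
import numpy as np


def allelic_score(dosages, weights):
    """Return one score per individual: the weighted sum of allele counts.

    dosages: (n_individuals, n_variants) array of 0/1/2 allele counts.
    weights: (n_variants,) per-allele effect estimates (assumed here to
             come from an independent GWAS of the intermediate trait).
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return dosages @ weights


def variance_explained(score, trait):
    """Squared Pearson correlation between the score and the trait."""
    r = np.corrcoef(score, trait)[0, 1]
    return r ** 2


# Simulated example (hypothetical data, not from the study).
rng = np.random.default_rng(0)
n_individuals, n_variants = 5000, 30
allele_freqs = rng.uniform(0.1, 0.9, size=n_variants)
dosages = rng.binomial(2, allele_freqs, size=(n_individuals, n_variants))
weights = rng.normal(0.0, 0.05, size=n_variants)

# Trait = genetic contribution + independent noise.
trait = dosages @ weights + rng.normal(0.0, 1.0, size=n_individuals)

score = allelic_score(dosages, weights)
r2 = variance_explained(score, trait)
print(f"Proportion of trait variance explained by the score: {r2:.3f}")
```

In this sketch the score would then be tested for association with disease status in place of the measured intermediate, which is the screening step the abstract describes.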

Published in: Mining the Human Phenome Using Allelic Scores That Index Biological Intermediates. PLoS Genet 9(10): e1003919. doi:10.1371/journal.pgen.1003919
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1003919
