
Hierarchical Generalized Linear Models for Multiple Groups of Rare and Common Variants: Jointly Estimating Group and Individual-Variant Effects


Complex diseases and traits are likely influenced by many common and rare genetic variants and environmental factors. Detecting disease susceptibility variants is challenging, especially when their frequencies are low and/or their effects are small or moderate. We propose a comprehensive hierarchical generalized linear model framework for simultaneously analyzing multiple groups of rare and common variants and relevant covariates. The proposed models introduce a group effect and a genetic score (i.e., a linear combination of main-effect predictors for genetic variants) for each group of variants, and they jointly estimate the group effects and the weights of the genetic scores. This framework includes various previous methods as special cases. It can accommodate both risk and protective variants in the same group, and it simultaneously estimates the cumulative contribution of multiple variants and their relative importance. Our computational strategy extends the standard procedure for fitting generalized linear models in the statistical software R to the proposed hierarchical models, leading to stable and flexible tools. The methods are illustrated with sequence data for the gene ANGPTL4 from the Dallas Heart Study, and their performance is further assessed via simulation studies. The methods are implemented in the freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/).
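To make the structure described in the abstract concrete, the following is a minimal sketch of how a group effect and a genetic score might enter a linear predictor. It is not the BhGLM interface, and it does not show how the model is estimated. The function names, the logistic link, and all numeric values are assumptions chosen for illustration; only the idea of a per-group score (a weighted sum of variant predictors) scaled by a group effect comes from the text.

```python
import math

def genetic_score(genotypes, weights):
    """Weighted sum of one individual's variant predictors within one group."""
    return sum(w * x for w, x in zip(weights, genotypes))

def linear_predictor(intercept, covariates, cov_effects, groups):
    """Combine covariate effects with (group effect x genetic score) terms.

    groups is a list of (group_effect, weights, genotypes) tuples;
    this layout is a hypothetical convenience, not taken from the paper.
    """
    eta = intercept + sum(b * c for b, c in zip(cov_effects, covariates))
    for group_effect, weights, genotypes in groups:
        eta += group_effect * genetic_score(genotypes, weights)
    return eta

def inverse_logit(eta):
    """Map a linear predictor to a probability (assumes a binary trait)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Made-up values for a single individual.
groups = [
    # Signed weights let one group hold both risk (+) and protective (-) variants.
    (0.8, [1.0, -0.5, 1.0], [1, 1, 0]),
    # A group whose effect is zero contributes nothing, whatever its weights.
    (0.0, [1.0, 1.0], [0, 1]),
]
eta = linear_predictor(-1.0, [0.5], [0.2], groups)
risk = inverse_logit(eta)
print(eta, risk)
```

In this sketch the group effect scales the whole score while the weights set each variant's relative importance within the group, which mirrors the abstract's separation of cumulative contribution from relative importance. In the proposed framework both quantities are estimated jointly from data rather than supplied by hand.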


Published in: Hierarchical Generalized Linear Models for Multiple Groups of Rare and Common Variants: Jointly Estimating Group and Individual-Variant Effects. PLoS Genet 7(12): e1002382. doi:10.1371/journal.pgen.1002382
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1002382




Tags: Genetics, Reproductive Medicine

Article published in the journal: PLOS Genetics, 2011, Issue 12