Genome-wide association studies reach hepatology☆
Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Romeo S, Kozlitina J, Xing C, Pertsemlidis A, Cox D, Pennacchio LA, Boerwinkle E, Cohen JC, Hobbs HH.
Nonalcoholic fatty liver disease (NAFLD) is a burgeoning health problem of unknown etiology that varies in prevalence among ancestry groups. To identify genetic variants contributing to differences in hepatic fat content, we carried out a genome-wide association scan of nonsynonymous sequence variations (n = 9229) in a population comprising Hispanic, African American and European American individuals. An allele in PNPLA3 (rs738409[G], encoding I148M) was strongly associated with increased hepatic fat levels (P = 5.9 × 10⁻¹⁰) and with hepatic inflammation (P = 3.7 × 10⁻⁴). The allele was most common in Hispanics, the group most susceptible to NAFLD; hepatic fat content was more than twofold higher in PNPLA3 rs738409[G] homozygotes than in noncarriers. Resequencing revealed another allele of PNPLA3 (rs6006460[T], encoding S453I) that was associated with lower hepatic fat content in African Americans, the group at lowest risk of NAFLD. Thus, variation in PNPLA3 contributes to ancestry-related and inter-individual differences in hepatic fat content and susceptibility to NAFLD.
[Abstract reproduced by permission of Nat Genet 2008;40:1461-1465].
Abbreviations: PNPLA3, patatin-like phospholipase domain containing 3; SNP, single nucleotide polymorphism; PPARG, peroxisome proliferator activated receptor gamma; NOD2, nucleotide-binding oligomerization domain containing 2; NAFLD, nonalcoholic fatty liver disease.
Over the last two years, genetics journals have been flooded with articles reporting genome-wide association studies of common diseases. Only recently was the first application of this study design to a liver disease published. Romeo et al. investigated non-synonymous (i.e. resulting in an amino acid change) genetic markers (single nucleotide polymorphisms; SNPs) distributed throughout the genomes of 1032 African American, 696 European American and 383 Hispanic participants in the Dallas Heart Study [1]. A quantitative trait, hepatic fat content, was assessed for association with 9229 of the investigated markers. Statistical analysis strongly supported the conclusion that genetic variation at the PNPLA3 locus, which encodes a protein of largely unknown function, is a critical determinant of hepatic fat content. The association was present in all ethnicities and was independent of key confounders such as body mass index, alcohol consumption and diabetes. Although the conclusion is simple, some reflections on the complex genetic context in which it is drawn may be useful for a clinician.
The first part of the study deals with quality control of the genotyping of more than 12,000 SNPs. After the application of a series of quality criteria, a total of 2886 SNPs were excluded from the statistical analysis. This number may seem high, but it is in line with the fraction of markers that typically fail in genome-wide genotyping experiments. Since failed markers are usually not re-typed with other methods, any information at these positions is lost to the subsequent analysis. In genetic terms, the number of remaining markers in the present study (9229) is very low. Even on genotyping arrays with up to a million markers, many genes are not represented at all [2]. The fact that Romeo et al. were still able to make a finding of great importance illustrates the pragmatic side of low-density technology: one can only see what is on the array, but since these arrays are now cheap enough for widespread application, this will sometimes suffice.
The lead PNPLA3 marker (rs738409) achieved a P-value far lower than would plausibly arise by chance (P = 5.9 × 10⁻¹⁰). No other marker was even close. The term genome-wide significance (P ⩽ 5 × 10⁻⁷) is a useful reference for what is generally considered a “true” finding in genetics [3], given the pitfalls of multiple testing and type I errors that for years tarnished the reputation of genetic association studies [4]. Looking at Fig. 1 in the article by Romeo et al., it is even tempting to ask whether PNPLA3 may be the only gene investigated that is important to hepatic steatosis. Typically in complex conditions, one or a few disease genes with strong effects act together with a large number of “weaker” genes to generate the overall genetic risk [5]. PNPLA3 surely represents one of the strong genes. One may then argue that “weak” equals “unimportant”, and abandon the search for further risk factors. In type 2 diabetes, however, several large studies were needed to conclusively define PPARG as a disease gene [6]. It is worth reflecting that one of the weakest disease genes identified in type 2 diabetes so far is the therapeutically most relevant one. Clearly, the statistical significance of a finding in genome-wide association analysis does not translate directly into clinical relevance. Investigations of larger study panels (tens of thousands of individuals) may eventually reveal risk factors for hepatic fat content that are as important as PNPLA3.
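The multiple-testing argument above can be made concrete with a small calculation. The sketch below compares the lead P-value with the genome-wide significance level quoted in the text and with a Bonferroni-corrected threshold for the 9229 markers tested. The family-wise error rate of 0.05 and the function name are assumptions made for illustration; they are not taken from Romeo et al.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests tests."""
    return alpha / n_tests


N_MARKERS = 9229                   # markers analysed, as stated in the text
ALPHA = 0.05                       # assumed family-wise error rate (illustrative)
GENOME_WIDE_SIGNIFICANCE = 5e-7    # reference level quoted in the text
LEAD_P = 5.9e-10                   # P-value of rs738409, as stated in the text

study_threshold = bonferroni_threshold(ALPHA, N_MARKERS)

print(f"Bonferroni threshold for {N_MARKERS} tests: {study_threshold:.2e}")
print(f"Lead SNP passes Bonferroni threshold: {LEAD_P < study_threshold}")
print(f"Lead SNP passes genome-wide level:    {LEAD_P <= GENOME_WIDE_SIGNIFICANCE}")
```

Under these assumptions the lead marker clears both thresholds by several orders of magnitude, and the quoted genome-wide level is the stricter of the two because it is meant to apply to far denser marker sets than the one analysed here.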
In sequencing the PNPLA3 gene, Romeo et al. made two important observations. First, they identified an additional variant (rs6006460 T) which was significantly associated with decreased hepatic fat content. The effect was independent of the influence exerted by the lead SNP. Second, they identified several rare variants which were only present among the 160 individuals with the highest hepatic fat content. At NOD2, a susceptibility gene in Crohn’s disease, it has been shown that the genetic risk in 80% of the cases is exerted by one of three common variants. In approximately 20% of the cases, however, NOD2-related Crohn’s disease may be caused by rare variants only found in individual patients [7]. This mixture of common and rare susceptibility variants is probably typical for most disease loci. Problems arise when no common variant at a disease gene exists, rendering the locus invisible to statistical association analysis. The importance of this “blind spot” of genome-wide association studies is also hinted at in type 2 diabetes. Despite tremendous efforts, all disease variants identified by genome-wide association studies collectively account for a relative sibling risk of as little as 1.07, compared with an overall relative sibling risk of 3.0 in this condition [8]. At least part of this discrepancy is likely to be explained by disease loci with only rare variants. Even though the genome-wide association study design deserves to be applauded, as in the case of Romeo et al., it is important to remember that a large fraction of the genetic risk in complex traits is not identified by this approach.
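The size of the gap between the two sibling risks quoted above can be illustrated numerically. The sketch below assumes one common convention, namely comparing relative sibling risks on a logarithmic scale (which treats loci as combining multiplicatively), and also shows a simpler comparison of excess risks. Both conventions and the function names are assumptions for illustration, not calculations taken from the cited review.

```python
import math


def fraction_explained_log(lambda_known: float, lambda_total: float) -> float:
    """Share of the overall sibling relative risk attributable to known
    variants, assuming risks combine multiplicatively (log scale)."""
    return math.log(lambda_known) / math.log(lambda_total)


def fraction_explained_excess(lambda_known: float, lambda_total: float) -> float:
    """Share of the excess sibling risk (lambda - 1) attributable to known
    variants, assuming risks combine additively."""
    return (lambda_known - 1.0) / (lambda_total - 1.0)


LAMBDA_KNOWN = 1.07   # sibling risk explained by identified variants (from the text)
LAMBDA_TOTAL = 3.0    # overall sibling relative risk in type 2 diabetes (from the text)

log_share = fraction_explained_log(LAMBDA_KNOWN, LAMBDA_TOTAL)
excess_share = fraction_explained_excess(LAMBDA_KNOWN, LAMBDA_TOTAL)

print(f"Explained on log scale:   {log_share:.1%}")
print(f"Explained as excess risk: {excess_share:.1%}")
```

Under either convention the identified variants account for only a few percent of the familial clustering, which is the point made in the text about the “blind spot” of the study design.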
So, at the end of the day, what does a genome-wide association study tell us? The susceptibility genes identified in various conditions have so far proven of limited value in the context of genetic testing. More interesting are the sometimes surprising clues to the underlying biology of a condition. This case can be made for the finding of Romeo et al., even though PNPLA3 variants have previously been reported to influence obesity [9]. The evidence that PNPLA3 influences hepatic fat content is convincing. However, the evidence that PNPLA3 also influences susceptibility to liver damage in nonalcoholic fatty liver disease (NAFLD) should still be considered circumstantial, and relies mainly on the parallel association with plasma alanine aminotransferase levels [10]. More work is needed; the genome-wide association study merely provides a relevant starting point for this work. A multitude of genetic factors are likely to be important in the pathogenesis of NAFLD, as is evident from the discussion above. However, among these, PNPLA3 would possibly not have been identified without the study by Romeo et al. And one starting point is a lot better than none.
References
[1] Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat Genet. 2008;40:1461–1465
[2] Evaluation of coverage variation of SNP chips for genome-wide association studies. Eur J Hum Genet. 2008;16:635–643
[3] The future of genetic studies of complex human diseases. Science. 1996;273:1516–1517
[4] Problems of reporting genetic associations with complex outcomes. Lancet. 2003;361:865–872
[5] Robust associations of four new chromosome regions from genome-wide analyses of type 1 diabetes. Nat Genet. 2007;39:857–864
[6] Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science. 2007;316:1336–1341
[7] CARD15/NOD2 mutational analysis and genotype-phenotype correlation in 612 patients with inflammatory bowel disease. Am J Hum Genet. 2002;70:845–857
[8] Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008;9:356–369
[9] Polymorphisms in the adiponutrin gene are associated with increased insulin secretion and obesity. Eur J Endocrinol. 2008;159:577–583
[10] Population-based genome-wide association studies reveal six loci influencing plasma levels of liver enzymes. Am J Hum Genet. 2008;83:520–528
☆ The author declared that he does not have anything to disclose regarding funding or conflict of interest with respect to this manuscript.
PII: S0168-8278(09)00152-4
doi:10.1016/j.jhep.2009.03.002
© 2009 European Association for the Study of the Liver. Published by Elsevier Inc. All rights reserved.
