Pro-inflammatory fatty acid profile and colorectal cancer risk: A Mendelian randomisation analysis

Background While dietary fat has been established as a risk factor for colorectal cancer (CRC), associations between fatty acids (FAs) and CRC have been inconsistent. Using Mendelian randomisation (MR), we sought to evaluate associations between polyunsaturated (PUFA), monounsaturated (MUFA) and saturated FAs (SFAs) and CRC risk. Methods We analysed genotype data on 9254 CRC cases and 18,386 controls of European ancestry. Externally weighted polygenic risk scores were generated and used to evaluate associations with CRC per one standard deviation increase in genetically defined plasma FA levels. Results Risk reduction was observed for oleic and palmitoleic MUFAs (OR_OA = 0.77, 95% CI: 0.65–0.92, P = 3.9 × 10⁻³; OR_POA = 0.36, 95% CI: 0.15–0.84, P = 0.018). PUFAs linoleic and arachidonic acid had negative and positive associations with CRC respectively (OR_LA = 0.95, 95% CI: 0.93–0.98, P = 3.7 × 10⁻⁴; OR_AA = 1.05, 95% CI: 1.02–1.07, P = 1.7 × 10⁻⁴). The SFA stearic acid was associated with increased CRC risk (OR_SA = 1.17, 95% CI: 1.01–1.35, P = 0.041). Conclusion Results from our analysis are broadly consistent with a pro-inflammatory FA profile having a detrimental effect in terms of CRC risk.


Introduction
Colorectal cancer (CRC) is one of the most common cancers and a major cause of cancer-related mortality in economically developed countries [1]. Geographical differences in CRC incidence between countries and migration studies have established the importance of lifestyle and diet as major determinants for CRC risk [2]. Worldwide CRC is currently diagnosed in over one million individuals annually; however, its incidence is set to increase with adoption of western lifestyles in developing countries [3]. Given the importance of diet as a risk factor for CRC, its modification offers the prospect of impacting significantly on disease incidence through public health initiatives.
Dietary fat has been widely implicated as a risk factor for cancer, and meta-analyses of epidemiological studies have tended to associate CRC risk with a higher consumption of red and processed meat [4]. The effect of fat intake on cancer risk, however, is likely to depend not only on the quantity, but also on the specific type of fatty acid (FA). Animal models and ecological studies have tended to link animal fat [5], saturated fatty acid (SFA) and certain omega-6 polyunsaturated fatty acids (ω-6 PUFAs) to an increased risk, and ω-3 PUFA intake to a reduced risk [6–8]. Evidence from epidemiological studies for a causal relationship with intake of specific types of fat has, however, largely been inconclusive. Reasons for inconsistencies in observational studies include the inherent difficulty of obtaining accurate measurements of long-term diet, confounding and reverse causation [9].
Mendelian randomisation (MR) analysis represents an adjunct to the conventional epidemiological observational study for examining associations between an exposure and a disease. The MR strategy uses, as instrumental variables (IVs), allelic variants that are randomly assigned during meiosis and are robustly associated with traits of interest. Using genetically defined IVs as proxies of modifiable exposure avoids confounding by environmental factors, is not subject to reverse causality and can inform on life-long exposure [10,11]. Since studies have shown that FA intake influences plasma levels of FAs, in theory MR is an attractive strategy to link dietary FA to CRC risk [12,13].

Colorectal cancer datasets
We investigated the relationship between genetic risk scores for levels of MUFAs, PUFAs, and SFAs and CRC risk, adopting a two-sample MR strategy using data from seven reported genome-wide association studies (GWAS) of CRC (Table 1). Briefly, these GWAS were based on individuals with European ancestry: CCFR1, CCFR2, COIN, FINLAND, UK1, Scotland1 and VQ58 [14]. Each study was approved by the respective institutional ethics review board and conducted in accordance with the Declaration of Helsinki.

Genotyping data
Comprehensive details of the genotyping and quality control of the seven GWAS have been previously reported [14]. Briefly, we excluded single nucleotide polymorphisms (SNPs) with a minor allele frequency of <1% or a low call rate (<95%), SNPs violating Hardy–Weinberg equilibrium, and individuals with non-European ancestry as assessed using data from HapMap v2 [15]. IMPUTEv2 software [16] was used to recover untyped SNP genotypes using a merged reference panel consisting of Sequencing Initiative Suomi (for the FINLAND data) or UK10K (for the remaining data) and 1000 Genomes Project data [17,18]. Poorly imputed SNPs, defined by an INFO score of <0.9, were excluded. Summary statistics from the seven GWAS were used to calculate the odds ratios (ORs) for FA-related SNPs.
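The SNP-level exclusion criteria above can be expressed as a simple filter. The following is a minimal sketch rather than the pipeline actually used: the `SnpQC` record and `passes_qc` function are hypothetical names, and the Hardy–Weinberg P-value cut-off is an assumption, since the text does not state one. The MAF, call-rate and INFO thresholds are those given in the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQC:
    """Per-SNP quality-control metrics (hypothetical record layout)."""
    rsid: str
    maf: float        # minor allele frequency
    call_rate: float  # fraction of samples with a genotype call
    hwe_p: float      # P-value of the Hardy-Weinberg equilibrium test
    info: float       # imputation INFO score (1.0 for directly typed SNPs)


def passes_qc(snp, min_maf=0.01, min_call_rate=0.95,
              min_info=0.9, hwe_alpha=1e-5):
    """Return True if the SNP survives all four filters.

    min_maf, min_call_rate and min_info follow the text;
    hwe_alpha is an assumed cut-off, not taken from the paper.
    """
    return (snp.maf >= min_maf
            and snp.call_rate >= min_call_rate
            and snp.hwe_p >= hwe_alpha
            and snp.info >= min_info)


# Toy records for illustration only (not real data).
toy = [
    SnpQC("snp_ok", 0.20, 0.99, 0.50, 0.98),
    SnpQC("snp_rare", 0.005, 0.99, 0.50, 0.98),
    SnpQC("snp_poorly_imputed", 0.20, 0.99, 0.50, 0.80),
]
kept = [s.rsid for s in toy if passes_qc(s)]
```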

Gene variants used to construct genetic risk scores
Genetic risk scores serving as IVs for each plasma FA were developed from SNPs previously identified by The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. We considered SNPs associated at genome-wide significance (i.e. P < 5.0 × 10⁻⁸) in individuals with European ancestry. To avoid co-linearity between SNPs for each FA we imposed a threshold r² value of ≥0.01 for linkage disequilibrium (LD), including only the SNPs with the strongest effect on the trait in genetic risk scores (Table 2) [19–22]. For each identified SNP, we recovered the chromosome positions, the risk alleles, association estimates and standard errors. For each SNP, the allele that was associated with increased FA level was considered the effect allele.
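One way to read the instrument selection described above is as a greedy pruning step: keep genome-wide significant SNPs, rank them by effect size, and drop any SNP in LD with one already retained. The sketch below assumes that reading. The function name, the dictionary layout and the treatment of unlisted SNP pairs as unlinked are all illustrative assumptions, and the input values are toy numbers.

```python
def select_instruments(snps, r2, p_max=5e-8, r2_max=0.01):
    """Greedy LD pruning of candidate instruments.

    snps: list of dicts with keys "rsid", "beta" (effect on the FA) and "p".
    r2:   dict mapping frozenset({rsid_1, rsid_2}) to pairwise r^2;
          pairs missing from the dict are treated as unlinked (r^2 = 0).
    """
    # Keep genome-wide significant SNPs, strongest effect first.
    candidates = sorted(
        (s for s in snps if s["p"] < p_max),
        key=lambda s: abs(s["beta"]),
        reverse=True,
    )
    kept = []
    for snp in candidates:
        # Retain the SNP only if it is not in LD with any retained SNP.
        linked = any(
            r2.get(frozenset((snp["rsid"], k["rsid"])), 0.0) >= r2_max
            for k in kept
        )
        if not linked:
            kept.append(snp)
    return kept


# Toy input for illustration only (not real association data).
toy_snps = [
    {"rsid": "snp_a", "beta": 0.30, "p": 1e-20},
    {"rsid": "snp_b", "beta": 0.25, "p": 1e-15},  # in perfect LD with snp_a
    {"rsid": "snp_c", "beta": 0.10, "p": 1e-9},
    {"rsid": "snp_d", "beta": 0.40, "p": 1e-5},   # not genome-wide significant
]
toy_r2 = {frozenset(("snp_a", "snp_b")): 1.0}
selected = [s["rsid"] for s in select_instruments(toy_snps, toy_r2)]
```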

Statistical analysis
The association between the plasma level of each FA and CRC was examined using MR on summary statistics as per [23]. The ratio estimate (β̂) of all SNPs associated with each fatty acid, combined, on CRC was calculated as follows:

\[ \hat{\beta} = \frac{\sum_k X_k Y_k \sigma_{Y_k}^{-2}}{\sum_k X_k^{2} \sigma_{Y_k}^{-2}} \]

where X_k corresponds to the association of SNP k (as log of the OR per risk allele) with the fatty acid trait Y, and Y_k is the association between SNP k and CRC risk (as log of the OR) with standard error σ_{Y_k}. The estimate for β̂ represents the causal increase in the log odds of CRC per unit change in fatty acids. The standard error of the combined ratio estimate is given by:

\[ \mathrm{se}(\hat{\beta}) = \sqrt{\frac{1}{\sum_k X_k^{2} \sigma_{Y_k}^{-2}}} \]

The statistics for each specific FA generated for each CRC cohort were combined by meta-analysis under fixed-effects models to derive the summary ORs and confidence intervals (CIs). To assess the impact of between-study heterogeneity, we also derived ORs under a random-effects model.

Table 2 Effect sizes for plasma fatty acid content (per standard deviation increase in levels) for genome-wide significant (P < 5 × 10⁻⁸) instrumental variables reported by CHARGE consortium. Taken from CHARGE consortium, as a percentage of total serum fatty acids, calculated by (β² × 2 × MAF × (1 − MAF))/Var(Y) where β is the regression coefficient, MAF is the minor allele frequency and Var(Y) is the variance in levels of the fatty acid. IVs obtained from Refs. [19,20,22].
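The combined ratio estimate and the fixed-effects pooling described in the statistical analysis can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: it assumes the standard inverse-variance weighted summary-statistics estimator, with X_k, Y_k and σ_{Y_k} as defined in the text, and all function names are hypothetical.

```python
import math


def ivw_estimate(x, y, se_y):
    """Combined ratio estimate from summary statistics.

    x:    SNP-exposure associations X_k
    y:    SNP-outcome associations Y_k (log odds ratios)
    se_y: standard errors of Y_k
    Returns (beta_hat, se_beta_hat).
    """
    num = sum(xk * yk / s ** 2 for xk, yk, s in zip(x, y, se_y))
    den = sum(xk ** 2 / s ** 2 for xk, s in zip(x, se_y))
    return num / den, math.sqrt(1.0 / den)


def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects pooling of per-cohort estimates."""
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log odds ratio and its SE to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

With a single SNP the estimator reduces to the ratio Y_k / X_k with standard error σ_{Y_k} / |X_k|, which is a convenient sanity check for any implementation.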
A central tenet in MR is the absence of pleiotropy (i.e. a gene influencing multiple traits) between the SNPs influencing CRC risk and FA levels. This would be revealed as a deviation from a linear relationship between SNPs and their effect size for any FA and CRC risk. To examine for violation of the standard IV assumptions in our analysis, we performed inverse-variance weighted (IVW) and MR-Egger regression tests [24].
We considered a significance level of P ≤ 0.05 as being satisfactory to derive a conclusion. While ordinarily it would be appropriate to impose a Bonferroni-corrected threshold, this assumes independence of IVs across all FA traits, which is not the case in the present analysis. All statistical analyses were undertaken using R version 3.1 software [25].
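MR-Egger regression can be viewed as a weighted regression of the SNP-outcome associations on the SNP-exposure associations with an intercept term; an intercept far from zero suggests directional pleiotropy. The sketch below returns point estimates only and omits standard errors and the intercept test. The function name is hypothetical, the weights 1/σ²_{Y_k} are the usual choice, and it assumes alleles are oriented so that every SNP-exposure association is positive, as described for the effect alleles above.

```python
def mr_egger(x, y, se_y):
    """Weighted least-squares fit of y = intercept + slope * x.

    x:    SNP-exposure associations (oriented to be positive)
    y:    SNP-outcome associations (log odds ratios)
    se_y: standard errors of y, used as inverse-variance weights
    Returns (intercept, slope).
    """
    if len(x) < 3:
        raise ValueError("MR-Egger needs at least three SNPs")
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("SNP-exposure associations must not all be equal")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

The three-SNP minimum reflects the fact that, with only one or two instruments, an intercept and a slope cannot both be estimated with any residual information.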

Expression quantitative trait locus analysis
To examine the relationship between SNP genotype and expression of FA metabolism genes, we performed expression quantitative trait locus (eQTL) analysis using data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project [26,27].

Results
The FA-associated genetic variants and their GWAS-reported characteristics that were used to derive IVs for FAs are detailed in Table 2. A reduced risk of CRC was observed for genetic variants associated with increases in the MUFAs studied (Table 3). In all but one of the seven cohorts increased levels of OA were associated with reduced CRC risk (Fig. 1). In the meta-analysis of these seven cohorts the OR_OA was 0.77 (95% CI: 0.65–0.92, P = 3.9 × 10⁻³) with little evidence of between-study heterogeneity (P_het = 0.23, I² = 26%). Similarly, increased levels of POA were associated with reduced CRC risk with an OR_POA of 0.36 (95% CI: 0.15–0.84, P = 0.018, P_het = 0.08, I² = 47%; Fig. 1).
The ω-6 PUFAs LA and AA both showed association with CRC risk, but in different directions. Specifically, LA was associated with reduced risk (OR_LA = 0.95, 95% CI: 0.93–0.98, P = 3.7 × 10⁻⁴, P_het = 0.03, I² = 57%; Fig. 1) and AA with an increased risk (OR_AA = 1.05, 95% CI: 1.02–1.07, P = 1.7 × 10⁻⁴, P_het = 0.03, I² = 56%). The associations between one standard deviation increase in each of the other PUFAs defined by their respective IVs and CRC risk were null (Supplementary Fig. 1).
To formally assess the impact of heterogeneity on study findings we derived ORs under a random-effects model. Associations between AA, LA and OA and CRC risk remained significant (Table 3). We assessed the impact of possible classical pleiotropism on MR estimates using both IVW and MR-Egger regression tests. There was no evidence for violation of the standard IV assumptions used for MR analysis, such as a dependence on confounders (Table 4).

Table 3 Odds ratios (ORs) and 95% confidence intervals (CI) for one standard deviation increase in genetically predicted plasma fatty acid levels and colorectal cancer risk.

In the present analysis, we used the SNP rs102275 in combination with other SNPs to generate a polygenic risk score for SA, OA and POA, whereas rs174547, which is in LD with rs102275 (r² = 1.0 and D′ = 1.0), was used for DPA, AA, DGLA and LA. Both SNPs annotate the FADS2 gene. FADS2 is a rate-limiting enzyme in the desaturation of LA to AA, and of α-linolenic acid into DHA and EPA (Fig. 2). These FAs are precursors for prostaglandins and leukotrienes, which are key mediators of the inflammatory response. In an eQTL analysis rs174547 and rs102275 genotypes were shown to be strongly correlated with FADS2 expression across a range of different tissue types, including blood (P = 3.98 × 10⁻²⁹), normal colon (P = 1.65 × 10⁻¹⁰) and CRC (P = 2.07 × 10⁻⁵) (Supplementary Table 1).
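The heterogeneity statistics and random-effects ORs reported here can be illustrated with a short sketch. The text does not name the random-effects estimator, so the DerSimonian–Laird method used below is an assumption, and the function name is hypothetical. Inputs are per-cohort log odds ratios and their standard errors.

```python
import math


def random_effects(betas, ses):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (beta, se, q, i2), where q is Cochran's Q and
    i2 is the I^2 heterogeneity statistic in percent.
    """
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effects mean.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    df = len(betas) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    sw_star = sum(w_star)
    beta = sum(wi * b for wi, b in zip(w_star, betas)) / sw_star
    return beta, math.sqrt(1.0 / sw_star), q, i2
```

When the cohorts agree (Q no larger than its degrees of freedom), the between-study variance is zero and the result coincides with the fixed-effects estimate, which is why the two models differ mainly for the FAs with higher I².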

Discussion
While dietary fat intake has been associated with CRC risk, teasing out specific FA associations and their mechanistic basis has proven challenging. A number of observational studies have reported associations between serum levels of specific FAs and CRC [28,29], supporting our findings.
A major strength of the MR strategy to identify causal associations is that it is not influenced by the recall bias and confounding that can affect traditional observational studies. Nevertheless, a key assumption in MR is that the variants used to generate genetic scores are associated with the exposure being queried. Herein, we only made use of SNPs associated with each FA at genome-wide significance from hypothesis-free GWAS. Furthermore, we only used data from individuals of European descent so as to limit bias from population stratification. Another central assumption in MR is that variants are associated with CRC only through the exposure and are not confounded by pleiotropy, which would be revealed by a positive correlation between increasing effect sizes in the IVs and CRC risk. While we did not observe such a relationship, we acknowledge that IVs for a number of the FAs were based on only one or two SNPs, preventing assessment by IVW and MR-Egger analysis. One strategy to overcome this and fully investigate any pleiotropy would be to measure FA serum levels in correlation with CRC risk.
In this analysis, the same SNP (rs102275, or the correlated SNP rs174547) was used to make causal deductions between multiple FAs and CRC risk. Each time, the SNP was used under the assumption that the exposure in question individually accounts for the disease association. The genetic variant association with CRC risk is consequently double-counted, in that the effect is attributed to different FA exposures [30]. With such vertical pleiotropism, single locus MR analyses cannot robustly decipher which FA is primarily driving the relationship with CRC risk. Such considerations have not been addressed in previous studies of the relationship between PUFAs and prostate cancer [31] or between branched-chain amino acids and diabetes [32].
While we did not demonstrate a causal association between other FAs, including several PUFAs and SFAs, and CRC risk, we acknowledge that our power to demonstrate a relationship was limited. For example, with respect to EPA: assuming the variance explained by the alleles is 0.04%, based on epidemiological observational study data, and a relative risk of 1.04, we had <10% power to demonstrate a relationship [33].
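The scale of this power limitation can be illustrated with a common normal approximation for MR with a binary outcome. This is a sketch only and may differ from the calculation in [33]: it assumes the relative risk is an odds ratio per standard deviation of the exposure, that the variance of the causal estimate is approximately 1/(N × R² × p × (1 − p)) with p the proportion of cases, and a two-sided test. The function name is hypothetical; the sample sizes and the 0.04% variance explained are taken from the text.

```python
import math
from statistics import NormalDist


def mr_power_binary(n_total, case_fraction, r2, odds_ratio, alpha=0.05):
    """Approximate two-sided power of an MR analysis with a binary outcome.

    n_total:       total number of cases and controls
    case_fraction: proportion of participants who are cases
    r2:            fraction of exposure variance explained by the instruments
    odds_ratio:    assumed causal OR per SD increase in the exposure
    """
    nd = NormalDist()
    beta = math.log(odds_ratio)
    # Expected z-score of the causal estimate under the alternative.
    ncp = abs(beta) * math.sqrt(
        n_total * r2 * case_fraction * (1.0 - case_fraction)
    )
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


cases, controls = 9254, 18386
epa_power = mr_power_binary(
    n_total=cases + controls,
    case_fraction=cases / (cases + controls),
    r2=0.0004,        # 0.04% of variance explained, as stated in the text
    odds_ratio=1.04,
)
```

Under these assumptions `epa_power` falls below 0.10, in line with the figure quoted above; the approximation also makes clear that power here is driven almost entirely by the very small variance explained.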
Accepting these caveats, we have provided support for differing effects of OA and the ω-6 PUFAs LA and AA on CRC risk. Our findings broadly accord with the findings from many of the published ecological and epidemiological observational studies. Notably, increased levels of AA contribute as a risk factor to CRC development [34,35], while increased intake of olive oil, which is high in OA, is associated with decreased risk [36]. A number of epidemiological studies have provided evidence that a Mediterranean diet, with a higher olive oil intake, is associated with reduced CRC risk [36–38].
In the eQTL analysis, both rs102275 and rs174547 show evidence of cis-regulatory effects on FADS2 expression. Intriguingly, rs174547 has previously been reported to have opposing effects on FADS2 and FADS1 expression in CRC [39]. Collectively, these data provide support for a relationship between diet, genotype, FA metabolism and CRC risk through modulation of an inflammatory response.
Even so, a biological basis for associations between specific FAs and CRC risk remains to be established. It is, however, predicted a priori that within any FA class, different members have different actions and effects. With respect to ω-6, evidence supports inflammatory effects for AA through COX-2 production of inflammatory mediators [40], including prostaglandin E2, which affect CRC carcinogenesis [41–43]. This implies that diets high in AA, such as those rich in meat or eggs, may lead to more inflammatory compounds, which in turn may increase CRC risk. While increasing dietary LA, an essential FA, might potentially enrich tissues with AA due to their metabolic link [44], a gene–environment interaction may exist to influence colon FA content [45]. There is, however, contradictory evidence from studies that have associated LA with both an increased [46] and decreased risk of CRC, possibly by altering ω-6 to ω-3 FA ratios [47] or alternatively production of reactive oxygen species [48]. The ability of aspirin to irreversibly inhibit COX-1 and COX-2, and therefore lower pro-inflammatory signals independent of genotype and diet, has thus proved an attractive option for CRC chemoprevention [49].
In conclusion, irrespective of the biological basis of associations between FAs and CRC risk, our findings are consistent with the observation that the dietary composition of MUFAs in Mediterranean diets is risk reducing, and that a pro-inflammatory diet is risk increasing [50]. While we may not be at a stage where we can justifiably advise individuals to alter their intake of specific FAs to decrease the risk of developing CRC, it seems the current guidelines to moderate total fat and SFA consumption and increase unsaturated FA intake are likely to be beneficial.

Conflict of interest statement
None declared.

[...] provided by the Oxford Comprehensive Biomedical Research Centre and the EU FP7 CHIBCHA grant. Core infrastructure support to the Wellcome Trust Centre for Human Genetics, Oxford was provided by grant (090532/Z/09/Z). We are grateful to many colleagues within UK Clinical Genetics Departments (for CORGI) and to many collaborators who participated in the VICTOR and QUASAR2 trials. We also thank colleagues from the UK National Cancer Research Network (for NSCCG). Support from the European Union (FP7/207-2013, grant 258236) and FP7 collaborative project SYSCOL and COST Action in the UK is also acknowledged (BM1206). The COIN and COIN [...] No. HHSN26120100037C), and the California Department of Public Health (contract HHSN261201000035C) awarded to the University of Southern California. The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute, any SEER program or any of the collaborating centres in the CCFR, nor does mention of trade names, commercial products, or organisations imply endorsement by the US Government, SEER or the CCFR.
We are grateful to all individuals who participated in the various studies. This study made use of genotyping data from the 1958 Birth Cohort, kindly made available by the Wellcome Trust Case Control Consortium 2. A full list of the investigators who contributed to the generation of the data is available at http://www.wtccc.org.uk/.