Elucidating the role of 8q24 in colorectal cancer
The 1000 Genomes Phase I Interim reference panel based on low-coverage (4–6×) sequencing of 1094 individuals from Africa (AFR; n = 181) catalogued 203 047 SNPs mapping to the 16.2 Mb region.
A total of 92 095 SNPs were monomorphic in all five GWASs.
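A count of this kind can be sketched as a set intersection. The example below is a minimal illustration, not the authors' pipeline: it assumes each study is summarised as a mapping from SNP identifier to minor allele count, and both that layout and the name `monomorphic_in_all` are hypothetical.

```python
from typing import Dict, List, Set


def monomorphic_in_all(studies: List[Dict[str, int]]) -> Set[str]:
    """Return SNP IDs whose minor allele count is zero in every study.

    `studies` is a hypothetical layout: one dict per GWAS mapping
    SNP ID -> minor allele count. A SNP missing from a study is not
    treated as monomorphic in that study.
    """
    if not studies:
        return set()
    # Start with the SNPs monomorphic in the first study, then intersect
    # with the monomorphic set of each remaining study.
    result = {snp for snp, mac in studies[0].items() if mac == 0}
    for study in studies[1:]:
        result &= {snp for snp, mac in study.items() if mac == 0}
    return result


# Toy data for illustration only (not taken from the study).
toy_studies = [
    {"rs1": 0, "rs2": 12, "rs3": 0},
    {"rs1": 0, "rs2": 0, "rs3": 0},
    {"rs1": 0, "rs2": 3, "rs3": 1},
]
print(sorted(monomorphic_in_all(toy_studies)))
```

In the toy data only `rs1` has a zero minor allele count in all three studies, so it is the only identifier returned.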
The 10 most highly associated variants identified in imputation with and without the CG panel are detailed in […]. In 13 of the 16 regions, imputation refined the association signal, identifying a region of interest narrower than the original LD block and likely to harbour the functional variant.
However, for three loci (6p21, 12q13 and 16q22.1), the LD structure is large and complex, which precluded delineation of a smaller region of association.
In total, 46 829 of all variants mapping to the 16 regions had frequencies ≥1%, 4658 (10%) of which were not referenced in dbSNP132.
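The frequency threshold and the share of uncatalogued variants can be illustrated with a short sketch. The `Variant` record and the `summarise_common` function are hypothetical names introduced here for illustration; the source does not describe how the variants were stored or counted.

```python
from typing import Iterable, NamedTuple, Tuple


class Variant(NamedTuple):
    # Hypothetical record layout, for illustration only.
    snp_id: str
    maf: float       # minor allele frequency, between 0 and 0.5
    in_dbsnp: bool   # True if the variant is catalogued in dbSNP


def summarise_common(
    variants: Iterable[Variant], maf_min: float = 0.01
) -> Tuple[int, int, float]:
    """Return (n kept, n kept and absent from dbSNP, share absent)."""
    kept = [v for v in variants if v.maf >= maf_min]
    novel = sum(1 for v in kept if not v.in_dbsnp)
    share = novel / len(kept) if kept else 0.0
    return len(kept), novel, share


# Toy records for illustration only (not taken from the study).
toy_variants = [
    Variant("rs1", 0.200, True),
    Variant("rs2", 0.015, False),
    Variant("rs3", 0.004, False),  # below the 1% threshold, dropped
]
print(summarise_common(toy_variants))

# Arithmetic check of the counts quoted in the text.
print(f"{4658 / 46829:.1%}")
```

The final line evaluates to roughly 9.9%, consistent with the rounded 10% quoted above.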
In addition to using 1000 Genomes data, we made use of deep sequencing (30×) data generated on 253 individuals, 199 of whom had been diagnosed with early-onset CRC (henceforth referred to as the CG panel).
It has recently been proposed that many GWAS signals are a consequence of ‘synthetic associations’ resulting from the combined effect of one or more rare causal variants rather than simply linkage disequilibrium (LD) with a common risk variant (12).
To enhance our ability to discover low-frequency risk variants, in addition to using 1000 Genomes Project data as a reference panel, we made use of high-coverage sequencing data on 253 individuals, 199 with early-onset familial CRC.
For 13 of the regions, it was possible to refine the association signal, identifying a smaller region of interest likely to harbour the functional variant.
We used Haploview to define the haplotype blocks and recombination hotspots containing the tag SNPs previously found to be associated with CRC risk at 1q41, 3q26.2, 6p21.2, 8q23.3, 8q24.21, 10p14, 11q13.4, 11q23.1, 12q13, 14q22.2, 15q13.3, 16q22.1, 18q21.1, 19q13.11, 20p12.3 and 20q13.33.
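Haplotype blocks are built from pairwise LD statistics between SNPs. The sketch below computes the two standard measures, |D′| and r², for a pair of biallelic SNPs from phased haplotype counts. It illustrates the underlying quantities only and does not reproduce Haploview's block-definition procedure; the function name `pairwise_ld` and its argument layout are assumptions made for this example.

```python
from typing import Tuple


def pairwise_ld(n_ab: int, n_aB: int, n_Ab: int, n_AB: int) -> Tuple[float, float]:
    """Return (|D'|, r^2) for two biallelic SNPs from phased haplotype counts.

    The four counts are the numbers of haplotypes carrying alleles
    (a, b), (a, B), (A, b) and (A, B) at SNP 1 and SNP 2 respectively.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / n
    p_A = (n_Ab + n_AB) / n  # frequency of allele A at SNP 1
    p_B = (n_aB + n_AB) / n  # frequency of allele B at SNP 2
    d = p_AB - p_A * p_B     # raw disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic; LD is undefined")
    r2 = d * d / denom
    # Normalise D by its maximum attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return abs(d) / d_max, r2


# Toy counts for illustration only: three of four haplotypes observed.
d_prime, r2 = pairwise_ld(n_ab=60, n_aB=0, n_Ab=20, n_AB=20)
print(d_prime, r2)
```

With the toy counts, one of the four possible haplotypes is absent, so |D′| is close to 1 while r² is well below 1. This is why the two measures are usually reported together when judging whether a tag SNP captures a candidate variant.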
We did not include the Xp22.2 locus in these analyses due to the low density of GWAS SNPs on the X chromosome and hence the difficulty involved in imputation.
After filtering, the average MAFs of these successfully imputed variants increased to 0.175 and 0.170, respectively.