Genome-wide analysis of allelic expression imbalance detects cis-regulatory effect of breast cancer causal SNPs identified by OncoArray project (#186)
Genome-wide association studies (GWAS) have substantially improved our map of the genetic complexity of breast cancer. So far, nearly 170 loci associated with breast cancer risk have been identified. Most of the risk variants lie within noncoding genomic regions, including regulatory elements, suggesting that many causal variants act by modulating the expression of nearby genes. Analysis of allelic imbalance (AI) in the expression of nearby genes is a powerful method for identifying cis-acting regulatory effects of such variants.
The OncoArray Consortium recently developed a custom-designed array (the “OncoArray”) and genotyped over 200,000 breast cancer cases and controls, resulting in successful fine-mapping of 81 breast cancer GWAS loci. We analysed the candidate causal SNPs identified in the fine-mapped loci (~3400 SNPs) for allele-specific expression using RNA sequencing data from 3 datasets: TCGA breast cancer samples (N=1136), TCGA normal breast samples (N=113), and GTEx normal breast samples (N=92). For a given risk SNP, the AI of each gene within 1 Mb upstream or downstream was computed by calculating the major allele fraction at each transcribed SNP of the gene and averaging these fractions across the gene. The AIs were then compared between samples heterozygous and samples homozygous for the risk SNP. Nearly 100 genes showed significantly higher AI in heterozygotes, suggesting a cis-regulatory effect of the corresponding risk SNPs. The results were strongly supported by previously published breast cancer data: ~50% of the genes already validated and published as targets of risk SNPs were included in the gene set identified by our AI analysis. Ingenuity pathway analysis of the significant genes identified "DNA replication, recombination, and repair" and "cell death and survival" amongst the top networks, suggesting possible mechanisms for cancer development.
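The AI computation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The function names, the toy read counts, and the choice of a one-sided permutation test are all assumptions made for illustration, since the abstract does not state which statistical test was applied. The sketch also assumes that read counts come only from transcribed SNPs at which the sample itself is heterozygous.

```python
import random
from statistics import mean


def major_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads from the more abundant allele (0.5 = balanced, 1.0 = monoallelic)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads covering this SNP")
    return max(ref_reads, alt_reads) / total


def gene_ai(snp_counts):
    """Gene-level AI: mean major allele fraction over the gene's transcribed SNPs.

    snp_counts is a list of (ref_reads, alt_reads) pairs for one sample.
    """
    return mean(major_allele_fraction(ref, alt) for ref, alt in snp_counts)


def permutation_p_value(het_ai, hom_ai, n_perm=5000, seed=0):
    """One-sided permutation test (an assumed choice of test):
    is mean gene AI higher in risk-SNP heterozygotes than in homozygotes?
    """
    rng = random.Random(seed)
    observed = mean(het_ai) - mean(hom_ai)
    pooled = list(het_ai) + list(hom_ai)
    n_het = len(het_ai)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:n_het]) - mean(pooled[n_het:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one gene, per-sample read counts at two transcribed SNPs.
het_samples = [[(45, 5), (40, 10)], [(38, 12), (44, 6)], [(9, 41), (12, 48)], [(47, 3), (36, 4)]]
hom_samples = [[(26, 24), (22, 28)], [(30, 30), (27, 23)], [(21, 19), (25, 25)], [(24, 26), (31, 29)]]

het_ai = [gene_ai(sample) for sample in het_samples]
hom_ai = [gene_ai(sample) for sample in hom_samples]
print("mean AI, heterozygotes:", mean(het_ai))
print("mean AI, homozygotes:", mean(hom_ai))
print("permutation p-value:", permutation_p_value(het_ai, hom_ai))
```

Because the major allele fraction is used rather than the reference allele fraction, the measure does not depend on which haplotype carries the risk allele, so imbalance in either direction raises the AI of heterozygotes.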
Our study provides a unique resource of information on the regulatory effects of breast cancer causal variants, including a set of cis-regulated genes, each of which could be the focus of a GWAS follow-up study.