Pathogenic implications for autoimmune mechanisms derived by comparative eQTL analysis of CD4+ versus CD8+ T cells

Inappropriate activation or inadequate regulation of CD4+ and CD8+ T cells may contribute to the initiation and progression of multiple autoimmune and inflammatory diseases. Studies on disease-associated genetic polymorphisms have highlighted the importance of biological context for many regulatory variants, which is particularly relevant in understanding the genetic regulation of the immune system and its cellular phenotypes. Here we show cell type-specific regulation of transcript levels of genes associated with several autoimmune diseases in CD4+ and CD8+ T cells including a trans-acting regulatory locus at chr12q13.2 containing the rs1131017 SNP in the RPS26 gene. Most remarkably, we identify a common missense variant in IL27, associated with type 1 diabetes that results in decreased functional activity of the protein and reduced expression levels of downstream IRF1 and STAT1 in CD4+ T cells only. Altogether, our results indicate that eQTL mapping in purified T cells provides novel functional insights into polymorphisms and pathways associated with autoimmune diseases.


3
TargetAmp Nano Labeling Kit for Illumina Expression BeadChip (Epicentre Biotechnologies) with SuperScript III Reverse Transcriptase (Life Technologies) according to the manufacturer's protocol, followed by purification with the RNeasy MinElute Cleanup Kit (Qiagen). RNA quality was assessed after extraction and after labelling using an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit (all from Agilent Technologies). The labelled RNA samples were hybridized to HumanHT12-v4 Expression BeadChips (Illumina) according to the manufacturer's instructions.

Gene expression probe mapping
Illumina HT12-v4 BeadChip probe sequences were mapped to the human genome build GRCh37 using TopHat(3) allowing up to five mismatches. Transcript and genome information was retrieved from Ensembl for genome assembly GRCh37.p13. Expression probes that mapped uniquely to the reference with no mismatches were retained. All expression probes mapping to either the X or Y chromosomes were removed. A total of 38,839 expression probes remained for further analyses.

Gene expression quality control and normalization
Preprocessing and quality control of the data was done using R(4) and the Bioconductor packages lumi (5) and arrayQualityMetrics (6). We chose to be conservative on determining outliers. Therefore, we excluded samples 1) which were considered to be outliers based on at least two criteria (array intensity distribution and/or between array comparison and/or individual array quality) in microarray quality metrics reports; 2) with detection rate (using detection P-value cut-off 0.05) less than 10% in CD4 + and CD8 + T cell samples; 3) which were outliers based on XIST gene and Y chromosome expression patterns. Also, mixups between gene expression samples and genotype samples were corrected using MixupMapper (7). The number of samples retained for further analysis was 293 for CD4 + and 283 for CD8 + T cells, 303 unique individuals. Based on similar studies(8), sample size of 300 individuals is sufficient to ensure statistical power to detect eQTL effects in purified cells.
Further, gene expression data was quantile normalized to the median distribution, and then log2 transformed. Then separately, for both cell types, probe and sample means were centered to zero and scaled. Correction for possible population structure in the gene expression data was done by removing four multi-dimensional scaling components (MDS components obtained from genotype data using PLINK(9)) using linear regression. Next, principal component analysis (PCA) was conducted on the sample correlation matrix to 4 correct for environmental and experimental variation. The optimal number of PCs to remove was determined based on the maximum number of cis-eQTLs (described before in Westra et al. (10)). Only PCs that were not affected by genetic variants (did not show significance at the FDR threshold of 0) were used to correct the gene expression data to minimize the amount of genetic variation removed. The optimal number of PCs to remove was 30 in CD4 + and 45 in CD8 + T cell datasets.

Biological effect of mutant and wild-type IL27
Coding sequences of IL27A (wild-type and mutant) and EBI3 without signal peptides

Cis-and trans-eQTL mapping
The effects of SNPs on local (cis-eQTL) and distant (trans-eQTL) genes were determined via eQTL mapping as described in eQTL mapping analysis cookbook developed by the University Medical Center Groningen at the Genetics Department and the Genomics Coordination Center (https://github.com/molgenis/systemsgenetics/wiki/ eQTL-mapping-analysis-cookbook) and in Westra et al. (10). For cis-analysis the distance between the probe midpoint and SNP genomic location was up to 1 Mb and for trans-analysis the distance was more than 5 Mb or the probe and SNP were on different chromosomes. Only SNPs with a MAF > 0.05, call rate > 0.95 and a Hardy-Weinberg equilibrium P-value > 0.001 were included in the analyses. We used the Spearman correlation coefficient (Spearman's rho) to detect associations between the coding allele of the SNP (directly genotyped or imputed allele dosages) and the variations in gene expression levels (residual gene expression levels obtained after corrections described before).
To control for multiple testing, we applied more conservative probe-level false discovery rate (FDR) procedure to account for the number of SNPs tested per probe (see the rationale for this approach in "Correction for multiple testing" block in the Supplementary Note by Westra et al. (10)). In short, we first generated a null distribution by 10 times performing eQTL mapping using permuted gene expression data with shuffled sample labels.
Then, for both the real and permuted P-value distributions, we only used the most significant SNP per probe to determine the FDR at 0.05. We consider 10 permutations to be sufficient as it has been shown in Westra et al. (10) that FDR threshold estimate was quite stable after 5 permutations and adding additional permutations did not greatly change the significance