T Cell Transcriptomes Describe Patient Subtypes in Systemic Lupus Erythematosus

  • Sean J. Bradley, 
  • Abel Suarez-Fueyo, 
  • David R. Moss, 
  • Vasileios C. Kyttaris, 
  • George C. Tsokos



T cells regulate the adaptive immune response and have altered function in autoimmunity. Systemic Lupus Erythematosus (SLE) has great diversity of presentation and treatment response. Peripheral blood component gene expression affords an efficient platform to investigate SLE immune dysfunction and help guide diagnostic biomarker development for patient stratification.


Gene expression in peripheral blood T cell samples for 14 SLE patients and 4 controls was analyzed by high depth sequencing. Unbiased clustering of genes and samples revealed novel patterns related to disease etiology. Functional annotation of these genes highlights pathways and protein domains involved in SLE manifestation.


We found transcripts for hundreds of genes consistently altered in SLE T cell samples, for which DAVID analysis highlights induction of pathways related to mitochondria, nucleotide metabolism and DNA replication. Fewer genes had reduced mRNA expression, and these were linked to signaling, splicing and transcriptional activity. Gene signatures associated with the presence of dsDNA antibodies, low complement levels and nephritis were detected. T cell gene expression also indicates the presence of several patient subtypes, such as having only a minimal expression phenotype, male type, or severe with or without induction of genes related to membrane protein production.


Unbiased transcriptome analysis of a peripheral blood component provides insight on autoimmune pathophysiology and patient variability. We present an open source workflow and richly annotated dataset to support investigation of T cell biology, develop biomarkers for patient stratification and perhaps help indicate a source of SLE immune dysfunction.


Systemic Lupus Erythematosus (SLE) is a debilitating autoimmune disease affecting primarily women. It involves dysregulation of T and B cells resulting in excessive production of antibodies against self proteins and DNA, immune complex formation and T cell infiltration into tissues. These processes cause a variety of symptoms including arthritis, cytopenia and kidney failure. The etiologic origins of sporadic SLE are unknown, but altered regulation of T cells is well documented [1–3]. Genetic determinants of SLE severity have been elusive in part because of the heterogeneity that marks the disease [4, 5], with the majority of cases caused by genetic predisposition coupled with environmental causes. SLE T cells present a poised activation phenotype associated with lower TCR activation threshold, lipid raft aggregation, increased calcium flux upon activation, and overproduction of inflammatory cytokines. Altered gene expression usually accompanies these functional alterations [6].

Expression signatures in SLE have been addressed primarily in the peripheral blood compartment, where pioneering work by the Pascual group first described the interferon signature [7, 8]. These genes are inducible by the cytokine in vitro and have since been subdivided as being targets of type I or II interferon [9]. Many of these are simultaneously induced in subsets of cells including T and B cells [10] and monocytes [11], providing evidence for shared signaling abnormalities in peripheral blood mononuclear cells.

We assayed steady-state mRNA abundance by sequencing to discover molecular underpinnings of T cell dysfunction in SLE. Alterations in expression reveal patient subtypes marked by induction of genes involved in protein folding on the endoplasmic reticulum, high levels of ribosomal protein genes, or the previously identified interferon signature alone. Substantial differences in T cell expression in men and women were also found. Highlighted genes could represent biomarkers informative for disease management and may also direct investigation into other T-cell driven autoimmune conditions. This methodology is amenable to study of any disease with great variability of symptom presentation if highly relevant tissue can be obtained for transcriptome sequencing.

Materials and Methods

Sample Collection

At least 5 ml of peripheral blood was collected into Lithium Heparin BD vacutainers from 14 SLE patients under treatment at the Lupus Center at the Rheumatology Division of Beth Israel Deaconess Medical Center. All participating patients fulfilled the American College of Rheumatology criteria for the diagnosis of SLE [12]. Blood was likewise obtained from 4 similarly aged healthy female controls. This study was approved by the Institutional Review Board of Beth Israel Deaconess Medical Center. Written informed consent was obtained from all participating subjects and all clinical investigation was conducted according to the principles expressed in the Declaration of Helsinki.

Cell extraction and RNA isolation

RosetteSep T cell purification (StemCell Technologies, Vancouver, Canada) was performed as instructed: blood was incubated for 30 min with a tetrameric antibody mixture against CD14, CD19, CD20/MS4A1, CD36, CD56, CD66b, CD123, GYPA, and CD16/FCGR3A, which binds non-T cells to erythrocytes. Density-gradient centrifugation with Lymphocyte Separation Medium (Cellgro, Manassas, VA) was used to isolate the unstimulated T cells. T cell purity is routinely >93% by this method as determined by CD3 APC/Cy7 HIT3a (Biolegend) staining detected on a Beckman Coulter Gallios Cytometer. RNA was then prepared by Qiagen AllPrep Kit (Valencia, CA) from 3 million T cells with DNase I treatment. Roughly 2 µg of total RNA was submitted for sequencing, and OD260/280 ratios were approximately 2.1.

Sequencing and Analysis

Unstranded cDNA library preparation and sequencing were performed by BGI (Shenzhen, China). Illumina sequencing provided roughly 75e6 paired 90 bp reads per sample (~12 GB of gzipped data each), which were assessed by FastQC and trimmed to allow ~85% mapping to the GRCh37/hg19 assembly by TOPHAT. Expression scores in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) were obtained by CUFFLINKS for the 24262 best annotated genes, including many expressed pseudogenes and noncoding RNAs (S1 Table). For comparisons between different groups of samples, CUFFDIFF2 was used primarily, alongside DESeq2 and nonparametric tests in R, to calculate expression and statistical differences. Heatmaps were generated using median-normalized expression data with GENE-E and NMF clustering was performed on the GenePattern server, both provided by the Broad Institute. Singular Value Decomposition (SVD) was performed at and Venn diagrams were created with BioVenn [13]. Online supplemental files contain methods with specific program commands and R scripts which were implemented in RStudio (S1 Text).
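For readers unfamiliar with the unit, FPKM scales a fragment count by transcript length and by sequencing depth. The Python sketch below shows only that arithmetic. CUFFLINKS itself estimates abundances with a statistical model, so this is not a reimplementation, and the function name, argument names and example numbers are illustrative assumptions rather than anything taken from the study's scripts.

```python
def fpkm(fragments: float, transcript_length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    kilobases = transcript_length_bp / 1e3       # normalize for transcript size
    millions = total_mapped_fragments / 1e6      # normalize for sequencing depth
    return fragments / (kilobases * millions)


# Hypothetical gene: 750 fragments on a 1.5 kb transcript
# in a library of 75 million mapped fragments.
print(round(fpkm(750, 1500, 75_000_000), 2))  # prints 6.67
```

Dividing by length is what allows comparison across genes of different size, and dividing by library size is what allows comparison across samples.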


We sequenced mRNA from peripheral T cells in two men and 12 women with a variety of SLE manifestations and SLEDAI disease scores (Fig 1) [14]. As controls we prepared specimens from 4 healthy women aged 25 to 37. None of the patients were receiving therapy with biological agents (Table 1). To evaluate the cell-type purity of the samples we checked non-T cell marker expression. Of the epitopes used to collect T cells by rosette negative selection, only one had expression above background. CD16/FCGR3A, a receptor on NK and T cells [15], had medium expression and was induced in some patients relative to controls. Negligible CD5 and CD19 signal indicated that the preparation was free from B cells and our T cell purity is routinely >93% based on surface CD3 detection. Genes were stratified by expression (average value of all 18 samples) into classes of high, medium, low and unexpressed (≥34, ≥11, ≥1 and <1 FPKM). Most analysis was carried out on the top quartile of expression in the genome (6047 genes), which included the high and medium classes. Among the highest expressed were B2 microglobulin (B2M), several thymosins and the expected myriad ribosomal and mitochondrial proteins.
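The expression classes described above can be sketched as a simple threshold lookup on the per-gene average. This Python fragment uses the cutoffs stated in the text; the function name and the example values are hypothetical.

```python
from statistics import mean

# Minimum average FPKM for each class, as stated in the text.
CLASS_THRESHOLDS = [("high", 34.0), ("medium", 11.0), ("low", 1.0)]


def expression_class(fpkm_values):
    """Assign a gene to an expression class from its FPKM values across samples."""
    average = mean(fpkm_values)
    for label, minimum in CLASS_THRESHOLDS:
        if average >= minimum:
            return label
    return "unexpressed"


# Hypothetical FPKM values for one gene across a few samples.
print(expression_class([12.0, 15.5, 9.8]))
```

Because the class is assigned from the mean over all samples, a gene that is strongly expressed in only a few samples can still fall into a lower class.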

Fig 1. T cell transcriptome workflow and distribution of major clinical signs in the SLE patient cohort.

A) Samples were obtained from peripheral blood by negative selection and mRNA was sequenced at high depth. CUFFLINKS generated per-sample expression values and CUFFDIFF2 and DESeq2 were used for groupwise comparisons, which were repeated following discovery of novel sample subgroups. B) Frequency of SLE symptom presentation at blood draw for patient samples. Highlighting colors based on patient subtypes determined by downstream analysis.

Although total T cells are readily obtained and efficiently purified, variable numbers of constituent cell types impact transcript abundance. Our pan-T cell view provides breadth and flexibility for study but does not address alterations in cell type frequency, such as lower total lymphocyte counts or lower proportion of CD4 (fraction and absolute amounts) often found in SLE patients. Genes with significant differential expression usually had greater than 2-fold changes in transcript signal, and therefore likely reflect expression changes more than differences in cell type frequency associated with SLE.

SLE T cells display more genes with increased rather than decreased expression

First we sought an overall picture of mRNA abundance changes related to SLE in T cell samples. Differential expression metrics were found by CUFFDIFF2, which calculates groupwise expression in Fragments Per Kilobase of transcript per Million mapped reads (FPKM) to allow comparison across genes of varied size and applies a beta negative binomial distribution to generate a False Discovery Rate (FDR) where q<0.05 is routinely considered significant [16]. A scatterplot shows the distribution of fold change by expression level for genes with a 1.5- or 2-fold difference and those detected as significantly altered (Fig 2A). Count-based methods of differential expression are reported to be advantageous in some scenarios, so we also used DESeq2 which was usually confirmatory (p<0.01). Alterations were stratified at multiple levels to provide flexibility for downstream analysis, some of which performed better with more input genes having more subtle differential expression. For high- and medium-expression genes (6047 with greatest average expression for all samples) at least twice as many displayed increased rather than reduced mRNA expression relative to controls at all thresholds. Alterations at each threshold were similarly distributed in terms of mRNA abundance. One third of the genes showing more than 2-fold expression changes were statistically significant by CUFFDIFF2 (q<0.05) but only one third of these passed a comparable threshold in DESeq2 (p<0.01) (Fig 2B).
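As a reminder of what an FDR-adjusted q value is, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of p values. It is a generic Python illustration, not the CUFFDIFF2 code, and the input p values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q values in the same order as the input p values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest p
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each q is the minimum of
    # p * m / rank over this rank and all larger ranks (enforces monotonicity).
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical per-gene p values.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(q, significant)
```

A gene is then called significant when its q value, rather than its raw p value, falls below the chosen threshold.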

Fig 2. SLE T cells display more genes with increased rather than decreased expression.

A) Distribution of expression stratified at the 1.5-, 2-fold and q<0.05 CUFFDIFF2 significance levels. B) Relationship between q values and expression fold change in SLE relative to control. C) Select genes significantly increased or decreased as determined by sequencing and CUFFDIFF2, with those confirmed by DESeq2 in bold. D) Example constituent data for OAS2 and NR4A2 in control, inactive and active (SLEDAI >6) samples, where error bars represent the median absolute deviation about the median.

Previously reported interferon signature genes including OAS2, ISG15, UBE2L6, IFI35, IFI44, and STAT1 were detected as significantly induced by both analyses [17]. Among highly-expressed and significantly upregulated genes were IL2RG (encoding CD132, the common subunit of receptors for IL-2, -4, -7, -9, -15 and -21), CD53, ENO1 and many immunoglobulin fragment transcripts. Select genes with diverse functions are listed, with those in bold found by both CUFFDIFF2 and DESeq2 (Fig 2C). Genes with significantly reduced mRNA abundance included RGS1 and RGS2, which drive G-protein alpha subunits to their inactive state, EZR, which regulates cytoskeleton-membrane interactions, and several nuclear expression regulators. OAS2 and NR4A2 serve as examples of robust alterations, which usually persisted in patients both with low and high (SLEDAI >6) disease activity (Fig 2D), but were not altered in all patients. Although CCR4 and CCR7 mRNAs were significantly reduced, their surface expression is reported to be increased on SLE T cells [18, 19], suggesting altered post-translational or membrane trafficking regulation for these receptors.

Genes Related to Disease Symptoms

Gene expression markers of SLE symptoms could aid in diagnosis and may point to causative biology. We sought expression changes linked to the presence of increased anti-dsDNA antibodies, low complement levels, or nephritis. Comparison to controls largely recapitulated the overall SLE analysis, showing similar gene expression related to all major symptoms. This was not surprising given that many patients exhibited more than one symptom. We therefore made comparisons among patient samples with and without each SLE manifestation. Although CUFFDIFF and DESeq analyses suggested genes with substantial fold changes and statistically significant differences in expression, the underlying data revealed great vulnerability to outlier expression. We employed the Mann-Whitney nonparametric rank-sum test, based on groupwise median rather than mean expression values, which yielded genes with statistically different expression in samples with and without each clinical sign.
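The study ran its nonparametric tests in R. Purely as an illustration of why a rank-based comparison is robust to outlier expression, the Python sketch below computes a Mann-Whitney U statistic with an exact two-sided p value obtained by enumerating all group assignments, which is feasible for cohorts this small. The function names and the example values are hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U for group_a, exact two-sided p value) by full enumeration."""
    pooled = list(group_a) + list(group_b)
    ranks = average_ranks(pooled)
    n_a = len(group_a)
    observed = sum(ranks[:n_a])
    expected = n_a * (len(pooled) + 1) / 2       # rank sum under the null
    observed_dev = abs(observed - expected)
    extreme = total = 0
    for combo in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(sum(ranks[i] for i in combo) - expected) >= observed_dev - 1e-9:
            extreme += 1
    u_stat = observed - n_a * (n_a + 1) / 2
    return u_stat, extreme / total


# Hypothetical FPKM values for one gene; note the single extreme value,
# which affects the ranks no more than any other largest value would.
with_sign = [22.0, 25.1, 19.4, 310.0, 21.7]
without_sign = [12.3, 14.8, 11.0, 16.2, 13.5, 15.9]
print(mann_whitney_exact(with_sign, without_sign))
```

Only the ordering of the values enters the test, so one sample with extreme expression cannot by itself produce a significant result.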

Samples obtained from patients with increased dsDNA antibodies (titer greater than 1:40 at blood draw) had 579 and 44 genes detected as significant at the p<0.05 and <0.01 levels, respectively (Fig 3A). Most compelling was confirmation of an association with increased expression for LY6E [20]. More mRNA for caspase inhibitor CARD16 and proteasome regulator PSMF1 was also detected. Reductions were evident for mRNA for SEMA4D, which has altered expression in arthritis [21], and ITPKB, encoding an inositol phosphate kinase involved in stem cell division [22, 23]. DESeq2 confirmed the association of all but PSMF1 with this clinical feature (p<0.01), while CUFFDIFF2 confirmed only LY6E. None of the genes associated with other symptoms were confirmed by either analysis.

Fig 3. T cell expression of specific genes linked to major clinical manifestations of SLE.

Select genes with differential expression in women largely unique to each symptom were detected by Mann-Whitney rank sum tests related to A) increased presence of dsDNA antibodies B) Low C3 or C4 complement levels or C) biopsy-confirmed lupus nephritis. Error bars represent median absolute deviation from the median value for each group and *p<0.05, **p<0.01 for pairwise tests conducted on samples from patients with or without each symptom (healthy control data plotted only for comparison).

A similar number of gene expression differences uniquely marked samples obtained from patients having low complement (C3 lower than 90 mg/dl or C4 less than 12 mg/dl). Most of the 388 and 46 genes detected at the p<0.05 and <0.01 levels had increased expression (Fig 3B). Several of these have activity at the endoplasmic reticulum. SEC11C encodes a subunit of microsomal signal peptidase complex while peptidyl-prolyl cis/trans isomerases cyclophilin B (PPIB) and FKBP11 both support protein folding. Increases in BOLA2, which binds glutaredoxin 3 to regulate iron levels [24] and the poorly characterized SCAND1 are also good candidate genes for which increased activation marks this clinical phenotype. The function of these genes hints at a disruption of normal activity rather than increased T cell activity.

Patients with nephritis (confirmed by recent biopsy and usually coincident with proteinuria or hematuria) showed fewer genes with altered expression (58 or 2 for p<0.05 or 0.01) and there were no obvious links between them (Fig 3C). The coefficient of variability for altered genes (p<0.05) was greater for these samples (1.2 versus 0.9 and 1.1 for DNA antibodies and low complement, respectively). Patient L078 biopsy and electron microscopy indicated minimal change disease unrelated to lupus, and exclusion from the non-nephritis group had no effect on the detected signature genes because non-parametric tests are only mildly affected by single sample values. One striking marker was TNFRSF14/HVEM, encoding a coreceptor for herpes virus which transduces immunosuppressive signals from BTLA [25]. Although this mRNA was marginally increased in SLE patients relative to controls, patients with nephritis had significantly lower amounts relative to those without. A similar pattern was found for many genes increased in SLE, including OAS2 and MX1, which may indicate progression of disease beyond functional immune signaling within T cells and on to response to renal breakdown. Other increased mRNAs linked to the presence of nephritis were C1orf86/FAAP20, coding for a DNA repair factor [26], and LINC00339, an uncharacterized noncoding RNA. FOS mRNA was strikingly increased. This member of the AP1 transcription complex has numerous immune roles, and mRNA for its paralog FOSL2 was among those significantly decreased overall in patient samples.

Although we detected mRNAs marking the presence of major clinical signs of SLE, the fold changes were less than expected and the genes involved did not suggest a clear picture of the relationship between T cell expression and cause of the symptoms. This could be due to the small size of our cohort and the fact that multiple overlapping symptoms were present in several patients.

Novel Patient Subtypes are Detected by T cell Expression Clustering

Next, we looked for expression patterns among all patients that might uncover subgroups. We applied unbiased clustering (Pearson) to organize samples by expression similarity in a heatmap, first using genes with at least 1.5-fold expression changes in SLE versus controls. Initial clustering was skewed by outlying expression in single samples. Though potentially interesting, such outliers disrupt visualization of groups with coherent behavior. To clean the data we filtered for genes with coefficients of variation (standard deviation divided by average) between 0.3 and 2, which yielded roughly 1000 genes.
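The filtering and similarity steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample standard deviation is used (the text does not say which estimator was applied), the bounds are treated as inclusive, and the gene names and values are invented. The clustering itself was performed in GENE-E and is not reproduced here; the sketch stops at the Pearson distance matrix that such clustering would consume.

```python
from math import sqrt
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the average (sample SD assumed)."""
    average = mean(values)
    return stdev(values) / average if average else float("inf")


def filter_by_cv(expression, low=0.3, high=2.0):
    """Keep genes whose CV across samples lies within [low, high]."""
    return {gene: values for gene, values in expression.items()
            if low <= coefficient_of_variation(values) <= high}


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    covariance = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = sqrt(sum((b - my) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def sample_distance_matrix(expression):
    """Pearson distance (1 - r) between sample columns of a gene -> values table."""
    columns = list(zip(*expression.values()))
    return [[1 - pearson(a, b) for b in columns] for a in columns]


# Hypothetical genes (rows) across five samples (columns).
table = {
    "GENE_MODERATE": [10, 12, 11, 30, 14],   # CV about 0.54, kept
    "GENE_FLAT": [5, 5, 5, 5, 5],            # CV 0, removed as uninformative
    "GENE_OUTLIER": [0, 0, 0, 0, 100],       # CV about 2.24, removed as outlier-driven
}
print(list(filter_by_cv(table)))
```

The lower bound removes genes that barely vary across samples, while the upper bound removes genes whose variation comes from a single extreme sample.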

Immediately obvious was that three SLE samples cluster with the controls, indicating similar T cell gene expression (Fig 4A). These samples exhibit only a minor T cell expression phenotype we term Type 0. Two of these patients had low disease activity (SLEDAI 0), but L137 had a higher disease score so its control-like expression pattern is surprising, and may be explained by high dose prednisone treatment. Two other sample groups were delineated by high expression for two different sets of genes (red in the bottom middle for three sample columns, or more diffusely in the upper right for nine samples). The middle group contains samples from two men with active disease and a woman (L062) with SLEDAI 1, which we denote Type A. This unbiased organization of samples is striking because active and inactive samples cluster together, uncoupling disease score from T cell expression in some cases.

Fig 4. Unbiased clustering identifies patient subtypes by T cell gene expression.

A) Genes with altered expression in lupus T cells relative to controls, which also showed moderate variation across all samples, were median normalized and Pearson clustered with average linkage, where red and blue denote high and low expression. B) Removal of genes with expression outside of the top quartile permits identification of subtype signature genes. C) Unbiased NMF clustering using the same input genes yields similar patient subgroups. D) SVD clustering of samples provides an approximate metric of sample similarity, where average values for control, affected and all samples are centrally located.

To view modules with common patterns we repeated clustering following removal of entries with lower than top quartile expression. This made additional sample types apparent among those from SLE patients (Fig 4B). At the top of this heatmap is a module of genes induced in four patients with increased immunoglobulin fragment mRNAs as well as two genes whose products act at the endoplasmic reticulum, peptide isomerase chaperone TXNDC5 [27] and MZB1, which promotes IgM assembly. Another module specific to three patients was marked by induction of ARL6IP5/JWA, an ROS-sensitive ER protein involved in DNA repair, prostaglandin receptor PTGER2 and an expressed pseudogene of ribosomal protein RP11. We denote these sample groups as Type B and C, and they were similar both in the cohort of genes and extent of expression change outside of these striking modules. They present a more severe expression phenotype than the other sample groups, but most of the alterations are detectable to a minor degree in Type 0 patient samples. In this view, outlier expression for L078 is readily identified. The induction of several genes unique to individual samples was pervasive in our cohort, and may hint at a common disease mechanism linked to transcriptional regulatory failure.

Membership in each group was somewhat dependent on the algorithm and expression thresholds employed, but the four types of patient samples persisted across clustering schemes. We used other methods to verify the tendency to form these groups. Non-negative matrix factorization (NMF) largely recapitulated the Pearson clustering (Fig 4C) and helped confirm that L078 was most similar to Type B, although it represents an edge case. Singular Value Decomposition (SVD) organization of the samples provided additional evidence for structured similarity, where Type 0 grouped with the controls, Type A is quite separate, and Types B and C are closer together (Fig 4D). In each case Type 0 samples were positioned between the controls and Types B and C, indicating an intermediary or perhaps transitional expression phenotype. We repeated CUFFDIFF analysis to find genes altered in each sample subtype with regard to the controls (S1A Fig) and found more alterations linked to Type B and C, consistent with the segregation resulting from unbiased clustering. Most of the altered expression found in Type 0 was also detected in Type B and C samples. Type B had a greater number of genes that were different from all other sample types (S1B Fig). The clinical data associated with these sample groups did not show striking patterns, other than the fact that Type B and C samples were obtained from patients with more symptoms and higher disease scores (orange and red highlighting, Table 1).

We next sought expression markers capable of partitioning samples into the patient subgroups. CUFFDIFF and DESeq comparisons yielded largely overlapping lists of genes, similar to our findings related to potential mRNA biomarkers of SLE clinical manifestation. Although many genes have significant expression alterations in SLE T cells, most are driven by differences present in less than half of the patients in our cohort due to the relatively mild expression phenotype of samples in subgroups 0 and A. We applied the nonparametric Kruskal–Wallis and Dunn tests, which allow comparison between multiple groups. Putative markers were then prioritized by specificity and high expression (Fig 5). An exception was Type 0 samples, for which LY6E and NME1-NME2 expression was chosen on account of the induction present in all subtypes relative to controls. Type A samples exhibited high levels of DDX17 and ZAP70 mRNA, while increased MZB1 and TXNDC5 in Type B samples are expected to best differentiate them from Type C samples. Type C samples were less obviously marked but displayed higher levels of PTGER2 and ARL6IP5 mRNA compared to Type A.
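The multi-group comparison can be illustrated with a compact Python sketch of the Kruskal-Wallis H statistic followed by Bonferroni-corrected Dunn pairwise tests. It assumes no correction for tied ranks and uses the normal approximation for the pairwise p values, so it is a simplified stand-in for the R routines used in the study; the group labels and values are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_dunn(groups):
    """groups: dict name -> values. Returns (H, {(a, b): Bonferroni-adjusted p})."""
    names = list(groups)
    pooled = [value for name in names for value in groups[name]]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    mean_rank, sizes = {}, {}
    h_sum, start = 0.0, 0
    for name in names:
        n = len(groups[name])
        rank_sum = sum(ranks[start:start + n])
        h_sum += rank_sum ** 2 / n
        mean_rank[name], sizes[name] = rank_sum / n, n
        start += n
    h_stat = 12 / (n_total * (n_total + 1)) * h_sum - 3 * (n_total + 1)
    pairs = list(combinations(names, 2))
    adjusted = {}
    for a, b in pairs:
        # Dunn z statistic without tie correction (simplifying assumption).
        se = sqrt(n_total * (n_total + 1) / 12 * (1 / sizes[a] + 1 / sizes[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        adjusted[(a, b)] = min(1.0, p * len(pairs))  # Bonferroni correction
    return h_stat, adjusted


# Hypothetical FPKM values for one candidate marker gene.
example = {"control": [4.1, 5.0, 4.6], "type_A": [6.2, 7.9, 7.1], "type_B": [15.3, 18.8, 21.0]}
h, pairwise = kruskal_dunn(example)
print(h, pairwise)
```

With groups of only three or four samples the pairwise tests have little power, which is consistent with the observation that not every marker differed significantly from every other group.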

Fig 5. Marker gene expression suggested to identify SLE patient subtypes.

Genes were selected based on their ability to differentiate first all SLE samples from controls (top) and then subtypes A or B from the others. Although not all of the genes selected had statistically different expression from all other groups, their use in concert is expected to be sufficient for stratification. Error bars represent median absolute deviation from the median value for each group, and * signifies p<0.05 by Kruskal-Wallis rank-sum followed by Bonferroni-corrected Dunn post test.

Immunosuppression is a serious confounder of human autoimmunity studies, so we also looked for expression differences related to prednisone use. CUFFDIFF and DESeq detected relatively few significant alterations (88 and 4 respectively at q<0.05 and p<0.01), for which only TXNDC5 and MZB1 had been highlighted as related to patient status or subtype. We next looked among the 324 genes with at least 1.5-fold expression change in patients with or without prednisone administration for genes of interest in other comparisons. This more liberal view revealed FOS and LINC00339 (induction linked to nephritis), as well as MZB1 and TXNDC5 (induced in Type B patient samples), as increased in patients under steroid treatment. Mann-Whitney analysis detected 139 genes at the p<0.05 threshold, of which only FOS, LINC00339, and C1orf86 had been previously noted (all increased in nephritis). These results indicate that prednisone may underlie expression that we found related to nephritis or Type B status. There is considerable overlap for these attributes in our cohort, especially for nephritis and prednisone use. However, based on relatively high expression of Type B markers even for patient L078, who was steroid-free, we expect confounding effects to be ruled out in future studies. We suspect patient L137 to be present in the mild phenotype subgroup on account of high dose prednisone treatment.

Functional Annotation detects Pathways and Protein Domains Linked to SLE Expression Changes

DAVID analysis provides a literature-based overview of biological functions related to differential expression [28], where KEGG pathways and INTERPRO protein domains offer concise and non-redundant terminology. We compared SLE samples to controls, and also pooled controls with mild expression phenotype samples (Type 0 and A females) for comparison with grouped Type B and C samples. Patient subtype information strengthened this analysis because the comparison of samples with weak or strong expression phenotypes detected more significant terms, and more genes associated with each, than did the initial control versus SLE analysis (Fig 6).

Fig 6. Distinct biological pathways and protein domains are identified following sample clustering.

The number of genes with altered expression used for each query is in parentheses for each comparison. Significant terms with Benjamini q values <0.05 are in bold. More terms were detected for Control versus SLE samples by removal of male samples (CON v SLE(F)). Further refinement was obtained by grouping minor expression phenotype samples instead with controls (C+0+A(F) v B+C). Redundant and nonspecific terms were discarded and the remaining were ranked by the number of genes associated.

For genes induced in SLE, significantly related KEGG pathways are readily associated with activated and proliferating immune cells, and included Oxidative Phosphorylation (37 genes), Lysosome (26), Proteasome (19), Antigen Processing (16), Glycolysis (14), N-Glycan biosynthesis (11) and Fatty Acid Elongation (5). Significant INTERPRO terms included Immunoglobulin (33) and NAD(P) Binding (27) domains. Down-regulated genes were associated with signaling and nuclear pathways including Spliceosome (13 genes), FC receptor RI Signaling (8), ErbB Signaling (9), Apoptosis (9) and Circadian Rhythms (4). Genes with reduced expression in SLE were enriched for domains related to signaling and gene expression, including Kinase (31), Zinc-Finger (13), Basic Leucine Zipper (9) and Jumonji Transcription Factor (6) motifs. The genes related to each term are listed in the supplement (S2 Table). Analysis of patient subtype samples did not offer compelling differences from these ontology terms, presumably because fewer genes were specific to each.
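Term enrichment of this kind is commonly scored with a hypergeometric (one-sided Fisher) test. DAVID reports a more conservative variant of that test (the EASE score), so the Python sketch below illustrates the principle only and will not reproduce DAVID's values. The function name is hypothetical and every count in the example is invented.

```python
from math import comb


def enrichment_p(hits_in_list, list_size, term_size, background_size):
    """Upper-tail hypergeometric p value.

    Probability of drawing at least `hits_in_list` term members when
    `list_size` genes are sampled without replacement from a background of
    `background_size` genes that contains `term_size` term members.
    """
    max_hits = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, list_size - k)
        for k in range(hits_in_list, max_hits + 1)
    )
    return tail / total


# Invented example: 12 of 300 altered genes belong to a 100-gene pathway,
# against a background of 6000 expressed genes (about 5 expected by chance).
print(enrichment_p(12, 300, 100, 6000))
```

The choice of background matters: restricting it to genes expressed in T cells, rather than the whole genome, avoids inflating enrichment for terms that are simply lymphocyte-specific.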

Induced Transcription Factors Are Suggested to Regulate Induced Genes

We next looked at ENCODE immunoprecipitation data for factors detected in chromatin that might share responsibility for any observed expression changes. Examination of the body and 3kb flanking regions of genes induced at least 2-fold in SLE T cells revealed thousands of binding events. As the consortium data is derived from various cell lines, we looked at the expression of these binding ChIP targets in our data. Among those with signal at induced loci, nine had greater than 1.5-fold increased mRNA in SLE (Fig 7A), most of which are known to impact lymphocyte development or activity. Minimally characterized in T cells was WRNIP1 (Werner helicase interacting protein 1), an ATP-dependent DNA-binding protein related to DNA repair [29, 30]. As they show increased mRNA and are found at induced loci, these factors likely act as transcriptional activators.
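Counting binding events in the gene body plus 3 kb flanks amounts to an interval-overlap tally per factor. The Python sketch below shows one way to do this under stated assumptions: coordinates are plain integers on named chromosomes, each peak is counted once even if it falls in the windows of two neighboring genes, and the tuple layouts, function name and coordinates are all hypothetical.

```python
from collections import Counter

FLANK_BP = 3000  # 3 kb flanking window, as described in the text


def count_binding_sites(genes, peaks, flank=FLANK_BP):
    """Count ChIP peaks per factor that overlap any gene body +/- flank.

    genes: iterable of (chrom, start, end)
    peaks: iterable of (factor, chrom, start, end)
    """
    counts = Counter()
    for factor, peak_chrom, peak_start, peak_end in peaks:
        for gene_chrom, gene_start, gene_end in genes:
            same_chrom = peak_chrom == gene_chrom
            overlaps = peak_start <= gene_end + flank and peak_end >= gene_start - flank
            if same_chrom and overlaps:
                counts[factor] += 1
                break  # count each peak once, even if gene windows overlap
    return counts


# Invented coordinates for one induced gene and four peaks.
induced_genes = [("chr1", 10_000, 20_000)]
chip_peaks = [
    ("YY1", "chr1", 7_500, 7_600),      # inside the upstream flank
    ("YY1", "chr1", 30_000, 30_100),    # too far downstream
    ("RUNX3", "chr2", 15_000, 15_100),  # wrong chromosome
    ("RUNX3", "chr1", 22_900, 23_050),  # straddles the downstream flank edge
]
print(count_binding_sites(induced_genes, chip_peaks))
```

The nested loop is adequate for a few hundred genes; for genome-scale peak sets a sorted or indexed interval structure would be the more practical choice.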

Fig 7. ENCODE ChIP analysis identifies factors which account for induced expression in SLE T cells.

The number of chromatin immunoprecipitation binding sites within a 3kb window about significantly induced genes was counted. A) Factors binding near induced genes which themselves are induced. B) Factors binding near these genes having reduced mRNA. C) ChIP factors with unchanged mRNA and more than 100 binding sites near induced loci. D) Expression of select chromatin factors by sample type, where error bars represent the median absolute deviation from the median.

A greater number of transcription/chromatin factors were reduced in expression (Fig 7B), a trend detected by DAVID analysis. Among the 14 ChIP targets with mRNAs reduced at least 1.5-fold, several have little described role in T cells. These include ZNF274, which recruits repressive factors SETDB1 and TRIM28/KAP1 [31], chromatin modifiers CHD1 and 2 [32] and ZBTB7A, which represses glycolytic genes [33].

Because mRNA levels are frequently unchanged for transcription factors directing an expression program, we checked ChIP targets unaltered in SLE and depict those with at least 100 sites at induced loci (Fig 7C). Several top hits are well known to influence T cell biology. Runx3 is critical for thymocyte development [34] and YY1 influences Th2 cytokine production [35]. SMC3 and RAD21 interact with MXI1 (found among ChIP targets with decreased mRNA) to function in the cohesin complex, and the former is associated with atopic asthma [36]. Expression for these ChIP factors was usually similar in Type B and C samples (Fig 7D).


Patient variability challenges diagnosis and treatment of many diseases, and peripheral blood provides a window into health status capable of reporting on tissues throughout the body. The transcriptomes of peripheral blood components show variation with circadian [37] and seasonal [38] periodicity and are growing in descriptive utility in autoimmunity [39] and other clinical settings including transplantation [40], heart failure [41], amyotrophic lateral sclerosis [42] and several cancers [43, 44], where patient subtypes can be identified based on tumor immune cell expression [45]. Expression analysis in disease-relevant tissue is also useful for prioritization of genomic variants [46].

SLE patients present great clinical heterogeneity as a result of genetic diversity and epigenetic changes related to immunological memory. Robust molecular diagnostics have the potential to guide treatment and describe the causes of the disease. Our mRNA analysis identified new genes related to T cell dysfunction and confirmed induction of interferon signature genes (ISG), including OAS2 which we previously showed is specific to SLE autoimmunity [47]. Patient stratification by ISG expression, however, remains poorly correlated to disease activity [48]. While many of the pathways and domains we found are unsurprising, their notation will aid study of lymphocyte function. Induction of small groups of genes unique to single patients was unexpected, and may prove to be a source of pathological variability related to more common failures of transcriptional repression. Study of nuclear regulators may be most fruitful, in light of the persistent hypomethylation and expression activation displayed by SLE T cells [49–51].

We expected patients with various clinical signs to show more distinctive expression patterns, as has been shown for rheumatoid arthritis [52]. Our data indicate that the extent of expression alteration in T cells correlates more with the severity of disease than with which major symptoms were apparent. Relative to controls, both Type B and C sample groups had more genes with significant expression differences, and higher average SLEDAI scores, than did Type 0. We were encouraged that expression analysis detects subtypes of patients. Although these expression phenotypes do not correlate with specific symptoms, they will support patient stratification for study of SLE T cell function. We expect that patient subtypes, at least with regard to sexual dimorphism, will increase discriminatory power and reveal common symptoms or treatment response upon subsequent analysis of a larger cohort.

Several genes primarily associated with B cells, such as CD38 and MZB1, had striking induction in SLE T cells. This may mark an aberrant or immature cell type that is resistant to negative selection T cell purification. Signals related to B cell activation may mistakenly be received on or within T cells, driving trans- or dedifferentiation to a close lineage member. Expression of immunoglobulin transcript fragments is perplexing, and their profound induction in some patients indicates a potential mechanism of disease, again, perhaps related to a transcriptional or chromatin regulatory lesion. While we saw no evidence of Ig protein, the number of loci and degree of induction for these and several genes with products located on the endoplasmic reticulum offer strong evidence that regulation of antibody production deserves further examination.

We conclude that study of diseases hampered by patient heterogeneity can be supported by high coverage transcriptomics of purified tissues, even in small cohorts. Patient subtypes and biomarkers associated with symptoms may both be revealed by unbiased analysis if expression variability is carefully considered. The genes and domains suggested herein will hopefully aid in the study of SLE lymphocyte biology and eventually support clinical decision making.

Supporting Information

S1 Data. This file contains expression and comparison information extracted from S1 Table for use as input for analysis in R.


S1 Fig. T cell sample types B and C present the most alterations relative to controls.

A) Gene counts for 1.5- and 2-fold expression changes (FPKM) apparent in various comparisons show that sample Types B and C have the most extreme expression phenotypes relative to controls. The number of samples used as input is listed in parentheses for each comparison. B) Overlaps of mRNAs increased or reduced at least 1.5-fold in abundance in three sets of three comparisons. Left, refinement effect on the overall control v SLE analysis. Middle, patient Types B and C show most of the altered genes found in Type 0 in addition to many others. Right, comparisons of clinical signs show greater similarity between increased dsDNA antibody and low complement samples, and that nephritis is accompanied by reductions in many mRNAs.
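The counts and overlaps summarized in S1 Fig can be illustrated with a short Python sketch. It is not the Cuffdiff-based workflow described in S1 Text: the gene IDs and FPKM values are made up, and the pseudocount added before taking ratios is an assumption of this sketch, included only to avoid division by zero for unexpressed genes.

```python
import math


def altered_genes(control_fpkm, case_fpkm, fold=1.5, pseudocount=1.0):
    """Return (increased, reduced) sets of gene IDs at the given fold threshold.

    Both inputs map gene ID -> mean FPKM. The pseudocount is an assumption
    of this sketch, not a parameter reported in the paper.
    """
    cutoff = math.log2(fold)
    increased, reduced = set(), set()
    for gene in control_fpkm.keys() & case_fpkm.keys():
        log_ratio = math.log2(
            (case_fpkm[gene] + pseudocount) / (control_fpkm[gene] + pseudocount)
        )
        if log_ratio >= cutoff:
            increased.add(gene)
        elif log_ratio <= -cutoff:
            reduced.add(gene)
    return increased, reduced


# Hypothetical mean FPKM per group; G1..G4 are placeholder gene IDs.
control = {"G1": 9.0, "G2": 19.0, "G3": 4.0, "G4": 9.0}
type_b = {"G1": 29.0, "G2": 4.0, "G3": 4.0, "G4": 15.0}
type_0 = {"G1": 15.0, "G2": 19.0, "G3": 4.0, "G4": 9.0}

up_b, down_b = altered_genes(control, type_b, fold=1.5)
up_b_2x, down_b_2x = altered_genes(control, type_b, fold=2.0)
up_0, down_0 = altered_genes(control, type_0, fold=1.5)

# Genes increased in both comparisons: one region of a Venn-style overlap.
shared_up = up_b & up_0
```

In this toy input the Type B comparison yields more altered genes than the Type 0 comparison and contains the single gene altered in Type 0, which is the kind of relationship panel B depicts.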


S1 Table. This file contains the gene expression data resulting from the Cufflinks and Cuffdiff analysis, including FPKM and descriptions for the 24,263 best-annotated human genes along with metrics from the comparisons performed.

The raw sequence data has been deposited at the Sequence Read Archive under Bioproject Accession ID PRJNA293549.


S2 Table. This file includes lists of ENSG IDs for all Cuffdiff comparisons yielding greater than 1.5-fold mRNA expression changes and the DAVID analysis of KEGG pathway and InterPro domain terms for the refined SLE versus control comparison.


S3 Table. This file includes count-based expression data (matrix_counts) and the comparison matrix inputs for DESeq analysis and an overview of the results.


S1 Text. This document contains supplemental methods information including commands and scripts employed for informatics analysis.



This work was funded by R01AI042269 to GCT, R01AR060849 to VK, and SB is funded under training grant T32 AI074549. We thank members of the Tsokos lab for comments during manuscript development. We also thank James Kozubek, Nikolaus Obholzer and Nathalie Pochet for assistance with initial sequencing analysis, as well as Kenneth Westerman and Kristina Holton at Harvard Research Computing for help with downstream analysis. We are also extremely grateful to all patients participating in our ongoing SLE studies.

Author Contributions

Conceived and designed the experiments: SJB VCK GCT. Performed the experiments: SJB DRM VCK. Analyzed the data: SJB ASF DRM. Contributed reagents/materials/analysis tools: SJB DRM VCK. Wrote the paper: SJB ASF VCK GCT.


  1. Koga T, Hedrich CM, Mizui M, Yoshida N, Otomo K, Lieberman LA, et al. CaMK4-dependent activation of AKT/mTOR and CREM-alpha underlies autoimmunity-associated Th17 imbalance. J Clin Invest. 2014;124(5):2234–45. Epub 2014/03/29. pmid:24667640; PubMed Central PMCID: PMC4001553.
  2. Kis-Toth K, Tsokos GC. Engagement of SLAMF2/CD48 prolongs the time frame of effective T cell activation by supporting mature dendritic cell survival. J Immunol. 2014;192(9):4436–42. Epub 2014/03/29. pmid:24670806; PubMed Central PMCID: PMC4017928.
  3. Mizui M, Koga T, Lieberman LA, Beltran J, Yoshida N, Johnson MC, et al. IL-2 protects lupus-prone mice from multiple end-organ damage by limiting CD4-CD8- IL-17-producing T cells. J Immunol. 2014;193(5):2168–77. Epub 2014/07/27. pmid:25063876; PubMed Central PMCID: PMC4135016.
  4. Dai C, Deng Y, Quinlan A, Gaskin F, Tsao BP, Fu SM. Genetics of systemic lupus erythematosus: immune responses and end organ resistance to damage. Curr Opin Immunol. 2014;31:87–96. Epub 2014/12/03. pmid:25458999; PubMed Central PMCID: PMC4274270.
  5. Kariuki SN, Ghodke-Puranik Y, Dorschner JM, Chrabot BS, Kelly JA, Tsao BP, et al. Genetic analysis of the pathogenic molecular sub-phenotype interferon-alpha identifies multiple novel loci involved in systemic lupus erythematosus. Genes Immun. 2015;16(1):15–23. Epub 2014/10/24. pmid:25338677; PubMed Central PMCID: PMC4305028.
  6. Moulton VR, Tsokos GC. T cell signaling abnormalities contribute to aberrant immune cell function and autoimmunity. J Clin Invest. 2015:1–8. Epub 2015/05/12. pmid:25961450.
  7. Bennett L, Palucka AK, Arce E, Cantrell V, Borvak J, Banchereau J, et al. Interferon and granulopoiesis signatures in systemic lupus erythematosus blood. J Exp Med. 2003;197(6):711–23. Epub 2003/03/19. pmid:12642603; PubMed Central PMCID: PMC2193846.
  8. Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N, et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity. 2008;29(1):150–64. Epub 2008/07/18. pmid:18631455; PubMed Central PMCID: PMC2727981.
  9. Chiche L, Jourde-Chiche N, Whalen E, Presnell S, Gersuk V, Dang K, et al. Modular transcriptional repertoire analyses of adults with systemic lupus erythematosus reveal distinct type I and type II interferon signatures. Arthritis Rheumatol. 2014;66(6):1583–95. Epub 2014/03/20. pmid:24644022; PubMed Central PMCID: PMC4157826.
  10. Becker AM, Dao KH, Han BK, Kornu R, Lakhanpal S, Mobley AB, et al. SLE peripheral blood B cell, T cell and myeloid cell transcriptomes display unique profiles and each subset contributes to the interferon signature. PLoS One. 2013;8(6):e67003. Epub 2013/07/05. pmid:23826184; PubMed Central PMCID: PMC3691135.
  11. Shi L, Zhang Z, Yu AM, Wang W, Wei Z, Akhter E, et al. The SLE transcriptome exhibits evidence of chronic endotoxin exposure and has widespread dysregulation of non-coding and coding RNAs. PLoS One. 2014;9(5):e93846. Epub 2014/05/07. pmid:24796678; PubMed Central PMCID: PMC4010412.
  12. Hochberg MC. Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum. 1997;40(9):1725. Epub 1997/10/27. pmid:9324032.
  13. Hulsen T, de Vlieg J, Alkema W. BioVenn—a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics. 2008;9:488. Epub 2008/10/18. pmid:18925949; PubMed Central PMCID: PMC2584113.
  14. Bombardier C, Gladman DD, Urowitz MB, Caron D, Chang CH. Derivation of the SLEDAI. A disease activity index for lupus patients. The Committee on Prognosis Studies in SLE. Arthritis Rheum. 1992;35(6):630–40. Epub 1992/06/01. pmid:1599520.
  15. Clemenceau B, Vivien R, Berthome M, Robillard N, Garand R, Gallot G, et al. Effector memory alphabeta T lymphocytes can express FcgammaRIIIa and mediate antibody-dependent cellular cytotoxicity. J Immunol. 2008;180(8):5327–34. Epub 2008/04/09. pmid:18390714.
  16. Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013;31(1):46–53. Epub 2012/12/12. pmid:23222703; PubMed Central PMCID: PMC3869392.
  17. Li QZ, Zhou J, Lian Y, Zhang B, Branch VK, Carr-Johnson F, et al. Interferon signature gene expression is correlated with autoantibody profiles in patients with incomplete lupus syndromes. Clin Exp Immunol. 2010;159(3):281–91. Epub 2009/12/09. pmid:19968664; PubMed Central PMCID: PMC2819494.
  18. Yang PT, Kasai H, Zhao LJ, Xiao WG, Tanabe F, Ito M. Increased CCR4 expression on circulating CD4(+) T cells in ankylosing spondylitis, rheumatoid arthritis and systemic lupus erythematosus. Clin Exp Immunol. 2004;138(2):342–7. Epub 2004/10/23. pmid:15498047; PubMed Central PMCID: PMC1809206.
  19. Sen Y, Chunsong H, Baojun H, Linjie Z, Qun L, San J, et al. Aberration of CCR7 CD8 memory T cells from patients with systemic lupus erythematosus: an inducer of T helper type 2 bias of CD4 T cells. Immunology. 2004;112(2):274–89. Epub 2004/05/19. pmid:15147571; PubMed Central PMCID: PMC1782491.
  20. Feng X, Wu H, Grossman JM, Hanvivadhanakul P, FitzGerald JD, Park GS, et al. Association of increased interferon-inducible gene expression with disease activity and lupus nephritis in patients with systemic lupus erythematosus. Arthritis Rheum. 2006;54(9):2951–62. Epub 2006/09/02. pmid:16947629.
  21. Yoshida Y, Ogata A, Kang S, Ebina K, Shi K, Nojima S, et al. Semaphorin 4D Contributes to Rheumatoid Arthritis by Inducing Inflammatory Cytokine Production: Pathogenic and Therapeutic Implications. Arthritis Rheumatol. 2015;67(6):1481–90. Epub 2015/02/25. pmid:25707877.
  22. Siegemund S, Rigaud S, Conche C, Broaten B, Schaffer L, Westernberg L, et al. IP3 3-kinase B controls hematopoietic stem cell homeostasis and prevents lethal hematopoietic failure in mice. Blood. 2015;125(18):2786–97. Epub 2015/03/20. pmid:25788703; PubMed Central PMCID: PMC4416530.
  23. Hoofd C, Devreker F, Deneubourg L, Deleu S, Nguyen TM, Sermon K, et al. A specific increase in inositol 1,4,5-trisphosphate 3-kinase B expression upon differentiation of human embryonic stem cells. Cell Signal. 2012;24(7):1461–70. Epub 2012/03/27. pmid:22446005.
  24. Li H, Mapolelo DT, Randeniya S, Johnson MK, Outten CE. Human glutaredoxin 3 forms [2Fe-2S]-bridged complexes with human BolA2. Biochemistry. 2012;51(8):1687–96. Epub 2012/02/09. pmid:22309771; PubMed Central PMCID: PMC3331715.
  25. Derre L, Rivals JP, Jandus C, Pastor S, Rimoldi D, Romero P, et al. BTLA mediates inhibition of human tumor-specific CD8+ T cells that can be partially reversed by vaccination. J Clin Invest. 2010;120(1):157–67. Epub 2009/12/30. pmid:20038811; PubMed Central PMCID: PMC2799219.
  26. Yan Z, Guo R, Paramasivam M, Shen W, Ling C, Fox D 3rd, et al. A ubiquitin-binding protein, FAAP20, links RNF8-mediated ubiquitination to the Fanconi anemia DNA repair network. Mol Cell. 2012;47(1):61–75. Epub 2012/06/19. pmid:22705371; PubMed Central PMCID: PMC3398238.
  27. Jin Y, Sharma A, Bai S, Davis C, Liu H, Hopkins D, et al. Risk of type 1 diabetes progression in islet autoantibody-positive children can be further stratified using expression patterns of multiple genes implicated in peripheral blood lymphocyte activation and function. Diabetes. 2014;63(7):2506–15. Epub 2014/03/07. pmid:24595351; PubMed Central PMCID: PMC4066338.
  28. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. Epub 2009/01/10. pmid:19131956.
  29. Crosetto N, Bienko M, Hibbert RG, Perica T, Ambrogio C, Kensche T, et al. Human Wrnip1 is localized in replication factories in a ubiquitin-binding zinc finger-dependent manner. J Biol Chem. 2008;283(50):35173–85. Epub 2008/10/10. pmid:18842586; PubMed Central PMCID: PMC3259897.
  30. Ramakrishnan R, Liu H, Donahue H, Malovannaya A, Qin J, Rice AP. Identification of novel CDK9 and Cyclin T1-associated protein complexes (CCAPs) whose siRNA depletion enhances HIV-1 Tat function. Retrovirology. 2012;9:90. Epub 2012/11/01. pmid:23110726; PubMed Central PMCID: PMC3494656.
  31. Cruvinel E, Budinetz T, Germain N, Chamberlain S, Lalande M, Martins-Taylor K. Reactivation of maternal SNORD116 cluster via SETDB1 knockdown in Prader-Willi syndrome iPSCs. Hum Mol Genet. 2014;23(17):4674–85. Epub 2014/04/25. pmid:24760766.
  32. Siggens L, Cordeddu L, Ronnerblad M, Lennartsson A, Ekwall K. Transcription-coupled recruitment of human CHD1 and CHD2 influences chromatin accessibility and histone H3 and H3.3 occupancy at active chromatin regions. Epigenetics Chromatin. 2015;8(1):4. Epub 2015/01/27. pmid:25621013; PubMed Central PMCID: PMC4305392.
  33. Liu XS, Haines JE, Mehanna EK, Genet MD, Ben-Sahra I, Asara JM, et al. ZBTB7A acts as a tumor suppressor through the transcriptional repression of glycolysis. Genes Dev. 2014;28(17):1917–28. Epub 2014/09/04. pmid:25184678; PubMed Central PMCID: PMC4197949.
  34. Egawa T, Tillman RE, Naoe Y, Taniuchi I, Littman DR. The role of the Runx transcription factors in thymocyte differentiation and in homeostasis of naive T cells. J Exp Med. 2007;204(8):1945–57. Epub 2007/07/25. pmid:17646406; PubMed Central PMCID: PMC2118679.
  35. Hwang SS, Kim YU, Lee S, Jang SW, Kim MK, Koh BH, et al. Transcription factor YY1 is essential for regulation of the Th2 cytokine locus and for Th2 cell differentiation. Proc Natl Acad Sci U S A. 2013;110(1):276–81. Epub 2012/12/19. pmid:23248301; PubMed Central PMCID: PMC3538243.
  36. Cheng Q, Huang W, Chen N, Shang Y, Zhang H. SMC3 may play an important role in atopic asthma development. Clin Respir J. 2014. Epub 2014/12/18. pmid:25515564.
  37. Whitney AR, Diehn M, Popper SJ, Alizadeh AA, Boldrick JC, Relman DA, et al. Individuality and variation in gene expression patterns in human blood. Proc Natl Acad Sci U S A. 2003;100(4):1896–901. Epub 2003/02/13. pmid:12578971; PubMed Central PMCID: PMC149930.
  38. Dopico XC, Evangelou M, Ferreira RC, Guo H, Pekalski ML, Smyth DJ, et al. Widespread seasonal gene expression reveals annual differences in human immunity and physiology. Nat Commun. 2015;6:7000. Epub 2015/05/13. pmid:25965853; PubMed Central PMCID: PMC4432600.
  39. McKinney EF, Lee JC, Jayne DR, Lyons PA, Smith KG. T-cell exhaustion, co-stimulation and clinical outcome in autoimmunity and infection. Nature. 2015;523(7562):612–6. Epub 2015/07/01. pmid:26123020.
  40. Roedder S, Li L, Alonso MN, Hsieh SC, Vu MT, Dai H, et al. A Three-Gene Assay for Monitoring Immune Quiescence in Kidney Transplantation. J Am Soc Nephrol. 2014. Epub 2014/11/28. pmid:25429124.
  41. Maciejak A, Kiliszek M, Michalak M, Tulacz D, Opolski G, Matlak K, et al. Gene expression profiling reveals potential prognostic biomarkers associated with the progression of heart failure. Genome Med. 2015;7(1):26. Epub 2015/05/20. pmid:25984239; PubMed Central PMCID: PMC4432772.
  42. Ladd AC, Keeney PM, Govind MM, Bennett JP Jr. Mitochondrial oxidative phosphorylation transcriptome alterations in human amyotrophic lateral sclerosis spinal cord and blood. Neuromolecular Med. 2014;16(4):714–26. Epub 2014/08/02. pmid:25081190.
  43. Liong ML, Lim CR, Yang H, Chao S, Bong CW, Leong WS, et al. Blood-based biomarkers of aggressive prostate cancer. PLoS One. 2012;7(9):e45802. Epub 2012/10/17. pmid:23071848; PubMed Central PMCID: PMC3461021.
  44. Shi M, Chen MS, Sekar K, Tan CK, Ooi LL, Hui KM. A blood-based three-gene signature for the non-invasive detection of early human hepatocellular carcinoma. Eur J Cancer. 2014;50(5):928–36. Epub 2013/12/18. pmid:24332572.
  45. Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39(4):782–95. Epub 2013/10/22. pmid:24138885.
  46. Tsalik EL, Langley RJ, Dinwiddie DL, Miller NA, Yoo B, van Velkinburgh JC, et al. An integrated transcriptome and expressed variant analysis of sepsis survival and death. Genome Med. 2014;6(11):111. Epub 2014/12/30. pmid:25538794; PubMed Central PMCID: PMC4274761.
  47. Grammatikos AP, Kyttaris VC, Kis-Toth K, Fitzgerald LM, Devlin A, Finnell MD, et al. A T cell gene expression panel for the diagnosis and monitoring of disease activity in patients with systemic lupus erythematosus. Clin Immunol. 2014;150(2):192–200. Epub 2014/01/18. pmid:24434273; PubMed Central PMCID: PMC3932542.
  48. Kennedy WP, Maciuca R, Wolslegel K, Tew W, Abbas AR, Chaivorapol C, et al. Association of the interferon signature metric with serological disease manifestations but not global activity scores in multiple cohorts of patients with SLE. Lupus Sci Med. 2015;2(1):e000080. Epub 2015/04/11. pmid:25861459; PubMed Central PMCID: PMC4379884.
  49. Absher DM, Li X, Waite LL, Gibson A, Roberts K, Edberg J, et al. Genome-wide DNA methylation analysis of systemic lupus erythematosus reveals persistent hypomethylation of interferon genes and compositional changes to CD4+ T-cell populations. PLoS Genet. 2013;9(8):e1003678. Epub 2013/08/21. pmid:23950730; PubMed Central PMCID: PMC3738443.
  50. Hedrich CM, Rauen T, Apostolidis SA, Grammatikos AP, Rodriguez Rodriguez N, Ioannidis C, et al. Stat3 promotes IL-10 expression in lupus T cells through trans-activation and chromatin remodeling. Proc Natl Acad Sci U S A. 2014;111(37):13457–62. Epub 2014/09/05. pmid:25187566; PubMed Central PMCID: PMC4169908.
  51. Coit P, Renauer P, Jeffries MA, Merrill JT, McCune WJ, Maksimowicz-McKinnon K, et al. Renal involvement in lupus is characterized by unique DNA methylation changes in naive CD4+ T cells. J Autoimmun. 2015. Epub 2015/05/26. pmid:26005050.
  52. Anderson AE, Pratt AG, Sedhom MA, Doran JP, Routledge C, Hargreaves B, et al. IL-6-driven STAT signalling in circulating CD4+ lymphocytes is a marker for early anticitrullinated peptide antibody-negative rheumatoid arthritis. Ann Rheum Dis. 2015. Epub 2015/02/05. pmid:25649145.