RNA-seq Analysis Reveals Unique Transcriptome Signatures in Systemic Lupus Erythematosus Patients with Distinct Autoantibody Specificities

Systemic lupus erythematosus (SLE) patients exhibit immense heterogeneity which is challenging from the diagnostic perspective. Emerging high throughput sequencing technologies have been proved to be a useful platform to understand the complex and dynamic disease processes. SLE patients categorised based on autoantibody specificities are reported to have differential immuno-regulatory mechanisms. Therefore, we performed RNA-seq analysis to identify transcriptomics of SLE patients with distinguished autoantibody specificities. The SLE patients were segregated into three subsets based on the type of autoantibodies present in their sera (anti-dsDNA+ group with anti-dsDNA autoantibody alone; anti-ENA+ group having autoantibodies against extractable nuclear antigens (ENA) only, and anti-dsDNA+ENA+ group having autoantibodies to both dsDNA and ENA). Global transcriptome profiling for each SLE patients subsets was performed using Illumina® Hiseq-2000 platform. The biological relevance of dysregulated transcripts in each SLE subsets was assessed by ingenuity pathway analysis (IPA) software. We observed that dysregulation in the transcriptome expression pattern was clearly distinct in each SLE patients subsets. IPA analysis of transcripts uniquely expressed in different SLE groups revealed specific biological pathways to be affected in each SLE subsets. Multiple cytokine signaling pathways were specifically dysregulated in anti-dsDNA+ patients whereas Interferon signaling was predominantly dysregulated in anti-ENA+ patients. In anti-dsDNA+ENA+ patients regulation of actin based motility by Rho pathway was significantly affected. The granulocyte gene signature was a common feature to all SLE subsets; however, anti-dsDNA+ group showed relatively predominant expression of these genes. Dysregulation of Plasma cell related transcripts were higher in anti-dsDNA+ and anti-ENA+ patients as compared to anti-dsDNA+ ENA+. Association of specific canonical pathways with the uniquely expressed transcripts in each SLE subgroup indicates that specific immunological disease mechanisms are operative in distinct SLE patients’ subsets. This ‘sub-grouping’ approach could further be useful for clinical evaluation of SLE patients and devising targeted therapeutics.


Introduction
Systemic lupus erythematosus (SLE) is a complex autoimmune disease with diverse presentations of clinical manifestations [1] and wide range of autoantibodies [2]. The major classes of the autoantibody population are targeted against either dsDNA or RNA associated proteins (also known as extractable nuclear antigens, ENA) like Sm, RNP, SS-A, SS-B etc. The tremendous heterogeneity complicates the exact underlying disease mechanisms in SLE that are least understood. Although DNA microarray based studies did contribute to the understanding of SLE [3][4][5][6][7] however, the information was limited to the level of gene expression, SNPs etc. Recent emergence of deep sequencing technology has added newer dimensions in unraveling the disease specific events. These tools have allowed identification of novel transcripts, alternative splicing events and information on non coding RNAs (ncRNAs) associated with SLE [8][9][10]. In addition, this approach was further employed for identifying rare or novel deleterious variants as genetic causes of SLE [11,12].
Targeting heterogeneity in SLE is the most challenging aspect of the disease, the underlying cause of which if resolved could pave way for development of personalized or precise treatment in SLE. We in our lab have tried to address heterogeneity in autoantibody responses by studying SLE patients in different groups segregated based on distinct autoantibody specificities. Earlier with this approach we have been able to document that differential expression of toll like receptors (TLRs) [13], heat shock proteins (HSPs) [14], miRNA based genetic regulation [15] and divergent sources of autoantigens [16] exist in different autoantibody subsets of SLE patients.
The deep sequencing approaches used so far in earlier studies have identified gene expression profile in unsegregated SLE patients. We hypothesised based on our previous results that the transcriptomics in different subsets of SLE patient grouped on the basis of autoantibody specificities would be differential and could reveal discrete biological pathways operative in the distinctive subsets. Therefore, in this study, as described earlier [13][14][15][16], the SLE patients were characterized into different subsets based on their autoantibody profiles (subset I: anti-dsDNA + or subset II: anti-ENA + or subset III: anti-dsDNA + ENA + ) to identify novel expression patterns of transcripts that would otherwise be missed when studying SLE patients as one common group.
We identified various categories of transcripts like coding, non-coding, antisense, pseudogenes etc. and immunoglobulin (Ig) transcripts using RNA-seq technology that express differentially in SLE patients' subsets. The expression of coding RNA and Ig varies significantly between different subsets of SLE patients though no significant difference was observed for ncRNAs among them. Further, using Ingenuity pathway analysis (IPA) tool we have identified multiple cytokine signaling pathways to be dysregulated in anti-dsDNA + patients whereas Interferon (IFN) signaling was dysregulated in anti-ENA + patients. IPA analysis of transcripts dysregulated in subset with both anti-dsDNA and anti-ENA autoantibodies shows regulation of actin based motility by Rho signaling pathway to be most affected. In addition, different transcripts of same genes were observed to be expressed differentially in each SLE subset. Granulocyte signature genes though present in all SLE subsets had unique distribution of specific transcripts in different subsets. Plasma cell (PC) signature transcripts including the hyper mutated Ig gene transcripts were observed to be differentially distributed in each SLE patient subsets.
Taken together, the identification of distinguishing genetic patterns, transcripts and subset specific events is suggestive of distinct disease driven processes in serologically defined SLE EDTA, pH 7.4) at room temperature (RT) and erythrocytes were removed by washing with phosphate buffered saline (PBS). Leukocytes thus obtained lysed in TRI reagent (Sigma-Aldrich, St Louis, MO) for RNA isolation as per the manufacturer's protocol. Briefly, lysate mixed with the chloroform, centrifuged and the aqueous layer was separated and collected in a fresh tube. Further, RNA in the aqueous layer is precipitated by isopropanol followed by washing with 70% ethanol. RNA samples were treated with DNase to remove the genomic DNA contamination. Their quality was assessed using 2100 Bioanalyzer (Agilent Technologies). Samples with RNA integrity number (RIN) >7 were processed for library preparation and RNA-sequencing.  Table 1. cDNA library was prepared using Illumina1 TruSeq1 RNA sample preparation kit as per manufacturer's instruction. In brief, mRNA molecules were purified using oligo-dT attached magnetic beads and fragmented using divalent cations under elevated temperature. The first strand cDNA strand was synthesized from cleaved RNA fragments using reverse transcriptsase and random primers followed by second strand cDNA synthesis using DNA Polymerase I. RNase H was used to specifically digest the template mRNA. After the end repair process and adenylation of 3 0 end tailing, adapters were ligated to the cDNA. Samples were then purified and enriched with PCR to create the final cDNA library. Quality of the cDNA libraries were accessed using Agilent Technologies 2100 Bioanalyzer. The libraries were hybridized to the flow cell and cluster was generated by bridge amplification. Paired end sequencing was performed using Illumina1 Hiseq-2000 platform.

Bioinformatics analysis
Output files in fastq format were processed for pre-alignment QC. On an average,~84% of the total reads of all samples passed ! 30 Phred score. Low quality base were trimmed from the reads. Further refinements for the removal of the unwanted sequences including mitochondrial genome sequence, ribosomal RNAs, transfer RNAs, adapter sequences and others were done using bowtie2 (version 2.1.0), tool. The pre-processed reads were aligned to the reference human genome and gene model downloaded from Ensembl database (http://www.ensembl.org/info/ data/ftp/index.html) using Tophat program (version 2.0.8) with default parameters. Reads uniquely mapped were considered for further analysis (S1 Table). Cufflink program (version 2.0.2) was used to determine differentially expressed transcripts and genes. To check the reliability and the comparability of differential expression analysis, the transcripts/ genes with FPKM ! 1 in all individually sequenced patients and controls were examined. Correlation analysis of differentially expressed transcripts/ genes among the biological replicates was also performed to rule out the possibility of variation among the samples in the group (S1 Fig). A difference of at-least two fold in the transcripts/ genes expression between different subgroups and control were considered for further analysis. Student's t-test and Benjamini-Hochberg false discovery rate tests were performed for each of the differentially expressed transcripts across the biological replicates. Further, we used DESeq program, another tool to analyze the differentially expressed genes (DEGs) and to compare the results of Cufflink analysis. DESeq is the count based matrices that identifies DEGs only whereas, Cufflink adopts an algorithm that controls cross-replicate variability and read-mapping ambiguity by using a model for fragment counts (FPKM) based on a beta negative binomial distribution that identifies both differentially expressed transcripts and genes. We compared the DEGs results from Cufflink and DESeq analyses, and took the intersection of them for downstream pathway analysis. The datasets from this study have been deposited in the Gene Expression Omnibus repository (GEO series accession number: GSE80183).

Principal component analysis and Functional analysis
Unsupervised analysis using, principal component analysis (PCA) and hierarchical clustering was performed to visualize the similarities and the distinction between samples belonging to different SLE subsets. Principal component analysis is a mathematical algorithm that extracts important variables (in form of components PC1 and PC2) from a large set of variables available in a data set. The transcripts which showed median FPKM ! 1 in all patient samples were used to generate the PCA plot. Further, extensive analysis was performed to identify the relevant bio-functions and the pathways associated with differentially expressed gene transcripts in different subsets of SLE patients' by using ingenuity pathway analysis (IPA) software (Ingenuity Systems, Redwood City, CA). Differentially expressed (upregulated or downregulated) gene transcripts in each SLE patients' subsets that shows a minimum of two fold change as compared to that of healthy individuals were imported to IPA for the analysis. IPA generates the pathway utilizing the genes from our dataset referred as 'focus gene' and genes stored in ingenuity knowledge database (based on the functional annotation and experimental observation). It also computes a p value for each pathway which indicates the likelihood of the association between focus genes and canonical pathway is not random. A cut-off of p value was set at less than 0.05 (or score > 1.3 score = -lop P) and was calculated using Fisher's exact test to define the significance of the pathways associated with our dataset. Moreover, IPA analysis of overall dysregulated transcripts (upregulated and downregulated; both together) were analysed for the prediction of activated or inhibited canonical pathway based on z-score. IPA automatically calculates the z-score based on differentially expressed genes/ transcripts in our dataset and the information stored in IPA knowledge database. Positive z-score suggests the activation, whereas negative z-score indicates inactivation of the pathway. The pathway with the highest scores and focus molecules were identified by IPA analysis and displayed graphically as a collection of nodes (gene transcripts) and edges (the biological relationships between the nodes). The node color indicates the up-regulation (orange) or downregulation (green) of gene transcripts. In addition to IPA analysis, other gene enrichment approaches were used for the functional characterization of differentially expressed transcripts in distinct SLE subgroups which includes Gene Set Enrichment Analysis (GSEA) and Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Database.

Real-time reverse transcription-polymerase chain reaction (RT-PCR)
The uniquely expressed transcripts identified by RNA-seq analysis in distinct SLE patients' subsets were further validated using real time PCR. A total of twenty three SLE samples and eight control samples were used for the validation experiments that also includes the samples used for RNA-seq. The patient samples used for qPCR were specifically marked in Table 1.
The qPCR was performed using TaqMan1 assays (Applied Biosystems, Foster City, CA). CCL20, CCNA1, EPHB2 and ELANE transcripts were selected for validation based on their expression in specific SLE subgroup. CCL20 expressed uniquely in anti-dsDNA + , CCNA1 in anti-ENA + , EPHB2 in anti-dsDNA + ENA + and ELANE in all SLE patient subsets. GAPDH was used as internal control. cDNA was synthesized using High capacity reverse transcription kit (Applied biosystems, Foster City, CA) following manufacturer's instruction. Briefly, prior to reverse transcription reaction, 2 μg of RNA was subjected to DNase (NEB) treatment to remove the contaminating genomic DNA. Further, reaction mixture comprising DNase treated RNA, RT Buffer (10×), Random Primer (10×), dNTP mix (25×), Reverse transcriptase enzyme, and RNase Inhibitor (10 U/μl) were incubated at 25˚C for 10 min followed by incubation at 37˚C for 120 min. Finally, enzyme is inactivated by incubation at 85˚C for 5 min.
Whole reaction was carried out in PCR Thermocycler (Applied biosystems, Foster City, CA). A TaqMan1 assay for each gene was used for performing quantitative real time PCR as per manufacturer's instruction. PCR reactions were carried out on Applied Biosystems 7500 realtime PCR system, using 2 × TaqMan1 universal PCR master mixes (Applied Biosystems, Foster City, CA). The comparative Ct method was used to interpret the data as described by Livak and Schmittgen [18]. Relative expression of each gene among SLE patients and healthy individual was determined using formula, Fold change = 2 -ΔΔCt , where ΔCt = Ct(Gene transcript) − Ct(GAPDH) and ΔΔCt = ΔCt(SLE patient) − ΔCt(Control).

Statistical Analysis
The statistical analysis of the data was performed using GraphPadprism software v.5.0 (Graph-Pad Software, San Diego, CA). Statistically significant difference among two or more groups was identified using Kruskal-Wallis H test. The comparison of CCL20, CCNA1, EPHB2 and ELANE gene transcripts expression in each SLE subsets to controls was performed using the nonparametric Mann Whitney test. P-value less than 0.05 were considered as statistically significant.

Principal Component Analysis
The aim of this study was to identify transcriptomic signature in different subsets of SLE patients categorized on the basis of autoantibody profile; Subset I: anti-dsDNA + (patients possessing autoantibody against dsDNA); Subset II: anti-ENA + (patients possessing autoantibody against ENA) and Subset III: anti-dsDNA + ENA + (patients possessing autoantibodies both against dsDNA and ENA). Initially, we performed an unsupervised principal component analysis to identify subset specific phenotypes that is more likely to be represented as a function of all transcripts rather than the separate expression values on PCA plot. The PCA analysis was conducted using the transcripts that showed FPKM!1. Using RNA-seq we identified that anti-dsDNA + patients expressed 36464 transcripts with FPKM!1 whereas anti-ENA + and anti-dsDNA + ENA + patients expressed 34689 and 33703 transcripts with FPKM!1 respectively. The plot represents a total of 33254 transcripts on PCA plot. The PCA analysis represented individual samples on two principal components ( Fig  1A). PCA reveals that samples of anti-dsDNA + subgroup were spatially separated from the anti-ENA + patient samples. Majority of the samples belonging to anti-dsDNA + and anti-ENA + subgroup clustered in their respective class. Anti-dsDNA + ENA + patient samples were observed to lie either close to anti-dsDNA + patients or anti-ENA + patients. However, one sample from each subgroup exhibit variation from their respective SLE subgroup. Similar results were observed with hierarchical cluster analysis (HCA) which represents similarity and distinction among the samples. The Dendrogram derived from HCA is based on the Euclidean distance between datasets in the space of the first two PCs which is represented as the height of the branches (Fig 1B).

Transcriptome Characterization
After we observed the separation of anti-dsDNA + and anti-ENA + SLE subgroups using unsupervised cluster analysis, it was further interesting to identify the distribution of different class   Table).We observed that the percentage of dysregulated protein-coding transcripts was not uniform in different subsets of SLE patients, with minimal in anti-dsDNA + ENA + subgroups (25%) and maximal in anti-dsDNA + subgroup (45%) (Fig 2). SLE patients represent a diverse array of autoantibodies against self-antigens so we also evaluated the expression of Ig genes in different SLE patients' subsets. We observed striking difference in the Ig gene transcripts among distinct subsets. Highest percentage (60%) of Ig gene transcripts was observed to be expressed in anti-dsDNA + patients followed by anti-ENA + patients (37%) (Fig 2). There was strikingly reduced number of Ig gene transcripts (3%) in anti-dsDNA + ENA + SLE patients ( Fig  2). It has been recently identified that ncRNAs are key regulatory molecules for the post-transcriptional modulation of genes [19] and other transcripts, we observed equivalent percentage of ncRNAs and other transcripts expressed in each SLE patients' subsets (range 31-35% and 28-37% respectively) (Fig 2). Furthermore, different classes of ncRNA species like lincRNA, miRNA snRNA, snoRNA, misc RNA were observed to be dysregulated in all subset of SLE patients (S2 Fig), with no significant difference in their expression among different SLE subsets. reduced expression were observed in all subsets of SLE patients (Fig 3A and 3B). Heat map representing the differentially expressed transcripts in each SLE subsets as compared to controls has been provided in supplementary information (S3 Fig). Further, the expression of CCL20 CXCL3, CCNA1, OPLAH, EPHB2 and IFNG transcripts that were observed to be upregulated in anti-dsDNA + / anti-ENA + / anti-dsDNA + ENA + patients, as identified by RNAseq analysis were graphically plotted to check for the variability among the individual samples and subgroups (S4A- S4F Fig).

Pathway Enrichment Analysis
The biological relevance and the functional characteristic of the uniquely expressed transcripts associated with distinct SLE patients' subsets were identified by IPA software analysis. IPA analysis of upregulated and downregulated datasets revealed unique canonical pathways in each SLE subgroups ( Table 2). After analyzing the up-and down-regulated transcripts independently and selecting the top affected pathways based on p-value we analyzed these transcripts together to study their effect on particular canonical pathways based on z-scores. An important application of the IPA software is that along with the identification of affected canonical pathway it can predict whether a canonical pathway is activated or inactivated based on the z-score algorithm ( Table 3). These canonical pathways have been discussed in detail in next section. Functional analysis by GSEA and DAVID pathway revealed similar pathways as identified by IPA which further confirms the pathway signatures specific to each SLE subsets. S3 Table lists the result of the analysis by GSEA and DAVID, including GO term and KEGG pathways.

Pathways associated with uniquely expressed transcripts in distinct SLE subsets
Top signaling pathways affected in anti-dsDNA + Subset. Pattern recognition receptor (PRR) signaling, Liver X receptor/ Retinoid X receptor (LXR/RXR) activation and Growth arrest and DNA damage-inducible 45(GADD45) signaling pathways were associated with upregulated transcripts in anti-dsDNA + patients The upregulated transcripts (471 uniquely expressed; A1 in Fig 3A) analyzed by IPA revealed activation of various canonical pathways ( Table 2). The top three canonical pathways were PRR in recognition of Bacteria and Viruses, LXR/RXR activation and GADD45 signaling. In this study, we observed up-regulation of NALPs (cytoplasmic PRR), PKR, extracellular PRRs (C1, C3, PTX3), IRF7, IL-6, IL-10 transcripts which have essential role in evoking inflammatory response and are involved in PRR signaling in recognition of Bacteria and Viruses (S5 Fig). The dysregulated LXR/RXR pathway was found to be associated with overexpression of CD36, FASN, IL-1A, IL-6, OLR1 and SCD1 transcripts. In addition, we also observed the elevated expression of GADD45A, GADD45G, CCND3 and CCNE1 transcripts in anti-dsDNA + patients that are involved in GADD45 signaling pathway.
Nur77 Signaling in T lymphocytes, Role of Nuclear factor activated T-cells (NFAT) in regulation of immune response and Neurotrophin/ TRK signaling pathways were associated with the downregulated transcripts in anti-dsDNA + patients IPA analysis of downregulated set of 244 transcripts (B1 in Fig 3B) revealed Nur77 signaling in T lymphocytes, Role of NFAT in regulation of immune response and Neurotrophin/ TRK signaling as top affected canonical pathways (Table 2) Table 3). The expression of CCL20 and CXCL3 transcripts that were observed as nodes in interactive pathways of anti-dsDNA + patients derived by IPA tool (Fig 4) were graphically represented for the comparative analysis among individual SLE patient sample of specific subgroup (S4A and S4B  Fig).
Top signaling pathways affected in anti-ENA + Subset. Complement system signaling, Interferon signaling and Role of PKR in interferon induction and antiviral response pathways were the top affected pathways associated with the upregulated transcripts in anti-ENA + patients Among the 399 uniquely upregulated transcripts (A2 in Fig 3A) we observed transcripts of CD46, CD55, CD59 genes elevated in anti-ENA + subsets which are complement regulatory molecules. These molecules together with upregulated ITGB2 and ITGAX in anti-ENA + subsets are involved in complement signaling and regulation (S7 Fig). Furthermore, antiviral response related signaling like, Interferon signaling and Role of PKR in interferon induction and antiviral response were implicated in anti-ENA + patients with the upregulation of IFITM2, IRF1, OAS1, MX1, p53, MAP2K6, MAKK3/6 and TNFRSF1A transcripts in these pathways ( Table 2).
Actin cytoskeleton signaling, metabolic pathways and Integrin linked kinase (IK) signaling pathways were the most affected pathways associated with downregulated transcripts in anti-ENA + patients In contrast to 399 upregulated transcripts, only 142 transcripts were downregulated (B2 in Fig 3B) in anti-ENA + dataset as compared to controls. The top most impacted pathway was actin cytoskeleton signaling which plays an important role in cell dynamic processes like motility, cytokinesis and phagocytosis (S8 Fig). Several transcripts like c-SRC, BAIAP2, FLNA, MYL6B, PIR121were downregulated in this pathway. Other transcripts like ACLY, SMHT2 and HAGH that are involved in metabolic pathways like Acetyl CoA biosynthesis III, Glycine Biosynthesis I and Methylglyoxal degradation I pathways, respectively were also seen to have reduced expression. Further, reduced expression of TNFRSF1A, FLNA, MYL6B and COX2 was observed which has potential role in IK signaling pathway was also observed to be affected ( Table 2).
Interferon signaling and p53 signaling pathways were associated with overall dysregulated transcripts in anti-ENA + subsets Combined IPA analysis of both sets of dysregulated transcripts (399 up, A2 + 142 down,B2 in Fig 3A and 3B respectively) revealed Interferon signaling and p53 signaling as the top most affected pathway in anti-ENA + patients subset based on z-scores (Fig 5, Table 3). The dysregulated interferon signaling involves downregulated BAX and IFNAR1 transcripts and upregulated IFITM2, IRF1, OAS1and MX1 transcripts specifically. Reduced expression of Bax, PMAIP1 and PRKDC transcripts and elevated expression of BBC3, MDM4, PML and p53 transcripts were associated with dysregulated p53 signaling. Further, the expression of CCNA1 and OPLAH, transcripts that were observed to be upregulated in anti-ENA + patients and represented as node in its interactive pathway derived by IPA tool (Fig 5) were plotted to compare the variation among individual samples of specific SLE subgroups (S4C and S4D Fig).
Top signaling pathways affected in anti-dsDNA + ENA + Subset. Antigen presentation pathway, CTLA4 Signaling in Cytotoxic T-lymphocyte and Cross talk between DC and NK cells are the top affected pathway associated with upregulated transcripts in anti-dsDNA + ENA + SLE subset in anti-dsDNA + ENA + patients In anti-dsDNA + ENA + patients (A3 in Fig 3A) antigen presentation pathway, CTLA4 Signaling in Cytotoxic T-lymphocyte and Cross talk between DC and NK cells were observed to be affected ( Table 2). We in this study observed the transcripts of IFNγ, NLRC5,Class I MHC, CNX (calnexin) genes to be overexpressed in anti-dsDNA + ENA + patients that are associated with the processing of typically intracellular or viral proteins eventually presented to CD + 8 T cells (S9 Fig). Further, in the CTLA4 Signaling in Cytotoxic T-lymphocyte pathway clathrin adaptor complex like AP2A1 and AP2M1 controls the T-cell activation by endocytosisng CTLA4 molecule followed by its lysosomal degradation. PP2A, MHC-I and GRB2 molecules are the negative signaling proteins involved that interfere with T-cell activation. We observed the upregulation of IFN-γ, F-actin, MHC-I and PVRL2 transcripts which contribute in the crosstalk between NK cells and DCs and have an essential role in immune cell expansion and refinement of the immune response.
CDK5 signaling, Cardiac β-adrenergic signaling and Mitochondrial dysfunction are the top affected pathways in anti-dsDNA + ENA + SLE patients associated with downregulated transcripts dataset in anti-dsDNA + ENA + patients IPA analysis of 131 downregulated transcripts (B3 in Fig 3B) demonstrated CDK5 (Cyclindependent kinases) signaling, Cardiac β-adrenergic signaling and Mitochondrial dysfunction as the most affected pathways ( Table 2). Impairment in the CDK5 signaling pathway was observed due to the reduced expression of ADCY, PPA1 and FOSB which together may hamper neuronal development, synaptic vesicle trafficking and neurotransmitter release (S10 Fig). Further, ADCY, PPA1 and AKAP also compromise the cardiac contractibility by interfering in the Ca +2 ion channels which is crucial for myofilament contraction and relaxation. Most importantly, mitochondrial dysfunction was observed as a result of reduced expression of NDUF1 of complex I, COX5B, COX7A2 of complex IV and SNCA transcripts in anti-dsDNA + ENA + subgroups of SLE patients' specifically.
Regulation of actin based motility by Rho, Vascular endothelial growth factor (VEGF) signaling and Integrin signaling pathways were associated with overall dysregulated transcripts in anti-dsDNA + ENA + subset The overall dysregulated transcripts (200 up, A3 + 131 down, B3 in Fig 3A and 3B respectively) upon analyzing with IPA revealed Regulation of actin based motility by Rho, VEGF signaling and Integrin signaling as the most affected signaling pathways based on z-score (Fig 6,  Table 3). The top most pathway Regulation of actin based motility by Rho is associated with dynamic organization of the actin cytoskeleton which provides the force for cell motility and is regulated by small GTPases of the Rho family, in particular Rac1, RhoA and CDC42. They specifically activate several downstream effectors, like G-actin, ARP 2/3, GDIA, which were observed to be upregulated in our study, whereas MLCP was downregulated. Another pathway, VEGF signaling observed to be affected in anti-dsDNA + ENA + SLE subsets has a fundamental role in growth and differentiation of vascular and lymphatic endothelial cell. We observed elevated expressions of HIF-1α which plays a significant role in inducing transcription of the VEGF gene under hypoxic conditions and it is also responsible for induction of angiogenesis in pathological situations like diabetic retinopathy, tumor angiogenesis and coronary artery disease. Other molecules of this signaling pathway dysregulated in our dataset were α-actinin and actin which has a role in cell migration; GRB2 has a role in cell proliferation. Further, over expression of α-actinin, actin, ARP2/3, GRB2 transcripts and downregulation of MLCP were observed to result into dysregulation of Integrin signaling pathway. This pathway has a crucial role in cytoskeleton remodelling and also triggers the activation of mitogen activated protein kinase (MAPK) pathways. Furthermore, expression of EPHB2 and IFNG transcripts in each SLE patient samples were represented graphically to compare the variation among SLE patient sample of specific subgroup (S4E and S4F Fig). These were the differentially expressed transcripts that were observed as nodes in interactive pathway by IPA in anti-dsDNA + ENA + (Fig 6).

Pathways associated with transcripts dysregulated in both anti-dsDNA + and anti-ENA + SLE subgroups or in all SLE patients' subsets
After analyzing the uniquely expressed transcripts in distinct SLE patient subsets, it was of interest to analyze the common set of transcripts dysregulated in distinct subsets. A total of 211 transcripts (161 up, A4 + 50 down, B4 in Fig 3A and 3B respectively) were dysregulated commonly in anti-dsDNA + and anti-ENA + SLE patients' subgroups. The top affected canonical pathway included Oncostatin M signaling, pancreatic adenocarcinoma signaling and VEGF signaling. Further, the 211 transcripts commonly dysregulated (178 up, A5 + 33 down, B5 in Fig  3A and 3B respectively) in all the three groups were also analysed using IPA which revealed dysregulation of Citrulline biosynthesis pathway, Phagosome maturation pathway and IL-8 signaling pathway (Table 4).
The dysregulated transcripts belonging to the granulocyte signature genes included transcripts of CSTG, DEFA3, DEFA4, ELANE, LTF, MPO, MMP8, LCN2 and CSTD genes. They too expressed differentially in each subsets with predominant expression (higher fold change) in anti-dsDNA + sub-group and minimal in anti-ENA + sub-group (Fig 8, S5 Table).

Plasma cell signature transcripts and distribution of Ig gene transcripts in distinct SLE patients' subsets
SLE is generally characterized by abnormalities in B cell activation which include increased number of circulating Plasma Cells [20] and its frequency has been associated with the production of autoantibodies [21]. Therefore, it was interesting to identify the PC related transcripts and Ig gene transcript distribution in SLE patients segregated on the basis of autoantibody specificities. There are series of molecular events that occurs in mature PCs (Ig secreting long lived cells) which differentiates it from the naïve B cells and plasmablast cells (Ig secreting short lived and proliferative cells). The PCs are characterised by the specific phenotype markers and known expression pattern of cell cycle arrest gene, transcription factors, unfolded protein response (ER stress) and highly mutated immunoglobulin genes [22,23]. We observed upregulation of PC marker gene transcripts to be mostly associated with anti-dsDNA + and anti-ENA + patients. It mainly includes CD27, CD38, CD43, CD138, GP130 in anti-dsDNA + subgroups and CD27, CD38, CD43, CD44, GP130 in anti-ENA + subgroup whereas, anti-dsDNA + ENA + patient show expression of CD38 and CD43 only (Table 5)  HLA-DMB, HLA-DMA and PCNA either in one or more subset of SLE patients ( Table 5). Dysregulation of these transcripts were mainly observed in anti-dsDNA + and anti-ENA + patients while comparatively low number of PC related transcripts were expressed in patients with anti-dsDNA + ENA + . Further, as we mentioned previously the distribution of Ig gene transcripts varies significantly among different subsets of SLE patients (Fig 2). We observed elevated expression of variable regions of heavy and light chain and constant heavy chain region of Ig gene transcripts in all SLE patient subsets with highest distribution in anti-dsDNA + patients than in anti-ENA + patients and lowest in anti-dsDNA + ENA + patients (Table 6) which is in concordance with the variation of PC related transcripts in each SLE subsets.

Validation by TaqMan real time RT-PCR
RNA-seq was performed for a total 16 samples, 4 in each SLE subgroups including healthy individuals. After analysis, four uniquely dysregulated transcripts were selected from each SLE subgroups, CCL20 (upregulated in anti-dsDNA + ), CCNA1 (upregulated in anti-ENA + ), EPHB2 (upregulated in anti-dsDNA + ENA + ) and ELANE (upregulated in all subsets) for further validation by TaqMan real time PCR. Further, more samples were included in each SLE subgroups for the validation experiments. We observed that the results obtained by TaqMan real time PCR were in concordance to those observed by RNA-seq experiments (Fig 9). CCL20 was observed to be significantly overexpressed in anti-dsDNA + patients (p value 0.009) ( Fig  9A). CCNA1 expression was specifically elevated in anti-ENA + patients (p value 0.001) ( Fig  9B). EPHB2 expression was observed to be significantly overexpressed in anti-dsDNA + ENA + patients (p value 0.01) (Fig 9C) and ELANE was significantly overexpressed in all patient subsets compared to controls (in anti-dsDNA + patients p value 0.001; anti-ENA + patients p value 0.02 and anti-dsDNA + ENA + patients p value 0.01) but had higher expression in patients with anti-dsDNA autoantibody (Fig 9D).

Differentially expressed genes identified by Cufflink and DESeq analysis
Differentially expressed (upregulated or downregulated) genes obtained from Cufflink and DESeq analysis tools in each SLE patients' subsets that show a minimum of two fold changes as compared to that of healthy individuals were used for further analysis. The number of differentially expressed genes derived from DESeq analysis was greater in anti-dsDNA + and anti-dsDNA + ENA + subgroups as compared to that of Cufflink tool (Fig 10A and 10B). However, DEGs derived from cufflink analysis was comparatively higher than that of DESeq in anti-ENA + patients (Fig 10C). A total of 169, 40 and 32 genes were commonly observed with Cufflink and DESeq analyses in anti-dsDNA + , anti-ENA + and anti-dsDNA + ENA + patients, respectively (Fig 10A-10C) (S6 Table). Further, we identified that 131 genes were uniquely expressed in anti-dsDNA + patients (A1 in Fig 11), 16 and 7 unique genes expressed in anti-ENA + and anti-dsDNA + ENA + patients respectively (A2 and A3 in Fig 11). Pathway analysis for differentially expressed genes. IPA analysis of the uniquely expressed genes in anti-dsDNA + patients (A1 in Fig 11)   also observed with David pathway analysis (p value 6.4E-04). In anti-ENA + and anti-dsDNA + ENA + subsets, pathway information could not be generated by IPA and David owing to small number of dysregulated genes in these subsets (A2 and A3 in Fig 11).

Discussion
SLE patients are known to exhibit tremendous heterogeneity in clinical presentations. In our earlier studies we have reported that miRNA based immuno-regulatory mechanisms [15], TLR-7 and -9 expressions [13], small HSP involvement [14] and sources of autoantigen pool [16] differentially prevail in different SLE patients' subsets with distinct autoantibody specificities. These findings clearly points towards the interesting observation that the SLE subset specific disease events could often be missed if studied as a single group. The result of the present study design further builds upon the same concept by clearly indicating that a differential transcriptome profile (genes and transcripts) exists for different groups of SLE patients categorized on the basis of distinct autoantibody specificities. The gene transcripts are the mRNAs that are generated from the same locus either by alternative splicing or alternative promoter usage of a gene. The regulation by different gene transcripts is a critical part of disease processes which is not much explored in SLE. The differentially expressed transcripts among different SLE subsets (identified by Cufflink analysis of RNA-seq data), as observed in this study would have a crucial impact on various biological processes that may result into phenotypic differences among different SLE patients.
An unbiased approach for the transcriptome analysis in SLE patients indicated that patients with anti-dsDNA + and anti-ENA + autoantibodies have specific clinical phenotype that sets them apart. Anti-dsDNA + ENA + patients show some similarity with other group of patients which could be due to the presence of autoantibodies against both dsDNA and ENA, which may contribute towards shared phenotype. Though, some samples lie apart from their respective group that could be due associated clinical manifestation in that particular patient. Extensive analysis of distinct SLE sub-groups revealed unique genetic patterns in each subset (SLE patient with anti-dsDNA autoantibody or anti-ENA autoantibody or patient with both autoantibodies). Further, use of IPA analysis predicted the functional relevance of distinctly expressed transcripts which were found to be associated with unique immunological pathways in each SLE subsets. The top most affected pathway in anti-dsDNA + sub-group was role of pattern recognition receptors in bacteria and viruses suggesting that the innate immune system which has an important role in the production of pro-inflammatory cytokines are preferentially dysregulated in this subgroup of SLE patients. Interestingly, multiple cytokine signaling pathways were also observed to be dysregulated in anti-dsDNA + patients. Cytokine imbalance is very well known to be implicated in SLE pathogenesis [24] and the levels of various cytokines have been demonstrated to correlate with anti-dsDNA levels in SLE patients [25,26]. Further, the dysregulation of cytokine signaling specifically in this subset of SLE patients could also be due to miRNA mediated regulation as reported previously [15]. LXR/RXR pathway found to be affected in anti-dsDNA + patients is involved in lipid metabolism and inflammation. Recently, a study reported that perturbation in lipid homeostasis in SLE patients affects the functioning of T cells [27]. Other affected pathways are Nur77 signaling in T lymphocytes and Role of NFAT in regulation of immune response. Both the pathways are central to T cell signaling which is considered to have an important role in SLE pathogenesis [28]. Nur77 which is downregulated in anti-dsDNA + patients is crucial for TCR-mediated thymocyte apoptosis for the elimination of self-reactive TCRs [29]. Persistence of the autoreactive T-lymphocytes further leads to the activation of B-cells and autoantibody generation contributing to the pathogenesis of lupus. Enhanced calcium-calcineurin NFAT signaling has been reported in SLE patients [30] whereas another study has demonstrated decreased calcineurin expression depending upon the glucocorticosteroid [31]. These drugs are generally used for the treatment of SLE [32]. It is important to note here that the patients in each SLE subsets were on immunosuppressive drugs, the difference observed in the expression pattern of transcripts could be the outcome of specific disease driven process. We observed dysregulation of GADD45 signaling which is known to have a crucial role in DNA damage and replication. Li et al 2010, have reported that overexpression of GADD45, contributes to SLE pathogenesis by promoting demethylation in T cells [33]. Dysregulation of Neurotrophin/ TRK signaling pathway in anti-dsDNA + subgroup as observed in this study have been reported to be associated with T-cell development [34] and neuronal functions [35]. Neurotrophins were reported in neuropsychiatric SLE (NPSLE) patients and its increased expression was associated with improved neuropsychiatric symptoms [36] but no association has been reported in SLE patient with specific serological group.
In contrast to anti-dsDNA + SLE patients, anti-ENA + patients show increased expression of complement regulatory molecule CD46, CD55, CD59 and ITGB2, ITGAX thus affecting complement signaling. Elevated CD46 expression had been previously reported in SLE sera [37] but Alegretti et al., 2012 showed diminished expression of CD46, CD55 and CD59 in peripheral blood of SLE patients with haematological involvement [38]. Furthermore, we observed interferon regulated or inducible transcripts are elevated in patients with anti-ENA autoantibody which thus affect IFN signaling. Environmental triggers like viral infection are reported to induce IFNs which are central to SLE pathogenesis [39,40]. Another pathway, role of PKR in interferon induction and antiviral response, is affected in the same group of SLE patients involving transcripts of IRF1, TNFRSF1A, p53, MAP2K6 and MKK. Apart from the interferon induction, IRF1 also contributes to the dysregulated epigenome that leads to perpetuation of SLE [41]. These MAPK signaling molecules were observed to be hyperactivated and are shared among various networks in lupus [42]. Molecular mimicry between a particular protein of Epstein-Barr virus (a suspected SLE causing agent) and Sm protein (an anti-ENA target) have been widely recognized as an initial trigger in the development of the autoimmunity. This molecular mimicry could attribute to the dysregulation of viral associated pathway specifically in anti-ENA + patients [43,44]. Lipid metabolism, amino acid metabolism and methylglyoxal degradation pathways were found to be associated with anti-ENA + patients. Although these pathways are not reported in SLE so far but several other metabolic pathways like glycolysis, Krebs cycle, fatty acid β oxidation and amino acid metabolism have been reported to be defective in SLE patients [45]. Abnormalities in actin cytoskeleton signaling and IK signaling were observed in anti-ENA + patients in particular. It is speculated that the dysregulation of actin cytoskeleton signaling specifically in anti-ENA + patients results from miRNA mediated regulation as reported by Chauhan et al., 2014 [15]. However, abnormal actin cytoskeleton distribution patterns were reported in bone marrow-derived mesenchymal stem cells of SLE patients compared to healthy controls [46].
In SLE patients with presence of both anti-dsDNA and anti-ENA autoantibodies (anti-dsDNA + ENA + ) it was observed that Antigen cross presentation pathway and effector CTLA4 signaling in T-lymphocytes pathways were significantly affected. Overexpression of IFN-γ and calnexin in our study is suggestive of increased processing and endosomal trafficking of antigens and presentations to the cytotoxic T-lymphocytes that was reported to be involved in the pathogenesis of lupus nephritis [47]. CTLA-4 is a critical gatekeeper of T-cell activation and immunological tolerance and has been implicated in patients with a variety of autoimmune diseases through genetic association. Abnormal CTLA functioning has also been reported in SLE [48]. The transcripts of the gene involved in CTLA-4 signaling like AP2A1, AP2M1, MHC-I, PP2A and GRB2 were observed to be increased in anti-dsDNA + ENA + patients. It is thus possible that the over-activation of CTLA-4 signaling may impact other T-cell signaling pathways and contribute to SLE pathogenesis. Impairment of the NK cell function had been reported in SLE patients [49] which could be due to miRNA mediated dysregulation [50,51]. Aberrant expression of DC cells is also documented in SLE patients [52]. These two cell types together may contribute to the pathogenesis of SLE via cross talk. Further, dysregulation of CDK5 signaling was observed in this SLE subgroup. Jeffries et al., 2011 also reported CDK5 signaling pathway to be affected among hypomethylated genes in CD4+ T cells in lupus [53]. Dysregulation of Clathrin-mediated endocytosis and CDK5 signaling had previously been reported in anti-dsDNA + SLE patients due to miRNA mediated regulation [15]. Mitochondrial dysfunction was observed to be associated with downregulated transcripts in anti-dsDNA + ENA + patients. Mitochondrial dysfunctioning was widely observed in SLE patients which supports our present finding [54,55]. Moreover, G-actin is an important molecule that is primarily involved in the signaling associated with dynamic organization of actin cytoskeleton [56]. Instead, these are also known to inhibit the Deoxyribonuclease-I (DNase-I) activity [57,58] which is an endonuclease responsible for degrading DNA from neutrophil extracellular traps (NETs) [59]. Thus, inhibition of the DNase activity also leads to the pathogenesis of SLE [58].
Earlier reports established that IFN signature and granulocyte signature genes are dysregulated in SLE patients [4,7,60]. It is important to mention here that these studies were conducted on unsegregated SLE patients whereas a recent study in paediatric lupus patients reported the association of IFN signature and neutrophil signature with specific group of patients. IFN signature was associated with disease activity and neutrophil signature was enriched in active lupus nephritis [61]. However, in the present study, large number of differentially expressed transcripts of IFNs gene family have been preferentially identified in anti-ENA + patients and predominant expression of granulocyte signature gene in anti-dsDNA + patients. It was interesting to note that the interferon alpha pathway was observed to be enriched in the anti-ENA + SLE subsets although specific subtypes of ENA autoantibodies (Sm, RNP, SS-A, SS-B, Jo-1 and Scl-60) were not evaluated in this study. High interferon levels had been shown to be associated with elevated level of anti-dsDNA and anti-ENA autoantibodies in SLE patients [62], however Niewold et al, 2005 reported increased levels of anti-SSA autoantibodies and ANA whereas no change in the anti-dsDNA titre was seen after IFN-α treatment in hepatitis C patient, who developed de novo SLE [63]. Another study also reports higher interferon score in SLE patients with ENA autoantibodies whereas no difference was observed in IFN score with variation of anti-dsDNA status [64]. Moreover, majority of neutrophil associated autoantigens that were observed to be enriched in neutrophil extracellular traps (NETs) and are reported to be highly expressed in SLE as compared to other autoimmune diseases such as rheumatoid arthritis, vasculities, multiple sclerosis etc. [65]. The elevated NET formation and their inadequate degradation also contribute to the autoantigen pool in SLE patients [59,66]. In another study in our lab, we observe that the NET degradation and its clearance are significantly compromised in anti-dsDNA + patients whereas it is comparable to healthy individual controls in anti-ENA + patients [16]. These observations strengthen the view that unique pathological events would be associated with each SLE patients' subset with distinct autoantibody specificity. We therefore propose that SLE groups to be studied in segregation so that precise clinical evaluation and treatment pertaining to specific groups should be devised.
Our observation of markedly large number of dysregulated PC related transcripts including Ig gene transcripts in anti-dsDNA + and anti-ENA + patients as compared to anti-dsDNA + ENA + was interesting. Moreover, the frequency of circulating plasma cell had been reported to be associated with anti-dsDNA titres in SLE patients [21]. In contrast, high titres of anti-Sm/RNP and Ro/La autoantibodies had been shown to be associated with long-lived plasma cells whereas plasmablast cells, which are more susceptible to immunosuppresive or targeted B cell therapies, are responsible for the production of anti-dsDNA antibodies [67,68]. It is, therefore, possible that the production of specific autoantibodies against dsDNA or ENA may result from other immuno-regulatory disturbances such as dysregulation of TLRs [13,69,70]. despite of similar expression pattern of PC related transcripts in anti-dsDNA + and anti-ENA + patients as observed in this study.
In addition, the transcripts like IL-6ST, TGFβ-R1, JAK, AKT, GRB2, MT2A, RALGDS, were observed to be commonly elevated in anti-dsDNA + and anti-ENA + SLE patients. Previous studies had shown their association with the inflammation [71][72][73], cell proliferation and growth signaling [74,75] in SLE patients. However, Inhibition of JAK-STAT signaling and PI3/AKT/mTOR signaling had been suggested to be a potential treatment option for SLE [76,77]. RALGDS was reported in SLE PBMCs whereas MT2A and GRB2 were observed to be upregulated in lupus T cells [78,79]. MT2A has been reported to play crucial role in leukocyte chemotaxis [80].
Besides observing the uniquely expressed transcripts in patients with different autoantibody specificity, we have also identified transcripts that were commonly dysregulated in all three subsets. ARG1, GLS are the important enzymes involved in the citrulline biosynthesis. The antibodies against cyclic citrullinated proteins (CCP) are serological biomarker for rheumatoid arthritis whereas 10-15% of SLE patients also exhibit anti-CCP autoantibodies [81]. Rab7 that was found to be elevated in all subsets of SLE patients which plays a crucial role in regulating membrane traffic between the endo/lysosomal system and phagosome maturation at the time of internalization of pathogens or apoptotic cells [82]. It is also evident that the phagocytosis efficiency of the neutrophils and macrophages are compromised in SLE patients [83,84]. Furthermore, elevated expression of CAP3/7, DEFA1, MPO and LIM kinase transcripts was observed in all SLE patients' subset that are involved in IL-8 signaling. IL-8 signaling is associated with neutrophil migration and activation as a result of inflammation [85].
In-depth analysis of DEGs using two different softwares were performed because there is no general consensus regarding the best analysis tool for the differential expression analysis of RNA-seq data [86,87], DESeq program was also used to support the Cufflink analysis. The functional analysis of commonly expressed genes, as identified by both Cufflink and DESeq software in anti-dsDNA + patients shows involvement of Cell Cycle signaling pathway. Earlier studies have reported abnormality in cell cycle phase in SLE patients [88]. Even though no pathway could be predicted in case of anti-ENA + and anti-dsDNA + ENA + SLE subsets due to small number of DEGs but they have unique expression pattern of genes as observed in this study for differentially expressed transcripts among distinct patient subsets.
In conclusion, this study has identified unique expression pattern of transcripts in SLE patients varying in autoimmune response to key nuclear autoantigens (dsDNA and ENA). We have also identified critical canonical pathways associated with dysregulated transcripts that may distinguish the patients with anti-ENA autoantibodies from the patients with autoantibodies against dsDNA. The possibility of underlying differences in the disease mechanism in SLE patients could be due to pathological role driven by distinct autoantibodies. The results of the present study in conjunction with the ongoing genomic analysis of SLE patients characterized on the basis of distinct end organ disease manifestations could generate useful information and provide avenues for development of new targeted and precise therapeutic interventions in SLE.