Identification of pathological RA endotypes using blood-based biomarkers reflecting tissue metabolism. A retrospective and explorative analysis of two phase III RA studies

There is an increasing demand for accurate endotyping of patients according to their pathogenesis to allow more targeted treatment. We explore a combination of blood-based joint tissue metabolites (neoepitopes) to enable patient clustering through distinct disease profiles. We analysed data from two RA studies: LITHE (N = 574, follow-up 24 and 52 weeks) and OSKIRA-1 (N = 131, follow-up 24 weeks). Two osteoarthritis (OA) studies, SMC01 (N = 447) and SMC02 (N = 81), were included as non-RA comparators. Specific tissue-derived neoepitopes measured at baseline included: C2M (cartilage degradation); CTX-I and PINP (bone turnover); C1M and C3M (interstitial matrix degradation); CRPM (CRP metabolite) and VICM (macrophage activity). Clustering was performed to identify putative endotypes. We identified five clusters (A-E). Clusters A and B were characterised by generally higher levels of biomarkers than other clusters, with the exception of VICM, which was significantly higher in cluster B than in cluster A (p<0.001). Biomarker levels in cluster C were all close to the median, whilst cluster D was characterised by low levels of all biomarkers. Cluster E also had low levels of most biomarkers, but with significantly higher levels of CTX-I compared to cluster D. There was a significant difference in ΔSHP score between the clusters at 52 weeks (p<0.05). We describe putative RA endotypes based on biomarkers reflecting joint tissue metabolism. These endotypes differ in their underlying pathogenesis, and may in the future have utility for patient treatment selection.


Introduction
Rheumatoid Arthritis (RA) is a heterogeneous chronic inflammatory and autoimmune disease, characterised by pain and swelling of joints, progressive joint deterioration, bone erosions and impaired function, often leading to disability [1]. Due to the heterogeneity and the complex aetiology of the disease, rates of drug response in RA patients vary from compound to compound. A systematic review of the effectiveness of anti-TNF compounds found that the mean percentage of ACR-20 responders was 60.8% and of EULAR responders 70.5%, leaving a large proportion of patients unresponsive [2]. Development and selection of treatment is therefore a challenging process, making it difficult to deliver targeted treatments. Currently, treatment is based on the use of disease activity markers such as disease activity scores (e.g. DAS28), radiographic scores and other clinical assessments to gauge a patient's state, progress and response to treatment. This approach captures systemic levels of inflammation but does not take into account more intricate pathological differences between patients. By adopting a precision medicine approach, one aims to improve response rates through the use of biomarkers that facilitate the identification of patients most likely to benefit [3].
RA may comprise multiple clinical and molecular endotypes [4], defined as subtypes of a disease with distinctive underlying pathological mechanisms [4,5]. A patient's clinical presentation, response to treatment and rate of disease progression may be determined by that patient's endotype. Thus, well characterised endotypes may provide patients and physicians with a means of targeting treatment at a patient's individual molecular disease pathogenesis. Moreover, endotypes may provide insight into novel pathways and thereby aid the discovery of new drug modes of action [4].
In order to begin designing such a tool to identify distinct endotypes in RA, it is vital to have an understanding of the underlying pathological mechanisms. Type I, II and III collagens are the most prevalent structural proteins in the joint, and among the most prevalent in the body [6]. Various clinical and preclinical studies have shown an increase in type I, II and III collagen degradation in RA due to an increase in proteolytic activity [7,8]. Multiple biomarkers have been developed that measure such tissue metabolites in the blood of patients with arthritis [9]. Examples of such tissue metabolites are: i) CTX-I and PINP, which are measures of bone resorption and formation [8,10]; ii) C1M and C3M, which reflect MMP-mediated degradation of type I and type III collagen respectively [11,12]; iii) C2M, measuring MMP-mediated type II collagen degradation [13]; iv) CRPM, measuring CRP degradation [14]; and v) VICM, measuring MMP-driven degradation of citrullinated vimentin [15]. These tissue metabolites have also been shown to be elevated in RA and to be associated with disease activity [16] and treatment response [9]. As these metabolites reflect tissue-associated disease mechanisms, they could be useful tools to identify different endotypes in RA.
The aim of this study was to demonstrate how different putative endotypes representing underlying pathologies of RA can be identified through the use of joint-specific serological biomarkers.

Studies
Four cohorts were pooled for this study: two RA studies, LITHE (N = 1196, NCT00106535) and OSKIRA-1 (N = 918, NCT01197521), and two OA studies, SMC01 (N = 1176, NCT00486434) and SMC02 (N = 1030, NCT00704847), all of which have been described in detail before [17][18][19][20]. The RA studies included patients with moderate to severe disease and with a high disease activity. Patients from the OA studies were included in order to enrich the cohort with a disease population with low inflammation but with known joint disease as characterised by X-ray and by symptomatic scores, providing a broader range of biomarker levels. Patients were randomly selected to participate in biomarker sub-studies of the phase III RA trials (LITHE and OSKIRA-1) and of the placebo arms of the OA studies (SMC01 and SMC02) (Fig 1). Patients in LITHE all had active moderate to severe RA with inadequate response to methotrexate (MTX). OSKIRA-1 had similar inclusion criteria, selecting patients with active RA and inadequate response to MTX. Both SMC trials recruited patients with a medical history and symptoms of knee OA. Further details of the patient demographics from each study are given in Table 1. Patients from any cohort with missing biomarker measurements at baseline were excluded from the study. No ethical approval was required as no patient recruitment or data collection was carried out for this study. The authors did not have access to any identifying participant information. Endpoints in each RA study were measured according to ACR and EULAR criteria [21], and included radiographic measurements, DAS28 disease activity scores, and patient and physician reported outcomes. Erosion (ERN), joint space narrowing (JSN) and modified Sharp score (SHP) were calculated at baseline and 24 weeks for LITHE and OSKIRA-1, and at 52 weeks for LITHE.
Delta (Δ) ERN, JSN and SHP were calculated by subtracting the baseline score from the follow-up scores at weeks 24 and 52, where available. These changes were calculated for the placebo patients only, to remove treatment bias.
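As an illustration of the change-from-baseline calculation described above, the following minimal Python sketch subtracts each patient's baseline score from the follow-up score and leaves the change undefined where either value is missing. The function name, the dictionary layout and the patient identifiers are assumptions made for this example and do not come from the study, where the analysis was carried out in R.

```python
def delta_scores(baseline, follow_up):
    """Change from baseline per patient; None where either score is missing."""
    return {
        pid: (follow_up[pid] - base
              if base is not None and follow_up.get(pid) is not None
              else None)
        for pid, base in baseline.items()
    }

# Hypothetical SHP scores for three placebo patients (not study data).
shp_baseline = {"p1": 10.0, "p2": 4.5, "p3": 7.0}
shp_week52 = {"p1": 12.5, "p2": None}  # p2 not scored, p3 dropped out

delta_shp = delta_scores(shp_baseline, shp_week52)
print(delta_shp)
```

Patients without a follow-up score keep an explicit missing value rather than being dropped, so that the number of patients contributing to each Δ can be reported.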
All variables were calibrated according to the assay in which they had been measured, to control for possible batch effects. Each biomarker was then log transformed and min-max normalised to allow for direct comparison of the variables.
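The transformation step can be sketched as follows. This is a minimal Python illustration, assuming strictly positive concentrations and the natural logarithm (the base is not stated in the text, and it does not affect the result once the values are rescaled to the range 0 to 1). The function name and the example values are hypothetical.

```python
import math

def log_min_max(values):
    """Log-transform positive values, then rescale them to the [0, 1] range."""
    logged = [math.log(v) for v in values]
    lo, hi = min(logged), max(logged)
    if hi == lo:
        # A constant biomarker carries no information; map it to 0.0.
        return [0.0 for _ in logged]
    return [(x - lo) / (hi - lo) for x in logged]

# Hypothetical baseline concentrations of one biomarker (arbitrary units).
example = [1.0, 10.0, 100.0]
scaled = log_min_max(example)
print(scaled)
```

After this step every biomarker lies on the same 0 to 1 scale, so that no single marker dominates the distance calculations used for clustering simply because of its measurement units.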

Clustering protocol
Three statistical heuristics were evaluated in order to identify cluster tendency and the number of inherent clusters in the data. The gap statistic [28], silhouette method [29] and elbow method [30] were all used to assess this, with the majority rule used to indicate the number of clusters.
Patient clustering was then performed using Ward hierarchical clustering. The statistical significance of the resulting cluster separations was assessed using permutation testing as described by Serviss et al. [31] using 100 iterations. ANOVA and chi-squared tests were used to identify differences in clinical measures between the placebo patients in each of the clusters. In order to identify individual differences between clusters, the Tukey post hoc test was used.
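The majority rule can be illustrated with a short Python sketch. The number of clusters suggested by each heuristic is not reported in the text, so the votes below are hypothetical, as is the function name; returning no decision on a tie is also an assumption of this example.

```python
from collections import Counter

def majority_k(suggestions):
    """Return the cluster count proposed by most heuristics, or None on a tie."""
    counts = Counter(suggestions.values()).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no single value has the most votes
    return counts[0][0]

# Hypothetical suggestions from the three heuristics (not reported values).
votes = {"gap": 5, "silhouette": 5, "elbow": 4}
chosen = majority_k(votes)
print(chosen)
```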
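To make the clustering step concrete, the sketch below implements a naive version of Ward's minimum-variance agglomeration in plain Python. It is an illustration only and not the code used in the study, which was run in R. At each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares. The function names and the toy data are assumptions, and the text does not state which Ward variant was used.

```python
from itertools import combinations

def centroid(points):
    """Mean position of a list of equal-length numeric tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_clusters(points, k):
    """Agglomerate points into k clusters using Ward's criterion."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters that is cheapest to merge.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Toy normalised profiles with two biomarkers per patient (hypothetical).
profiles = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
groups = ward_clusters(profiles, 2)
print(groups)
```

This version recomputes every pairwise cost at each step and is therefore only suitable for small examples. For a cohort of the size analysed here, an established implementation would be the practical choice.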
All data handling and statistical analysis was performed using R software (version 3.4.1, R Development Core Team, 2012), with use of factoextra (version 1.0.5) and pheatmap (version

Study overview
Patients with serological markers measured at baseline (BL) were selected from study populations (N = 1233, Table 1). Patients from the OA studies (SMC01 and SMC02) were significantly older and had significantly higher BMI than patients in the RA cohorts (LITHE and OSKIRA) (Age, p < 0.001; BMI, p = 0.002). There was no significant difference in the percentage of males in the OA cohorts.

Description of clusters
Hierarchical clustering was performed based on a panel of 7 tissue-specific serological biomarkers measuring bone resorption and formation (CTX-I and PINP), cartilage degradation (C2M), macrophage activity (VICM), interstitial matrix degradation (C1M and C3M), and CRP metabolites (CRPM). This analysis revealed 5 distinct clusters based on patient biomarker profiles, which can be seen in the heatmap in Fig 2. Permutation testing revealed that all clusters were statistically significantly different from each other, all with p < 0.01. Clusters A (194 subjects) and B (256 subjects) were composed predominantly of RA patients (97.9% and 98% respectively). Clusters C (188 subjects), D (363 subjects) and E (232 subjects) contained lower percentages of RA patients (74.5%, 5.8% and 44.3% respectively). This was reflected in the demographics and biomarker profiles of the patients. A full description of cluster demographics and distribution can be found in Table 2.
Clusters A and B, which were predominantly RA clusters, had higher levels of C2M, CRPM, C1M and C3M compared to clusters C, D and E. This is indicative of high levels of cartilage turnover, CRP metabolites and interstitial matrix turnover respectively, typically observed in RA patients [32,33]. Both bone markers (PINP and CTX-I) were also significantly elevated in clusters A and B in comparison to clusters C, D and E (p<0.001). Clusters A and B were thus characterised by consistently high levels of all biomarker measures, suggesting that they reflect endotypes of high disease activity. The difference between clusters A and B was the level of VICM (macrophage activity), which was significantly lower in cluster A than in cluster B (p<0.001).
Cluster C had a mix of RA and OA patients and did not appear to have a particularly defining characteristic, all biomarker levels being close to the median. Cluster D, the largest cluster and predominantly an OA cluster, had relatively moderate levels of tissue turnover, characterising patients with lower disease activity as expected of OA patients. Furthermore, cluster D was characterised by significantly lower levels of C2M, indicating lower cartilage turnover than in patients from clusters A, B and C (p < 0.01). Cluster D therefore reflects an endotype with low bone activity, low inflammation and low cartilage activity. Cluster E, a mixed RA and OA cluster, shared many of the characteristics of cluster D, including low levels of all inflammatory markers and lower levels of cartilage turnover. A notable exception was the higher levels of bone turnover biomarkers observed. Thus, cluster E appears to reflect an endotype with high bone activity and low inflammation.

Association between clusters and RA clinical assessments in placebo patients
We looked at the RA placebo patients across the 5 clusters in order to identify differences in clinically relevant variables (Table 3). The absolute radiographic scores (ERN, JSN and SHP) did not differ significantly at BL between the clusters. At 52 weeks, however, ΔSHP was significantly different (p = 0.03). This was driven by ΔERN, which was also significantly different between the clusters (p = 0.01), with cluster B having the highest change in erosion (ΔERN = 1.51 ± 2.1). Tukey post-hoc analysis revealed that ΔERN was significantly higher in B at 52 weeks than in A (p<0.05) and C (p<0.05). JSN did not differ between the clusters at baseline, nor did ΔJSN at week 24 or week 52.
Fig 2. [...] and two osteoarthritis (OA), were clustered based on 7 tissue-specific blood-based biomarkers measuring bone resorption and formation, cartilage degradation, macrophage activity, interstitial matrix degradation and CRP metabolites. 5 clusters were found with two high inflammatory RA endotypes and one low inflammatory RA endotype, and a final "mixed" endotype. OA patients were clearly separated from RA patients as expected. https://doi.org/10.1371/journal.pone.0219980.g002
There was a significant difference in DAS28 at baseline (p<0.001) across the clusters. Tukey post-hoc analysis showed that at baseline, DAS28 was significantly different between cluster pairs A-E, B-C and B-E. However, ΔDAS28 at weeks 24 and 52 did not differ significantly between the clusters (Table 3).
Age, BMI, duration of the disease, number of previous DMARDs and HAQ did not differ significantly between the RA placebo patients of the clusters identified.

Discussion
This study was a retrospective analysis of phase III clinical studies, exploring the hypothesis that biomarkers reflecting tissue metabolism may identify previously unrecognised RA endotypes. We used biomarker data from two RA clinical studies, enriched with two OA studies as a non-inflammatory arthritic population. Through clustering analysis we attempted to identify endotypes of RA with distinct profiles of tissue specific turnover, and identify associations between these endotypes and indicators of clinical outcomes. The biomarkers used are measures of tissue inflammation or turnover, mainly metabolites of collagen degradation and formation. As collagens are the main constituents of connective and calcified tissue, these biomarkers reflect the active molecular processes in disease affected tissue [33]. Type I and type III collagen are found in the connective tissue around the joint, as well as other affected connective tissues of the body, while type II collagen is present mainly in cartilage. We found five distinct putative endotypes (i.e. clusters), associated with different biomarker levels. Clusters A and B both had high levels of connective tissue and bone turnover compared to other groups, somewhat typical of RA [7,34,35]. Conversely, clusters D and E had lower levels of the connective tissue markers. Whilst this is to be expected of OA patients, RA patients in these clusters are more unusual. These RA patients showed no significant difference in DAS28 compared to other clusters, which could indicate a subset of RA patients that have low connective tissue turnover in spite of high disease activity.
Both clusters A and B exhibit high levels of connective tissue markers and high rates of bone formation and resorption, but differ in macrophage activity (VICM) and CRP metabolism (CRPM). Interestingly, placebo patients in these clusters differ significantly in the rate of radiographic progression.
Recent studies by Gu et al. showed the role macrophages play in inflammatory bone diseases and the effect they have on bone formation and degradation. These studies report that macrophage activity plays a part in bone tissue turnover [36].
Differences between diverse inflammatory and arthritic endotypes have previously been reported by other groups [37][38][39]. Dennis et al. investigated synovial biomarker traits of certain endotypes in relation to drug response. They found four potential endotypes: i) a lymphoid (B and plasma cell dominant), ii) a myeloid (macrophage dominant), iii) a fibroid, and iv) a low inflammatory phenotype [37]. This may align well with the connective tissue turnover endotypes we propose in this paper, where clusters A and B are characterised by high tissue turnover driven by the activity of MMPs, which are expressed by immune cells such as macrophages [40]. Similarly, we also found differences in the macrophage marker VICM, separating the two high inflammatory endotypes, clusters A and B.
It is also important to note that whilst disease activity (DAS28) differed significantly across the groups, the difference in disease activity between groups A and B (inflammatory endotypes) was not significant, while cluster B exhibited much faster structural progression. This finding is consistent with claims that measurement of disease activity alone is not sufficient to identify fast progressing RA patients who may require a more aggressive therapeutic strategy [41].
Presently, there are few positive examples of endotyping in the field of RA, other than seropositivity and anti-cyclic citrullinated peptide status, which are often used to classify disease. The identification of 31 genetic risk loci in seropositive RA, including HLA-DRB1 and the group of alleles referred to as the shared epitope, demonstrates that RA has a strong but complex genetic aetiology. As a result, a genetic approach to the identification of biomarkers of response has been pursued with both candidate and genome-wide level approaches [42,43]. While associations that may ultimately lead to a greater insight into the molecular aetiology of RA have been established, the predictive capability so far has been insufficient to have meaningful clinical application. Indeed, Wang et al. [44] reported that each polymorphism identified in a genome-wide association study accounted for 2% of the variance observed in the change in Disease Activity Score in 28 joints (DAS28) in response to tocilizumab, while polymorphisms of the target IL-6 receptor and its cognate ligand IL-6 showed no significant association with change in DAS28 response following correction for multiple testing. Baseline serum or RNA measurement of IL-6 or IL-6 receptor levels also provided little discrimination between responders and non-responders, indicating that the prevalence of the target receptor or its cognate ligand was not clinically deployable to identify those patients most likely to benefit from tocilizumab [43]. There have also been attempts to utilise gene expression analysis to identify biomarkers of biologic endotypes with superior response to treatment [45]. For example, Kim et al. showed the importance of a differentially expressed gene, G0S2, in the response to anti-TNF therapy in multiple cohorts.
The findings in this study support previous proposals that RA patients have differing underlying molecular aetiologies, and encourage further investigation into developing tools to advance patient stratification. The prospect of identifying robust endotype profiles representing distinct disease driving pathways and networks is tantalising. Such findings have implications for the development of new precision medicines, benefiting patients, physicians and payers [46]. If our endotypes are validated, they may allow for superior treatment regime selection for individual patients [47]; e.g. the treatment of patients with high bone remodelling but moderate tissue inflammation might be focused on restoring the bone remodelling balance. Sivadas et al. show how whole genome sequencing has been used to identify genes involved in drug metabolism and to identify target genes [48]. A review by Orr et al. describes the advances in the use of synovial tissue for patient stratification and identification of novel targets and biomarkers [49]. Such approaches could be coupled with a serological approach in order to improve understanding of drug metabolism and patient strata.
The strengths of this study were the large number of participants at baseline, along with robust tissue-specific markers allowing for an accurate profiling and clustering of arthritic patients' disease pathology. Due to the nature of clinical trials, a high placebo dropout rate may affect the comparison of clinical endpoints. Patients enrolled in the trials had been exposed to prior therapy including DMARDs and NSAIDs. As information regarding specific prior treatments and ACPA positivity was not recorded or made available to us, we were not able to correct for these possible confounders. In contrast to LITHE, which was a 52 week placebo-controlled study, OSKIRA-1 was a 24 week placebo-controlled study. Hence radiographic endpoints for OSKIRA-1 were only available up to 24 weeks. For these reasons, in order to reap any practical gain from this investigation, the findings must be validated in larger cohorts with more consistent follow-up measurements to show any true predictive ability of these endotypes.

Conclusion
In conclusion, this paper shows how clustering RA patients using tissue-specific markers allows for the identification of disease endotypes, which may have use in precision medicine after continued research.