Use of principle component analysis to quantitatively score the equine metabolic syndrome phenotype in an Arabian horse population

Equine metabolic syndrome (EMS), like human metabolic syndrome, comprises a collection of clinical signs related to obesity, insulin dysregulation and susceptibility to secondary inflammatory disease. Although the secondary conditions resulting from EMS can be life-threatening, diagnosis is not straightforward and is often complicated by the presence of other concurrent conditions like pituitary pars intermedia dysfunction (PPID). In order to better characterize EMS, we sought to describe the variation within, and correlations between, typical physical and endocrine parameters for EMS. Utilizing an unsupervised statistical approach, we evaluated a population of Arabian horses using a physical examination including body measurements, as well as blood plasma insulin, leptin, ACTH, glucose, and lipid values. We investigated the relationships among these variables using principal component analysis (PCA), hierarchical clustering, and linear regression. Owner-assigned assessments of body condition were one full score (on a nine-point scale) lower than scores assigned by researchers, indicating differing perceptions of healthy equine body weight. Rotated PCA defined two factor scores explaining a total of 46.3% of variation within the dataset. Hierarchical clustering using these two factors revealed three groups corresponding well to traditional diagnostic categories of “Healthy”, “PPID-suspect”, and “EMS-suspect” based on the characteristics of each group. Proxies estimating up to 93.4% of the composite “EMS-suspect” and “PPID-suspect” scores were created using a reduced set of commonly used diagnostic variables, to facilitate application of these quantitative scores to horses of the Arabian breed in the field. Use of breed-specific, comprehensive physical and endocrinological variables combined in a single quantitative score may improve detection of horses at risk for developing EMS, particularly those lacking severe clinical signs. Quantification of EMS without the use of predetermined reference ranges provides an advantageous approach for future studies utilizing genomic or metabolomic approaches to improve understanding of the etiology behind this troubling condition.

Introduction
Equine Metabolic Syndrome (EMS) is a condition characterized by regional and abnormal adiposity, hyperinsulinemia, and susceptibility to laminitis [1,2]. Characteristics of the EMS phenotype, particularly in early stages, can be difficult to identify, often requiring endocrine diagnostics like the dynamic test for insulin dysregulation [3,4]. Since limited treatments are available for EMS, diagnosis and prevention are of utmost importance [2,4]. Current recommendations for EMS emphasize management of obesity through careful diet and exercise regimens [5,6]. Overlap of clinical signs with Pituitary Pars Intermedia Dysfunction (PPID), also known as Equine Cushing's disease, creates additional difficulty in diagnosis [7]. Affecting primarily older horses, PPID is the result of dopaminergic neurodegeneration of the pars intermedia of the pituitary gland. Clinical signs include hypertrichosis, polyuria, abnormal adiposity, and laminitis, which in some PPID horses may be due to endocrinopathic insulin dysregulation [3,8]. PPID can exist concurrently with EMS, and there is some evidence suggesting that underlying EMS may be the cause of overlapping characteristics of these two conditions, like insulin dysregulation and laminitis, as these findings are not present in all PPID cases [3,8,9].
Characterized as a cluster of signs associated with laminitis susceptibility, EMS may present physiologically as obesity, insulin dysregulation and hyperlipemia [10]. In this definition, obesity is the only outwardly observable clinical sign. In many cases obesity is present with insulin dysregulation, yet regional obesity may be absent in some horses with EMS [2,11]. The Henneke body condition scoring (BCS) system is widely used to classify obese or overweight horses, though it is at best only semi-objective [12]. Skillfully applied, the BCS is an accurate estimate of obesity, yet few owners receive specific training or practice in its use and, as a result, commonly underestimate body condition of their horses [13]. Morphometric measurements including the neck circumference at midpoint and heart girth circumference, relative to the height at the withers (NC/H and HG/H), can provide a quantitative estimate of obesity [14,15]. These measures of obesity may inform predictive estimates of future laminitis risk and/or a "pre-laminitic" condition, but the most informative measure may differ by breed and/or the tendencies of the individual horse [16,17].
Insulin dysregulation, stemming from excessive insulin secretion and peripheral insulin resistance, is the main endocrinopathy proposed in EMS [10]. Further tests such as circulating triglycerides, cholesterol, and leptin concentrations may be useful, as elevated values are indicative of EMS [18,19]. Additional recommendations include diagnostic testing to rule out PPID as a possible diagnosis [20,21]. Although functional tests for insulin resistance and pituitary function are preferable, ambulatory veterinarians in the field may be limited to one-time basal glucose and insulin tests and circulating ACTH levels due to various constraints including cost to owner, access to a laboratory for rapid processing, and limited time [10]. Two previously published proxy measurements improved estimates of insulin resistance using a single basal glucose and/or insulin concentration [22]. The reciprocal of the square root of insulin (RISQI) estimates insulin sensitivity of peripheral tissues, and the modified insulin-to-glucose ratio (MIRG) estimates β-cell response to glucose [22].
Unsupervised modeling methods utilizing multiple easily measurable physiological variables, including body measurements and serum diagnostics, might present a solution to increase accuracy when screening for EMS. In this type of approach, variable reduction methods like Principal Component Analysis (PCA) are commonly applied to complex diagnostic datasets for conditions as diverse as altitude sickness, schizophrenia, and, most relevant here, human Metabolic Syndrome [23-25]. PCA efficiently condenses these multivariate datasets into a single vector explaining the largest possible proportion of variation within the dataset, and does not require a priori knowledge of "gold-standard" measures or underlying variables. The main goal of this study was to couple quantitative body scoring measures with single time point tests in order to describe and refine field-based testing for EMS. To this aim, we used PCA and hierarchical clustering to transform nine physiological variables into a comprehensive score for characteristics of EMS within the Arabian horse population. We then determined prediction expressions to simplify future investigations comparing these analyses against functional testing, longitudinally within our study cohort, and for evaluation of this approach across multiple breeds. Ultimately, this work will contribute to improved field screening methods for identifying horses with EMS.

Study protocol
All work was approved by the University of Florida Institutional Animal Care and Use Committee and carried out by researchers accordingly (protocol #201408459). Owners of registered purebred Arabian horses were recruited through email, referring veterinarians, and online advertisements. Written consent was obtained from the owner for each horse participating in this study.

Sample collection
Horses were initially selected for participation in the study based on responses from an owner-completed online survey comprising 73 questions collected using Qualtrics survey software (Qualtrics, Provo, Utah). Surveys provided background information for each horse including descriptions of diet and exercise, age, sex, registration number, and owner-assigned BCS. Additional medical history included previous diagnosis of disease, particularly of EMS or PPID, current medications, and any history of laminitis. Out of 109 horses volunteered through the initial online survey, 50 were excluded because of insufficient age (target age range >8 years old) or because they were currently receiving medications that could interfere with test results (primarily phenylbutazone in the case of the lameness exam, as well as cyproheptadine and pergolide). Horses used in this study were also utilized in a concurrent genome-wide association study [26]; therefore, this effort targeted only purebred Arabians, and some candidates were excluded due to mixed or unknown ancestry. Privately owned farms were visited for sampling if at least one horse present was previously diagnosed or suspected, by a veterinarian or owner, of suffering from EMS. As many additional horses as possible, matching the study criteria for breed, sex, and age, were also sampled at each participating farm. Overall, 73 horses were enrolled, with nine horses excluded post-collection due to poor owner compliance with pre-testing fasting requirements, inability to verify purebred Arabian ancestry, or incomplete owner surveys. The final population dataset consisted of 34 non-pregnant females and 30 castrated males, ranging in age from 8 to 34 years with a mean of 16.45 years.
Samples were collected at 19 farms across north central Florida, from August to November 2014. Horses were housed indoors or at pasture and participated in various activities including breeding, pleasure, and performance (dressage, hunter, and endurance). Farm visits occurred between 6 am and 10 am, and owners were instructed to withhold feeding of any concentrated rations for 12 hours prior to the examination, though horses were allowed continual access to forage. Body measurements collected from each horse included heart girth (HG), height at withers (H), and neck circumference (NC) at mid-point from poll to withers, using measuring tape at landmarks previously established [14]. BCS assessed independently by three members of the research team was averaged for analysis (AVG BCS). During examinations, veterinarians recorded heart rate and respiration rate as part of an overall assessment of health. Furthermore, pasture asthma and anhidrosis are common conditions of mature horses in central Florida, warranting use of heart rate and respiration rate as sentinel diagnostics to catch suspected cases of these conditions. Blood samples were collected by jugular venipuncture in two 10 mL EDTA Vacutainer vials (BD Vacutainer®, Becton, Dickinson and Company, Franklin Lakes, NJ) for diagnostic endocrinology and a 6 mL potassium oxalate Vacutainer vial (BD Vacutainer®, Becton, Dickinson and Company, Franklin Lakes, NJ) for glucose concentrations. On-farm processing of potassium oxalate and EDTA vials consisted of centrifugation at 500 g for 10 minutes, transfer of plasma by pipette in 1-2 mL aliquots into cryovials, and flash freezing by immersion in liquid nitrogen. Aliquots were moved for storage to a -80 °C freezer within four hours of collection. Freezing of plasma samples in liquid nitrogen directly after collection minimized variation in sample quality due to inconsistent handling, environmental conditions and degradation.
Potassium oxalate-preserved aliquots of plasma were submitted to the University of Florida Veterinary Diagnostic Research Laboratory for measurement of glucose concentrations (mg/dL) using the Dimension® Xpand Plus integrated chemistry system (Siemens, Erlangen, Germany). EDTA-preserved aliquots of plasma were submitted to the Cornell Animal Health Diagnostic Center (accredited by the American Association of Veterinary Laboratory Diagnosticians and USDA) for ACTH, leptin, cholesterol and triglycerides. Cholesterol and triglycerides were measured using a Roche ModP chemical analyzer (Roche Diagnostics, Indianapolis, Indiana), leptin was measured using the Millipore Multispecies Leptin RIA (Linco Research Inc., St. Louis, MO), and insulin was measured using the Millipore porcine insulin RIA kit (EMD Millipore Corporation, Darmstadt, Germany) as previously described [27]. The ACTH measurements were performed using an automated chemiluminescent enzyme immunoassay system (Immulite, Diagnostic Products Corporation, Los Angeles, CA), previously validated by Perkins et al., 2002 [28].
Relative levels of exercise, as reported in owner surveys, were determined using published parameters [29]. Ambient temperature was established based on an average morning temperature from online records (www.wunderground.com) according to date and location.
Body measurements for each horse were converted into the ratios HG/H and NC/H [14] as obesity estimates for comparison to BCS. Insulin dysregulation proxies were calculated from basal insulin and glucose concentrations, including the glucose:insulin ratio, the reciprocal of the square root of basal insulin (RISQI), and the modified insulin-to-glucose ratio (MIRG) [22].
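The derived measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the RISQI and MIRG expressions follow the commonly cited forms of the proxies in [22], assuming insulin in mU/L (equivalent to μIU/mL) and glucose in mg/dL.

```python
import math


def morphometric_ratios(heart_girth, neck_circ, height):
    """Return (HG/H, NC/H); all three inputs must share one length unit."""
    return heart_girth / height, neck_circ / height


def glucose_insulin_ratio(glucose, insulin):
    """Basal glucose (mg/dL) divided by basal insulin (mU/L)."""
    return glucose / insulin


def risqi(insulin):
    """Reciprocal of the square root of basal insulin (mU/L)."""
    if insulin <= 0:
        raise ValueError("insulin must be positive")
    return 1.0 / math.sqrt(insulin)


def mirg(insulin, glucose):
    """Modified insulin-to-glucose ratio.

    Assumed form: (800 - 0.30 * (insulin - 50)^2) / (glucose - 30),
    with insulin in mU/L and glucose in mg/dL.
    """
    if glucose == 30:
        raise ValueError("glucose of 30 mg/dL makes the ratio undefined")
    return (800.0 - 0.30 * (insulin - 50.0) ** 2) / (glucose - 30.0)


# Illustrative values only; these are not data from the study.
example_risqi = risqi(16.0)
example_mirg = mirg(16.0, 90.0)
example_hg_h, example_nc_h = morphometric_ratios(180.0, 95.0, 150.0)
```

Lower RISQI suggests lower insulin sensitivity, while higher MIRG suggests a stronger β-cell response, which is the direction of interpretation used for these proxies in [22].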

Statistical analyses
Values for age, NC/H, HG/H, AVG BCS, ACTH, leptin, MIRG, triglycerides, and cholesterol were analyzed by Principal Component Analysis (PCA). Shapiro-Wilk tests determined a non-normal distribution for the variables of age, plasma ACTH and plasma leptin. Therefore, a Varimax factor rotation was performed on the two principal components determined to be statistically significant by the Bartlett test of eigenvalues (p < 0.05). Samples were clustered using a hierarchical approach into three groups based on factors 1 and 2 (assuming three possible diagnoses). To illustrate trends between the resulting clusters and the original variables, ANOVA tests were used to assess seven of the original nine variables. Two variables, ACTH (log transformed for normality) and triglycerides, failed a Levene's test for unequal variances and were therefore described using a Kruskal-Wallis test. A standard least squares model estimated the original two factor scores using a reduced set of diagnostic measures that were more typical of tests performed in a clinical setting. Remaining measures not used in the PCA, including vital signs, exercise level and ambient temperature, were evaluated for relationships using ANOVA tests. All statistical analysis was performed using JMP® 12 (SAS Institute, Cary, NC) [30].
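For readers who want to reproduce the general workflow outside JMP, the following is a minimal sketch of a correlation-matrix PCA followed by Varimax rotation of two retained components. It assumes NumPy, uses synthetic data in place of the nine study variables, and computes factor scores by the regression method; the function names and all of these choices are illustrative assumptions rather than a description of JMP's internals.

```python
import numpy as np


def correlation_pca(X, n_components=2):
    """PCA on the correlation matrix of X (rows = horses, columns = variables)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables
    R = np.corrcoef(X, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)              # variable-component correlations
    return Z, R, eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation; returns rotated loadings and rotation matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):      # no further improvement
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


# Synthetic stand-in for 64 horses x 9 variables (NOT the study data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(64, 2))
mixing = rng.normal(size=(2, 9))
X = latent @ mixing + 0.5 * rng.normal(size=(64, 9))

Z, R, eigvals, loadings = correlation_pca(X, n_components=2)
rotated, rotation = varimax(loadings)

# Proportion of total variance attributed to each rotated factor.
explained = (rotated ** 2).sum(axis=0) / X.shape[1]

# Regression-method factor scores, one row per horse.
scores = Z @ np.linalg.solve(R, rotated)
```

Because the rotation is orthogonal, the total variance explained by the two factors is unchanged; only its split between the factors differs from the unrotated components.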

Results
Owner-reported BCS values underestimated the adiposity of their animals by an average of 1.0 BCS unit (p < 0.0001) compared to the researcher average scores (Fig 1). The ratio of heart girth to height (HG/H) correlated positively with both owner-reported and averaged researcher BCS scores. This indicates that owner underestimation of adiposity in their horses occurred almost uniformly across the range of the BCS scale, and demonstrates the utility of body measures as a quantitative estimation for obesity. As expected, resting heart and respiration rate on the day of sampling did not correlate with endocrine values. Heart rate positively correlated with AVG BCS (R² = 0.074, p = 0.033) and inversely with exercise level (R² = 0.24, p = 0.0024). Respiration rate positively correlated with the presence of a sweat-dampened hair coat at the time of sampling (R² = 0.19, p = 0.0003) and with ambient temperature (R² = 0.22, p < 0.0001). BCS was not related to owner-reported exercise level (p = 0.80) but was inversely correlated with age (R² = 0.12, p = 0.0045). HG/H was the only body measurement to differ significantly by sex, as females had higher HG/H than castrated males (R² = 0.14, p = 0.0011). In comparing obesity measurements with previously described insulin dysregulation estimators, only MIRG and RISQI were significantly correlated with all three morphometric measures, HG/H, NC/H and AVG BCS (Table 1).
Correlative PCA followed by factor rotation on nine physiological measures distilled relationships between these variables into two factor scores, Factor 1 and Factor 2, explaining 30.1% and 16.3% of the variation respectively (Fig 2). Factor 1 correlated positively with variables typically elevated in obesity and EMS, including AVG BCS, NC/H, HG/H, triglycerides, leptin, and MIRG. Factor 2 correlated positively with PPID characteristics: age, cholesterol, and ACTH.
Hierarchical clustering divided the study population into three groups based on the Factor 1 and Factor 2 scores for each animal (Fig 3). These clusters were termed the "Healthy" (n = 21), "PPID-suspect" (n = 28), and "EMS-suspect" (n = 25) groups based on the distributions of the original nine variables within each group (Table 2). The "PPID-suspect" group had significantly higher cholesterol, age and ACTH concentrations, compared to the "EMS-suspect" and "Healthy" clusters. AVG BCS, HG/H, NC/H, leptin, and triglycerides were significantly higher in the "EMS-suspect" group compared to the others.
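The clustering step can be illustrated with a small agglomerative routine that merges groups until three remain. The linkage criterion is not stated in the text, so Ward's minimum-variance criterion is used here purely as an assumption; the function name and the two-dimensional points are hypothetical stand-ins for (Factor 1, Factor 2) scores, not study data.

```python
def ward_clusters(points, n_clusters=3):
    """Naive agglomerative clustering with Ward's criterion (assumed linkage).

    points: list of (factor1, factor2) tuples. Returns one label per point.
    """
    clusters = [[i] for i in range(len(points))]   # start with singletons

    def centroid(members):
        dims = len(points[0])
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(dims)]

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca, cb = centroid(clusters[a]), centroid(clusters[b])
                na, nb = len(clusters[a]), len(clusters[b])
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical factor scores forming three loose groups.
factor_scores = [
    (-0.9, -0.8), (-1.0, -0.6), (-0.8, -0.7),   # low on both factors
    (1.9, -0.2), (2.1, 0.0), (2.0, -0.3),       # high Factor 1
    (-0.1, 2.0), (0.1, 2.2), (0.0, 1.9),        # high Factor 2
]
labels = ward_clusters(factor_scores, n_clusters=3)
```

The labels themselves are arbitrary; naming a cluster "Healthy", "PPID-suspect" or "EMS-suspect" requires inspecting the original variables within each group, as done in Table 2.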
Linear regression of variables on PCA factors provided reduced models for estimating the factor scores using a reduced set of diagnostics as follows:

Discussion
The overall goal of this study was to use unsupervised statistical modelling to describe the variation within the obesity and EMS phenotype in an Arabian horse population, without the constraint of a prior diagnosis. The variables utilized consisted of relatively straightforward body measurements and scores, combined with commonly used clinicopathological indices of insulin, glucose, lipids, leptin, and ACTH. Application of an unsupervised statistical method allowed examination of variation among our diagnostic criteria without assumptions derived from previously published "gold-standard" reference ranges and disease definitions. The factor scores illustrate the spectrum-like range of severity for this condition, allowing independent placement across the "PPID-suspect" or "EMS-suspect" axes with varying magnitude. This quantitative approach more closely resembles the spectrum of clinical signs observed in these conditions than neat diagnostic categories based on a single assay. A numerical method for quantifying obesity was necessary because PCA accepts only numerical values. Owners consistently underestimated the body condition of their horses. Owner-provided scores were nearly one unit lower than the researchers' averaged scores across the entire breadth of the scale, supporting previous findings [13,31]. This disagreement may stem from insufficient training in use of the BCS system, or from a human tendency to be overly optimistic in diagnosis of obesity in companion animals [32]. Our findings support the HG/H ratio as a quantitative measure for obesity, as it correlated well with BCS and the "EMS-suspect" score, in agreement with previous findings [15,33]. We observed significantly higher HG/H measures in non-pregnant females than in castrated males. This could be the result of a higher prevalence of EMS among females, or a physiological tendency for mares to deposit more adipose tissue around the heart girth area.
Although HG/H alone is not sufficient for estimation of body weight, it may be an ideal measurement for estimation of adiposity by owners, as they can repeatedly find these landmarks on their horse and the procedure is similar to the commonly used calibrated "weight tapes" [34]. The heart girth measurement alone reveals more subtle changes in adiposity compared to BCS [35,36]. Allowing owners to more accurately measure small body mass changes could result in increased owner compliance with stricter feeding regimens and earlier prediction of weight gain and inappropriate deposition of fat. Factor 1 ("EMS-suspect" score) positively correlated with variables typically characteristic of EMS including obesity measurements, MIRG, leptin, and triglycerides [15,18]. One notable aspect of this study was the lack of severe EMS cases within our population. Severe hyperinsulinemia (circulating insulin above 70 μIU/mL) and/or a history of severe laminitis did not occur within our sample set [37]. This may be a result of the participant screening process, in which horses administered medications were excluded. Horses suffering from severe cases of EMS would likely be treated for laminitis or hyperinsulinemia with medications that would alter the physical exam and, consequently, confound our results. Only eight horses within our population could be traditionally deemed EMS cases, as they exceeded all three published screening guidelines for EMS diagnosis: a BCS of seven or greater, plasma leptin higher than seven ng/mL, and fasting insulin concentration over 20 μIU/mL [4]. Our EMS samples represented mild cases or early-stage EMS, also referred to as pre-laminitic metabolic syndrome, thus allowing specific examination of the early stages of the condition and those at risk for developing EMS. According to a previous report, elevated triglycerides and BCS function as proxies to indicate a pre-laminitic metabolic syndrome diagnosis in ponies [19]. Another study conducted in ponies suggested measurements of blood pressure and dynamic measures of insulin dysregulation as useful in early detection of EMS [38]. Within our population of Arabian horses, the "EMS-suspect" score correlated with elevation in all three of these pre-EMS biomarkers, further supporting this summary score as an indicator of metabolic irregularities.
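The three published screening guidelines cited in the paragraph above [4] amount to a simple threshold check, sketched below for contrast with the continuous factor scores. The function and argument names are hypothetical, and the thresholds are exactly those quoted in the text.

```python
def exceeds_all_screening_guidelines(bcs, leptin_ng_ml, insulin_uiu_ml):
    """True when a horse exceeds all three guidelines quoted in the text:
    BCS of seven or greater, plasma leptin above 7 ng/mL, and fasting
    insulin above 20 uIU/mL."""
    return bcs >= 7 and leptin_ng_ml > 7 and insulin_uiu_ml > 20


# Illustrative values only; these are not horses from the study.
clear_case = exceeds_all_screening_guidelines(8.0, 9.5, 35.0)
borderline = exceeds_all_screening_guidelines(7.0, 6.8, 25.0)
```

A threshold rule of this kind yields only a yes/no answer, so a horse just under one cutoff is treated the same as a lean, normoinsulinemic horse, which is the limitation the quantitative score is meant to address.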
In an attempt to lower the costs accrued by multiple diagnostic tests, we created a proxy estimating 77.6% of the variance captured by the "EMS-suspect" score using the tests most frequently used for endocrine testing, fasting insulin and glucose, with the addition of HG/H. This proxy may be a more practical method for clinical application, as it requires only one trip to the farm and reduces cost for owners in comparison to the oral glucose tolerance test, combined glucose-insulin tolerance test, or a full diagnostic panel including leptin and triglycerides. However, the addition of leptin values to this model captures 93.4% of the variance of the "EMS-suspect" score, supporting the inclusion of leptin testing for EMS diagnosis.
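The idea of a reduced proxy model can be sketched as an ordinary least squares fit of the factor score on a subset of predictors, with R² reporting the share of the score's variance that the proxy captures. The example below assumes NumPy and uses synthetic data with made-up coefficients; the function name and variable names are hypothetical, and the fitted values are not the study's prediction expressions.

```python
import numpy as np


def fit_proxy(predictors, score):
    """Least squares fit of score on predictors (with intercept).

    Returns (coefficients, R^2), where coefficients[0] is the intercept.
    """
    design = np.column_stack([np.ones(len(score)), predictors])
    coef, *_ = np.linalg.lstsq(design, score, rcond=None)
    fitted = design @ coef
    ss_res = ((score - fitted) ** 2).sum()
    ss_tot = ((score - score.mean()) ** 2).sum()
    return coef, 1.0 - ss_res / ss_tot


# Synthetic, standardized stand-ins for 64 horses (NOT the study data).
rng = np.random.default_rng(1)
n = 64
hg_h, insulin, glucose, leptin = rng.normal(size=(4, n))
score = (0.6 * hg_h + 0.5 * insulin + 0.1 * glucose + 0.6 * leptin
         + 0.2 * rng.normal(size=n))

reduced = np.column_stack([hg_h, insulin, glucose])   # basic field proxy
full = np.column_stack([reduced, leptin])             # proxy plus leptin

coef_reduced, r2_reduced = fit_proxy(reduced, score)
coef_full, r2_full = fit_proxy(full, score)
```

Because the reduced model is nested within the larger one, adding leptin can only hold or raise R² on the fitting data; whether the gain justifies the extra assay is the practical question raised in the text.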
We did not expect a significant amount of variance within our sample set to consist of characteristics of a condition that was specifically excluded prior to collection. Our ACTH values were recorded during late summer/fall when ACTH is normally elevated, and the magnitude of this elevation is up to 2.8 times greater in PPID animals than in those without [39,40]. By collecting during late summer and fall, horses with underlying PPID that lacked visible signs of the condition were more easily identified than in other seasons, thus allowing for a more definitive distinction of PPID cases from EMS cases [40]. Yet, Factor 2 still captured significant variation in plasma ACTH and age, both key characteristics of PPID, suggesting the label "PPID-suspect" score. We identified many horses with severely elevated circulating ACTH concentrations that lacked clinical signs of PPID. These horses may have subclinical disease, be at risk of developing PPID in the future, or simply exhibit asymptomatic elevated ACTH concentrations due to old age or other conditions [21].
High circulating cholesterol is reported as a sign of EMS [41], yet we observed elevated cholesterol strongly correlated only with the "PPID-suspect" score, a finding more closely in line with those of Elzinga et al., 2016 [27]. The correlation of circulating cholesterol with the "PPID-suspect" score may result from endocrine dysfunction caused by PPID, or simply be a result of older age. However, we observed that age and basal ACTH concentrations alone were sufficient to identify horses as possible PPID cases, as together these variables explained 99.9% of variation in the original "PPID-suspect" score [21].
Heart and respiration rates reflected ambient temperature (perhaps a result of heat stress), not EMS or obesity. Fitness (as determined by owner-reported exercise activity) correlated with a lower resting heart rate among horses performing more intense exercise regimes. However, horses with lower BCS scores exhibited higher resting heart rates than their higher BCS counterparts. Therefore, vital signs and fitness were not reliable indicators of obesity or the EMS phenotype. Early identification of at-risk individuals through other approaches offers opportunity for intervention such as weight loss and diet restriction, especially during spring and summer months, to reduce the risk of developing laminitis [19].
Our results suggest consideration of EMS, especially in the early stages, as a complex condition that presents as a spectrum of disease, rather than a threshold diagnosis. This approach assessed the multidimensionality of EMS, allowing improved identification of horses with only mild EMS characteristics that could be at risk for developing clinical EMS. Proxies created in this study provide a reduced model for identifying horses at risk for EMS without added cost and labor of the full-scale model. Findings from this study may be useful to improve the accuracy of identifying individuals at-risk for EMS, particularly in the Arabian breed.
Longitudinal follow-up with the horses from this study may validate the predictive value of our scoring system. It should be noted that this study was performed on mature horses within a single breed displaying relatively mild clinical signs. Longitudinal studies are needed in broader populations of horses, including those with more severe EMS and/or PPID conditions, in order to evaluate the accuracy of our disease proxies. Evaluation of similar proxies in other breeds at risk for endocrine issues, such as Morgan horses, and in breeds with few EMS cases like the Thoroughbred, may be particularly relevant given suspected genetic predilections to this condition.
Supporting information
S1 Table. Raw diagnostic measures, derived normalized measures, resulting factor scores and clusters for the sampled population of horses. (XLSX)