Non–Laboratory-Based Self-Assessment Screening Score for Non-Alcoholic Fatty Liver Disease: Development, Validation and Comparison with Other Scores

Background: Non-alcoholic fatty liver disease (NAFLD) is a prevalent and rapidly increasing disease worldwide; however, no widely accepted screening models to assess the risk of NAFLD are available. Therefore, we aimed to develop and validate a self-assessment score for NAFLD in the general population using two independent cohorts.
Methods: The development cohort comprised 15676 subjects (8313 males and 7363 females) who visited the National Health Insurance Service Ilsan Hospital in Korea in 2008–2010. Anthropometric, clinical, and laboratory data were examined during regular health check-ups and fatty liver diagnosed by abdominal ultrasound. Logistic regression analysis was conducted to determine predictors of prevalent NAFLD and to derive risk scores/models. We validated our models and compared them with other existing methods using an external cohort (N = 66868).
Results: The simple self-assessment score consists of age, sex, waist circumference, body mass index, history of diabetes and dyslipidemia, alcohol intake, physical activity and menopause status, which are independently associated with NAFLD, and has a value of 0–15. A cut-off point of ≥8 defined 58% of males and 36% of females as being at high-risk of NAFLD, and yielded a sensitivity of 80% in men (77% in women), a specificity of 67% (81%), a positive predictive value of 72% (63%), a negative predictive value of 76% (89%) and an AUC of 0.82 (0.88). Comparable results were obtained using the validation dataset. The comprehensive NAFLD score, which includes additional laboratory parameters, has enhanced discrimination ability, with an AUC of 0.86 for males and 0.91 for females. Both simple and comprehensive NAFLD scores were significantly increased in subjects with higher fatty liver grades or severity of liver conditions (e.g., simple steatosis, steatohepatitis).
Conclusions: The new non–laboratory-based self-assessment score may be useful for identifying individuals at high-risk of NAFLD. Further studies are warranted to evaluate the utility and feasibility of the scores in various settings.


Introduction
Non-alcoholic fatty liver disease (NAFLD) is pathologically defined as accumulation of fat, mainly triglycerides, in hepatocytes, with no evidence of significant alcohol consumption or other secondary causes [1,2]. This includes the entire spectrum of fatty liver conditions, ranging from simple hepatic steatosis through steatohepatitis to cirrhosis. NAFLD is one of the most common metabolic liver disorders, and its incidence is increasing rapidly. The prevalence of NAFLD is between 6.3% and 33% depending on the population [2–4], and is expected to rise in the future as the rate of obesity increases, populations become older, and physical activity levels decrease.
NAFLD is associated with serious complications and mortality, and places a large burden on public healthcare systems, as well as patients [5,6]. It not only impairs health-related quality of life, but is also closely related to metabolic syndrome, dyslipidemia, diabetes, and cardiovascular disease [5,7,8]. Subjects with NAFLD demonstrated increased all-cause, cardiovascular and liver-related mortality in general US [9], European [10] and Asian populations [11].
Considering the clinical impact of NAFLD on public health, and its high prevalence, timely screening and detection could be essential to avoid further NAFLD-related morbidity, reduce healthcare costs, and promote early lifestyle interventions that may prevent or delay deterioration of the disease [12]. As NAFLD is usually asymptomatic, it is difficult to predict or determine whether individuals have NAFLD in community settings. NAFLD is diagnosed mostly using imaging modalities such as hepatic ultrasound or computed tomography. These methods are expensive and can be complicated or inconvenient, and are thus not practical or feasible in the general population. Therefore, establishing a simple screening test or risk assessment tool could be useful not only for identifying individuals at high-risk of NAFLD, but also for educating the general public about associated risk factors [13]. A few risk-assessment algorithms have been developed to identify individuals at high-risk of NAFLD [14–18]. Most were derived from relatively small samples (e.g., <600 subjects) and lack external validation, and all models include variables that are less practical or feasible, such as laboratory profiles that require additional blood assays and/or complicated equations that require calculators. These major barriers, which prevent laypersons from using these models, may partly explain why they have not been widely accepted in practice.
Therefore, the aim of our study was to develop and validate a self-assessment score for NAFLD risk in the general population using simple clinical parameters (demographics, anthropometrics, and lifestyle risk factors) to provide a reliable and easy tool usable by laypersons with or without the assistance of a clinician. We also developed a more accurate and comprehensive model that can further account for biochemical parameters when this information is available. Finally, we compared the new algorithm with existing models.

Ethics statement
The study protocol was approved by the institutional review board of Ilsan Hospital (SU-YON 2013-02). Informed consent requirement for this study was exempted by the institutional review board because researchers only accessed the database for analysis purposes, and personal information was not accessed.

Data Source and Subjects
The 'development' cohort, named the National Health Insurance Service (NHIS) Registry, was established from 18765 individuals aged ≥20 years who visited the NHIS Ilsan Hospital in Korea for comprehensive health examinations between 2008 and 2010, and was used for prediction modeling. Figure S1 illustrates a flow diagram of the study design. Subjects who met the following criteria were excluded based on our protocol: (1) alcohol consumption >140 g/week for males and >70 g/week for females (N = 778); (2) positive serologic markers for hepatitis B (N = 752), hepatitis C virus (N = 130), or human immunodeficiency virus (N = 1); (3) presence of thyroid disease, including hyperthyroidism, hypothyroidism, or thyroid hormone replacement therapy (N = 118); (4) abnormal ultrasonographic liver findings (i.e., suspected hepatocellular carcinoma, hepatic mass, or signs of Clonorchis sinensis) (N = 971); and/or (5) absence of questionnaire data or anthropometric measurements (N = 1135). Ultimately, 15676 subjects (8313 males and 7363 females) were eligible for the analysis.
The 'external validation' dataset was obtained from comprehensive health check-up data for Kangbuk Samsung Hospital from 2008, which have been described previously in detail [19]. Briefly, 66868 subjects aged ≥20 years (46896 males and 19972 females) were selected for the validation study after applying the same set of exclusion criteria.

Data and Measurements
We used demographic and personal and family medical history data, and data on lifestyle/behavioral factors such as smoking and alcohol consumption, physical activity, and anthropometrics. Laboratory parameters were also measured in the morning after overnight fasting for at least 8 h. Subjects previously diagnosed with diabetes by a healthcare professional or taking anti-diabetic drugs based on the health interview survey were classified as having diabetes. Hypertension was defined as diagnosis by a physician or treatment with antihypertensive medication. We defined dyslipidemia according to the National Cholesterol Education Program-Adult Treatment Panel III [20] as a total cholesterol level of ≥200 mg/dL, a triglyceride level of ≥150 mg/dL, an HDL cholesterol level of <45 mg/dL in males or <50 mg/dL in females, an LDL cholesterol level of ≥130 mg/dL, or self-reported use of prescribed cholesterol-lowering medication.
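The dyslipidemia definition above is a simple "any criterion met" rule, which can be sketched as follows. This is only an illustration of the stated thresholds; the function name, argument names, and the "male"/"female" coding are assumptions made for the example and are not taken from the study's analysis code.

```python
def has_dyslipidemia(total_chol, triglycerides, hdl, ldl, sex,
                     on_lipid_medication=False):
    """Apply the thresholds quoted in the text (all lipid values in mg/dL).

    `sex` is assumed to be the string "male" or "female" (hypothetical coding).
    """
    hdl_cutoff = 45 if sex == "male" else 50  # sex-specific HDL threshold
    return (
        total_chol >= 200
        or triglycerides >= 150
        or hdl < hdl_cutoff
        or ldl >= 130
        or on_lipid_medication  # self-reported cholesterol-lowering drug use
    )


# An HDL of 47 mg/dL meets the criterion for a female but not for a male.
print(has_dyslipidemia(180, 100, 47, 100, "female"))
print(has_dyslipidemia(180, 100, 47, 100, "male"))
```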
Smoking status was categorized as never, ex-, or current smoker on the basis of lifetime exposure to cigarettes. Daily alcohol consumption was quantitated by types of beverages, frequency of drinking, and average amount of alcohol consumed on each occasion, as described previously [21]. After excluding subjects with excessive alcohol intake (as one of the exclusion criteria), alcohol consumption was categorized as non-drinker or current drinker. Exercise status was assessed by self-reported questionnaires that included questions about the duration, frequency, and types of exercise. Regular exercise was then defined as engaging in physical activity for at least 30 min twice or more per week.
In all subjects, abdominal ultrasonography (Sonoline Antares MSC 2704 AB, Siemens Medical Solutions, Issaquah, WA) was performed using a 3.5-MHz transducer by trained radiologists who were blinded to the patients' clinical and laboratory data. The severity of fatty liver was categorized into three grades (mild, moderate, and severe) based on standard criteria [22]. For the main analysis, fatty liver status was then dichotomized as present versus absent.

Statistical Analyses
In the development and validation datasets, continuous variables are expressed as means ± standard deviation (SD), and categorical variables are presented as frequencies with percentages. For model development, we performed multiple logistic regression analyses with NAFLD as the endpoint. We included a list of candidate predictors for NAFLD in an initial regression model, with variables selected based on P-values <0.2 in univariate analyses [23]. To create the 'comprehensive' model from the initial model, backward elimination was performed until we generated a final model with statistically significant covariates. Then, we further derived a simpler, parsimonious model that could be used by patients for self-assessment with or without input from a clinician. In the 'simple' model, laboratory parameters were avoided and continuous variables were categorized using user-friendly cut-off points. We created a weighted scoring system by converting the β coefficients of the final model to integer values, while preserving monotonicity. Of note, in variable selection, categorization, and scoring, clinical and practical judgment as well as statistical significance were utilized [24]. We decided to develop sex-specific models to account for the somewhat different risk factors and cut-off points for different sexes. The goodness of fit of the models was assessed using the Akaike information criterion (AIC), and the discrimination ability by the area under the receiver operating characteristic curve (AUC).
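One common way to turn regression coefficients into integer points is to divide each coefficient by a reference unit and round, then check that ordered categories still receive non-decreasing points. The sketch below illustrates that idea only. The coefficient values, category names, and choice of reference unit are hypothetical, and the text notes that the actual scoring also drew on clinical and practical judgment.

```python
def betas_to_points(betas, unit):
    """Divide each coefficient by a reference unit and round to the nearest integer."""
    return {name: round(beta / unit) for name, beta in betas.items()}


def is_monotone(values):
    """Return True if the sequence never decreases."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical coefficients for four ordered obesity categories
# (the reference category implicitly scores 0 points).
example_betas = {
    "category_1": 0.6,
    "category_2": 1.1,
    "category_3": 1.9,
    "category_4": 2.4,
}

points = betas_to_points(example_betas, unit=0.6)
print(points)
print(is_monotone(list(points.values())))
```

With these made-up values the categories map to 1, 2, 3, and 4 points, which mirrors the 0 to 4 range used for the obesity measures in the simple score.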
Next, we compared our new risk scores with the following screening models for NAFLD using the development and validation datasets: the Fatty Liver Index [14] and NAFLD liver fat score [15] from European populations; the Hepatic Steatosis Index [16]; and Park's index for NAFLD [18] from Asian populations. As evaluation measures, we computed sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), likelihood ratios (LRs) (positive and negative), the Youden index, and AUC [21,25,26]. Imputation was used to handle missing data for fasting insulin (development dataset), menopause status and alcohol consumption (validation dataset). Additionally, we fitted the simple model (derived from the development dataset) to (1) the validation dataset, to assess the consistency of the results, and (2) subjects with excessive alcohol intake (N = 691), to assess the sensitivity/robustness of its discrimination ability. The NAFLD fibrosis score [27] was used as a surrogate index for defining poor condition of NAFLD (advanced fibrosis). The formula is: NAFLD fibrosis score = −1.675 + 0.037 × age (years) + 0.094 × BMI (kg/m²) + 1.13 × IFG/diabetes (yes = 1, no = 0) + 0.99 × AST/ALT ratio − 0.013 × platelet count (×10⁹/L) − 0.66 × albumin (g/dL). The Jonckheere-Terpstra test was used to conduct trend analysis. Statistical analyses were conducted using SAS version 9.2 (SAS Institute, Cary, NC), SPSS version 20.0 (SPSS, Chicago, IL) and MedCalc (version 13.1).
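For readers who want to reproduce the surrogate fibrosis index, the NAFLD fibrosis score [27] can be computed as in the sketch below, with the signs and multiplications written out explicitly. The function and argument names are illustrative assumptions, and the input values in the example are invented for the arithmetic check only.

```python
def nafld_fibrosis_score(age_years, bmi, ifg_or_diabetes, ast, alt,
                         platelets_1e9_per_l, albumin_g_dl):
    """NAFLD fibrosis score as a linear combination of routine clinical values.

    ifg_or_diabetes: True if impaired fasting glucose or diabetes is present.
    ast, alt: enzyme levels in the same unit (only their ratio is used).
    """
    return (
        -1.675
        + 0.037 * age_years
        + 0.094 * bmi
        + 1.13 * (1 if ifg_or_diabetes else 0)
        + 0.99 * (ast / alt)
        - 0.013 * platelets_1e9_per_l
        - 0.66 * albumin_g_dl
    )


# Invented example: age 50, BMI 30, diabetes present, AST = ALT = 40,
# platelets 200 x 10^9/L, albumin 4.0 g/dL.
score = nafld_fibrosis_score(50, 30, True, 40, 40, 200, 4.0)
print(round(score, 3))  # -0.125
```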

Characteristics of subjects in the development and validation datasets
The characteristics of the study subjects are summarized in Table 1 according to NAFLD status. The prevalences of NAFLD (based on ultrasonographic findings) were 41 and 30% in the development and validation datasets, respectively. The higher prevalence in the development dataset may be explained by the higher mean age. In both datasets, subjects with NAFLD tended to be older and more obese, to exercise less, and to have higher laboratory values for metabolic factors, compared to those without NAFLD. Furthermore, males, those with hypertension or diabetes, and postmenopausal females were more likely to have NAFLD.
Development of comprehensive and self-assessment models/scores for NAFLD
Table 2 describes the final (comprehensive and simple) regression models derived from the development dataset. In the comprehensive model, age, alcohol consumption, and regular exercise were significantly associated with NAFLD in males, while diabetes and menopause were significantly associated with NAFLD in females. WC, BMI, and laboratory covariates such as fasting glucose, lipid profiles, uric acid, and liver enzymes were significant predictors, regardless of sex. The comprehensive model yielded an AUC of 0.86 for males and 0.91 for females. The score derived from the comprehensive model (designated the 'comprehensive score') ranges from 0 to 100 and can be directly interpreted as the 'average' probability of having the disease among persons with similar risk factor profiles. In the simple model, age, WC, BMI, diabetes, dyslipidemia, and exercise were significant for both sexes (most P<0.001). Multiple categories (with scores of 0 to 4) were introduced to capture the risk gradient of obesity measures (BMI and WC), whereas other risk factors were binary. The 'simple or self-assessment score' (0–15) was developed from the simple model, where the seven risk factors jointly yielded an AUC of 0.82 for males and 0.88 for females. The AIC and AUC values of the various models are summarized in Table S1.
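A score on a 0 to 100 scale that reads as a probability is what a logistic model yields when its predicted probability is multiplied by 100. The sketch below shows that conversion under this assumption. The intercept, coefficient names, and covariate values are hypothetical placeholders, not the fitted values reported in Table 2.

```python
import math


def probability_score(intercept, coefficients, covariates):
    """Return 100 x the logistic-model probability for one subject.

    coefficients and covariates are dicts keyed by the same variable names.
    """
    linear_predictor = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return 100.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical two-variable model, for illustration only.
coefs = {"waist_cm": 0.05, "alt_u_per_l": 0.02}
subject = {"waist_cm": 90, "alt_u_per_l": 30}
print(round(probability_score(-5.0, coefs, subject), 1))
```

A linear predictor of zero corresponds to a score of 50, and larger predictors push the score toward 100.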

External validation
We investigated the diagnostic characteristics of different screening score cut-off points in the development and validation datasets. For the comprehensive score, ≥40 was selected as the cut-off point to define individuals with a high risk of NAFLD as it gives the highest value for the Youden index (data not shown). For the simple score, we selected a cut-off point of ≥8, which yielded sensitivities of 75% and 68%, specificities of 71% and 85%, and AUC values of 0.80 and 0.85 in males and females, respectively (Table 3). The comprehensive score and simple score (in females) yielded the highest overall test accuracy (reflected by the Youden index) and the largest AUC in both datasets, while the performance of the simple score (in males) was comparable to that of the other risk models. Loss of accuracy and discrimination with the simple score was minimal, despite it not relying on difficult health information or formulae. Comparison analysis of AUC among screening models in the validation dataset showed that the performance of the comprehensive score and the Fatty Liver Index was superior to that of the other models (Table S2). When our simple model was refitted to the validation dataset, similar results were obtained: all of the risk factors were statistically significant, and the direction and magnitude of the associations were comparable, with an AUC of 0.80 in males and 0.85 in females (Table S3). Figure 1 shows the prevalence of NAFLD according to total simple score for each sex group in the development and validation datasets. The prevalence of NAFLD increased as the risk scores increased. Figure 2 provides a sample questionnaire for the risk assessment of NAFLD that may be used by laypersons, as well as healthcare providers.
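The evaluation measures used to compare cut-off points follow directly from a 2 × 2 table of score category against ultrasound result. The sketch below computes them for a "score ≥ cut-off" rule and picks the cut-off with the largest Youden index (sensitivity + specificity − 1). The eight scores and disease labels are invented toy data, and the helper names are assumptions for the example.

```python
def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_measures(scores, has_disease, cutoff):
    """2x2-table measures for the rule 'score >= cutoff means high risk'."""
    pairs = list(zip(scores, has_disease))
    tp = sum(1 for s, d in pairs if s >= cutoff and d)
    fp = sum(1 for s, d in pairs if s >= cutoff and not d)
    fn = sum(1 for s, d in pairs if s < cutoff and d)
    tn = sum(1 for s, d in pairs if s < cutoff and not d)
    sens = _ratio(tp, tp + fn)
    spec = _ratio(tn, tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "lr_positive": _ratio(sens, 1 - spec),
        "lr_negative": _ratio(1 - sens, spec),
        "youden": sens + spec - 1,
    }


def best_cutoff(scores, has_disease):
    """Observed score value that maximizes the Youden index."""
    return max(
        sorted(set(scores)),
        key=lambda c: diagnostic_measures(scores, has_disease, c)["youden"],
    )


# Invented toy data: eight subjects, three with fatty liver.
toy_scores = [2, 3, 5, 7, 8, 9, 10, 12]
toy_disease = [False, False, False, False, True, False, True, True]

print(best_cutoff(toy_scores, toy_disease))
print(diagnostic_measures(toy_scores, toy_disease, 8))
```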

Ancillary analyses
As a sensitivity analysis, we applied the simple score to the subjects with excessive alcohol consumption (N = 691) who were initially excluded from the analysis. The simple score yielded similar AUCs (0.82 for males and 0.87 for females), suggesting that the discriminatory ability was preserved, even in a specific population highly susceptible to alcoholic fatty liver.
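The AUC values compared here can be read as the probability that a randomly chosen subject with fatty liver has a higher score than a randomly chosen subject without it, with ties counted as one half. The sketch below computes the AUC from that pairwise definition. It is a minimal illustration on invented data rather than the statistical software procedure used in the study.

```python
def auc_by_pairs(scores, has_disease):
    """AUC as the share of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, d in zip(scores, has_disease) if d]
    controls = [s for s, d in zip(scores, has_disease) if not d]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Invented toy data: three cases (scores 8, 10, 12) and five controls.
example_scores = [2, 3, 5, 7, 8, 9, 10, 12]
example_disease = [False, False, False, False, True, False, True, True]
print(round(auc_by_pairs(example_scores, example_disease), 3))  # 0.933
```

In this toy set, 14 of the 15 case-control pairs are ordered correctly, giving 14/15.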
The simple and comprehensive scores increased progressively with fatty liver grade (determined by hepatic ultrasound) (all P<0.001; Figure 3A and B). Furthermore, subjects categorized as having advanced fibrosis based on the NAFLD fibrosis score [27] had significantly higher simple and comprehensive scores than subjects with negative or intermediate NAFLD fibrosis score results, who are less likely to have advanced liver fibrosis (all P<0.001; Figure 3C and D). These findings indicate that the present scores can reflect the severity of fatty liver and may be applied to discriminate advanced stages of non-alcoholic steatohepatitis (NASH) from simple steatosis.

Discussion
We developed and validated models that could identify subjects at high-risk of NAFLD. The simple model is based on demographic, clinical, and anthropometric variables (age, WC, BMI, diabetes, dyslipidemia, exercise, alcohol intake, and menopause status), while the comprehensive model additionally includes laboratory parameters such as lipid and liver enzyme profiles. The newly developed NAFLD screening models/scores may serve as a doctor-patient communication tool and an initial step to identify high-risk subjects, who can be referred for further blood assays or imaging tests such as ultrasonography, possibly in conjunction with preventive measures or early interventions to manage NAFLD. Depending on the availability of health-related information and targeted accuracy, either or both models may be used.
NAFLD is regarded not only as a common disease in Western societies but also as an emerging problem in many Asian countries [28]. That the prevalence of NAFLD can further increase as the number of obese people in Asia increases is supported by the findings that 65% and 85% of subjects with a BMI of 30–40 and ≥40 kg/m², respectively, had NAFLD [29]. Despite its high prevalence and potential impact, recent studies highlighted that, even among high-risk patients, 87% of people did not know they had NAFLD [30] and 51% of healthy potential liver donors were incidentally confirmed as having NAFLD by liver biopsy [31], indicating that alarming proportions of patients with NAFLD are unaware of their illness and are undiagnosed. Therefore, a simple risk score that can efficiently and effectively screen high-risk individuals for NAFLD in communities, as well as in clinical settings, could help improve personal and population health, and public awareness and education about this less known disease. Notably, even in developed regions such as Hong Kong, 83% of the general population had never heard of NAFLD and 78% of respondents who understood the term had a misconception about this common condition [13]. We speculate that this is also a common phenomenon in other settings.
To date, studies have focused on searching for novel biomarkers or developing models that can predict progression of NAFLD to NASH [27,32]. Although NASH is a more serious liver disorder that may progress to cirrhosis, early detection of mild forms of NAFLD, such as simple steatosis, is also an important and promising field from a public health perspective. As fatty liver is a more prevalent but reversible disease, identification of individuals at high-risk of NAFLD through proper risk assessment and subsequent lifestyle modification may restore their hepatic condition and prevent progression to NASH or other related morbidities.
Our study has several distinguishing features. First, to our knowledge, the model was developed based on the largest general population with hepatic ultrasound-defined NAFLD. Many studies examined relatively small numbers of subjects [14,15,33] or used surrogate markers such as liver enzymes to predict NAFLD [34]; however, a significant number of patients with NAFLD have normal liver enzyme levels. Second, our simple score is easy to use and is based on readily available variables. Thus, laypersons can calculate and learn about their own risk, with or without help from healthcare providers, and can initiate a discussion with their physician if necessary. Third, the score includes modifiable risk factors such as obesity (defined by WC and BMI), exercise, and alcohol consumption, so users can learn about important risk factors and could be motivated to change or improve their lifestyle/health habits. For example, if subjects at high-risk of NAFLD reduce their body weight, start exercising regularly, and abstain from drinking, their risk scores could decrease. Lastly, the sensitivity analysis indicated that the simple score is also applicable to predicting hepatic steatosis in subjects with heavy drinking. This may be explained by the finding that fatty liver is more strongly affected by obesity than by excessive alcohol consumption [35].
Of note, our scores are sex-specific, unlike previous models. WC, which reflects central obesity, had a stronger association with NAFLD in males than in females, while BMI, which is an index of overall obesity, showed a stronger association with NAFLD in females than in males in our two independent datasets. This observation may be due to male subjects having more visceral adipose tissue than females, and suggests that males may be more susceptible to visceral fat deposition, which leads to accumulation of fat in the liver [36]. In females, the OR was dramatically increased in subjects aged over 50 years. This increase was offset by adjustment for menopause status in the logistic model, indicating that menopause may be more influential than age for NAFLD in females. This finding is supported by epidemiologic and experimental evidence that insulin resistance and visceral fat levels are significantly increased in postmenopausal females [37], and that estrogen protects against hepatic steatosis [38]. Among biochemical variables, ALT and AST made the greatest contribution to the prediction of NAFLD, followed by triglycerides, consistent with previous findings [14–16,18]. In addition, one of the components of the comprehensive model was serum uric acid level, which has been proposed to be a risk factor for NAFLD [39].
The present study has some potential limitations, which could be addressed by future investigations. First, diagnosis of NAFLD by ultrasonography could underestimate the actual prevalence of NAFLD in this population. Second, as this risk score was derived from a cross-sectional study, its use for the prediction of future development of NAFLD may be limited. However, cross-sectional data are well suited for screening for prevalent, undiagnosed cases, which generally precedes the prediction of incident, new cases. Third, the models/scores were derived from the general population of an Asian ethnic group, which may limit their generalizability and applicability to non-Asian or other Asian populations. Notably, considering the differences in the definitions of obesity between Western and Asian countries, cut-off points for factors related to obesity (e.g., WC and BMI) will be subject to modifications depending on the population, or new models may be warranted [40].

Conclusions
The present results demonstrate that the new screening scores for NAFLD performed well compared to existing models, and have some notable advantages (e.g., no laboratory tests required, self-assessment). Our simple self-assessment score for NAFLD risk could be useful for both primary care practitioners and laypersons as a screening and counseling tool. Future research is warranted to verify the effectiveness, usefulness, and feasibility of our models in various practical settings, and potentially to revise or adapt them for other populations.