Development of Risk Score for Predicting 3-Year Incidence of Type 2 Diabetes: Japan Epidemiology Collaboration on Occupational Health Study

Objective Risk models and scores have been developed to predict incidence of type 2 diabetes in Western populations, but their performance may differ when applied to non-Western populations. We developed and validated a risk score for predicting 3-year incidence of type 2 diabetes in a Japanese population. Methods Participants were 37,416 men and women, aged 30 or older, who received periodic health checkup in 2008–2009 in eight companies. Diabetes was defined as fasting plasma glucose (FPG) ≥126 mg/dl, random plasma glucose ≥200 mg/dl, glycated hemoglobin (HbA1c) ≥6.5%, or receiving medical treatment for diabetes. Risk scores on non-invasive and invasive models including FPG and HbA1c were developed using logistic regression in a derivation cohort and validated in the remaining cohort. Results The area under the curve (AUC) for the non-invasive model including age, sex, body mass index, waist circumference, hypertension, and smoking status was 0.717 (95% CI, 0.703–0.731). In the invasive model in which both FPG and HbA1c were added to the non-invasive model, AUC was increased to 0.893 (95% CI, 0.883–0.902). When the risk scores were applied to the validation cohort, AUCs (95% CI) for the non-invasive and invasive model were 0.734 (0.715–0.753) and 0.882 (0.868–0.895), respectively. Participants with a non-invasive score of ≥15 and invasive score of ≥19 were projected to have >20% and >50% risk, respectively, of developing type 2 diabetes within 3 years. Conclusions The simple risk score of the non-invasive model might be useful for predicting incident type 2 diabetes, and its predictive performance may be markedly improved by incorporating FPG and HbA1c.


Introduction
The prevalence of type 2 diabetes is rapidly increasing worldwide [1]. The International Diabetes Federation estimates that 382 million people had diabetes in 2013, and this number is expected to rise to 592 million by 2035 [1]. In addition, the number of those with prediabetes, a high risk state of developing diabetes, is projected to reach 472 million by 2030 [2]. Around 5-10% of individuals with prediabetes become diabetic every year, and up to 70% of those with prediabetes eventually develop diabetes [2]. In Japan, the prevalence of diabetes increased from 6.9 to 9.5 million between 1997 and 2012 [3]. Given that diabetes and associated complications decrease quality of life and represent a major health-care burden worldwide [2], detecting individuals at high risk of developing diabetes is important for prevention.
A number of risk factors for diabetes have been identified, including obesity, smoking habit, hypertension, level of physical activity, and family history of diabetes [4]. Numerous studies in Western populations have reported that weighted models and scores developed using these factors could identify individuals at high risk of developing diabetes [5]. Given the large differences in diabetes risk across ethnic groups [6], however, the performance of each model and score may differ by ethnicity [7]. In East Asia, several studies have developed risk models [8][9][10][11][12][13][14][15]. Of these, three studies were conducted in Japan [10,11,14] but did not include participants under the age of 40 [10,11,14] and did not validate the developed risk model [14]. Given the relatively high proportion of undiagnosed diabetes mellitus among younger age groups (under 40 years old) [16], models capable of identifying high-risk participants in this age group are still necessary. In addition, these scores require some variables including family history of diabetes and physical activity which are not routinely available or uniformly collected in general health examination in Japan, and thus are of limited use for wider population.
Here, we developed risk scores using only non-invasive risk factors initially and then incorporated laboratory measurements, such as fasting plasma glucose (FPG) and glycated hemoglobin (HbA1c), to predict 3-year incidence of type 2 diabetes. We then assessed the validity of the developed risk score in a large-scale multi-center study among Japanese workers.

Study Procedure
The Japan Epidemiology Collaboration on Occupational Health (J-ECOH) Study is an ongoing multi-center epidemiologic study among workers from several companies in Japan. A total of 12 companies covering various industries (electric machinery and apparatus manufacturing, steel, chemical, gas, non-ferrous metal manufacturing, automobile and instrument manufacturing, plastic product manufacturing, health care) participated in the J-ECOH study. In Japan, employees are obliged to undergo a general health examination at least once a year under the health and safety law. As of August 2013, nine participating companies had provided health check-up data obtained between January 2008 and December 2012 or between April 2008 and March 2013, which were combined to create an analytic database. The data from the earliest examination (mostly in 2008) were regarded as baseline, but if a 2008 dataset contained a large amount of missing data, data from the 2009 examination were used as baseline (one company). We excluded data from one company because of a large amount of missing data in both the 2008 and 2009 datasets.

Ethics Statement
Prior to the collection of data, the conduct of the J-ECOH Study was announced in each company by using posters that explained the purpose and procedure of the study. Participants did not provide their verbal or written informed consent to join the study but were allowed to refuse their participation. This procedure conforms to the Japanese Ethical Guidelines for Epidemiological Research, where the procedure of obtaining consent may be simplified for observational studies using existing data. The study protocol including consent procedure was approved by the Ethics Committee of the National Center for Global Health and Medicine, Japan (NCGM-G-001140-05). Most participating companies provided data in either anonymized or de-identified form, but a few other companies provided data including identifiable information, which was removed from analytic database. The data are hosted in the National Center for Global Health and Medicine. Currently, the data cannot be widely shared because the research group has not obtained permission from participating companies to provide the data on request. However, the data can be requested by academic researchers for non-commercial research; inquiries and applications can be made to Department of Epidemiology and Prevention, Center for Clinical Sciences, National Center for Global Health and Medicine, Tokyo, Japan (Dr. Mizoue, mizoue@ri.ncgm.go.jp).

Subjects
From a total of 82,380 participants in eight companies who received a health checkup in 2008 or 2009, 16,198 participants under 30 years old at baseline were excluded because the majority of them did not receive a blood test, which is required for employees aged 35 years and 40 years or older by the Industrial Safety and Health Act in Japan. Of the remaining 66,182 participants aged 30 years or older, the present study included 53,216 participants who received a health checkup 3 years after the baseline. Of these, we excluded participants who had diabetes at the baseline (n = 3512), who had missing information on glucose (n = 3192), HbA1c (n = 3274), or medical treatment of diabetes (n = 212), and whose blood samples were drawn in a non-fasting state (n = 5321) or who lacked information on fasting status (n = 1905). Some participants met more than one of the exclusion criteria. After further exclusion of 2031 participants with missing data for variables used to develop the risk score of diabetes, including body height, body weight, waist circumference, blood pressure, smoking status, low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol, triglyceride, and drug use for hypertension or dyslipidemia, 37,416 participants (32,040 men and 5376 women) remained for analysis.
From the analytic cohort, we randomly selected a two-thirds derivation sample stratified by sex and site to develop a risk score for predicting incidence of diabetes (24,950 participants: 21,364 men and 3586 women). The remaining one third of participants (12,466 participants: 10,676 men and 1790 women) were included in a validation cohort to assess the validity of the derived risk score from the derivation cohort.
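The stratified two-thirds split described above can be sketched as follows. This is only an illustration in Python, not the authors' procedure (the analyses were run in Stata and SAS); the function name `stratified_split`, the record keys `sex` and `site`, the fixed seed, and the toy cohort are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(records, derivation_fraction=2 / 3, seed=0):
    """Split records into derivation and validation samples within each (sex, site) stratum."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    strata = defaultdict(list)
    for record in records:
        strata[(record["sex"], record["site"])].append(record)

    derivation, validation = [], []
    for key in sorted(strata):
        members = list(strata[key])
        rng.shuffle(members)
        # Take roughly two thirds of every stratum for derivation.
        n_derivation = round(len(members) * derivation_fraction)
        derivation.extend(members[:n_derivation])
        validation.extend(members[n_derivation:])
    return derivation, validation


# Hypothetical toy cohort: 30 men and 6 women at site A, 12 men at site B.
toy_cohort = (
    [{"id": i, "sex": "M", "site": "A"} for i in range(30)]
    + [{"id": 30 + i, "sex": "F", "site": "A"} for i in range(6)]
    + [{"id": 36 + i, "sex": "M", "site": "B"} for i in range(12)]
)
derivation_sample, validation_sample = stratified_split(toy_cohort)
```

Splitting within strata keeps the sex and site mix of the two samples close to that of the full cohort, which is the property the stratification is meant to guarantee.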

Assessment of risk factors
Body height, body weight, and waist circumference were measured at each company in accordance with a standard protocol. Waist circumference was measured at the umbilical level in the standing position by a health professional. Body mass index (BMI) was calculated as weight in kilograms divided by squared height in meters. Smoking status and medication for hypertension or dyslipidemia were ascertained using a questionnaire. Blood pressure was measured using an automated sphygmomanometer. Hypertension was defined as systolic blood pressure of ≥140 mmHg, diastolic blood pressure of ≥90 mmHg, or taking medication for hypertension.
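The two derived variables in this paragraph can be written out as a short Python sketch. The function and argument names are hypothetical, and the blood pressure thresholds are read as lower bounds (140 mmHg or higher, 90 mmHg or higher), which is an assumption based on the conventional definition of hypertension.

```python
def body_mass_index(weight_kg, height_m):
    """BMI: weight in kilograms divided by squared height in meters."""
    return weight_kg / height_m ** 2


def has_hypertension(systolic_mmhg, diastolic_mmhg, on_antihypertensive_medication):
    """Hypertension as defined in the text; thresholds treated as inclusive (assumption)."""
    return bool(
        systolic_mmhg >= 140
        or diastolic_mmhg >= 90
        or on_antihypertensive_medication
    )
```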
Biochemical measurements included plasma glucose, HbA1c, LDL-cholesterol, HDL-cholesterol, and triglycerides. Plasma glucose was measured by enzymatic method in seven companies and glucose oxidase peroxidative electrode method in one company. HbA1c was measured by latex agglutination immunoassay in five companies, high-performance liquid chromatography method in two, and enzymatic method in one. As HbA1c was measured in accordance with a method used by the Japan Diabetes Society, we converted values to the National Glycohemoglobin Standardization Program (NGSP) equivalent value (%) using the following formula: HbA1c (%) = 1.02 × HbA1c (Japan Diabetes Society) (%) + 0.25% [17]. In all participating companies, HDL-cholesterol, LDL-cholesterol, and triglyceride levels were measured by enzymatic method. Dyslipidemia was defined as LDL-cholesterol of ≥140 mg/dl, HDL-cholesterol of <40 mg/dl, triglyceride of ≥150 mg/dl, or taking medication for dyslipidemia. All laboratories involved in the health checkup in the participating companies have received satisfactory scores (rank A or score >95 out of 100) from external quality control agencies, including the Japan Medical Association, Japanese Association of Laboratory Medical Technologists, and National Federation of Industrial Health Organization.
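The HbA1c conversion formula and the dyslipidemia definition above translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and the LDL-cholesterol and triglyceride thresholds are treated as lower bounds, which is an assumption.

```python
def jds_to_ngsp(hba1c_jds_percent):
    """Convert a Japan Diabetes Society HbA1c value (%) to the NGSP equivalent (%)."""
    return 1.02 * hba1c_jds_percent + 0.25


def has_dyslipidemia(ldl_mg_dl, hdl_mg_dl, triglyceride_mg_dl, on_lipid_medication):
    """Dyslipidemia as defined in the text; LDL and triglyceride cutoffs inclusive (assumption)."""
    return bool(
        ldl_mg_dl >= 140
        or hdl_mg_dl < 40
        or triglyceride_mg_dl >= 150
        or on_lipid_medication
    )
```

For example, under this formula a Japan Diabetes Society value of 6.0% maps to an NGSP value of about 6.37%.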

Outcome
Diabetes newly diagnosed during the 3-year period after the baseline examination was the outcome in the present analysis. Diabetes was defined as having FPG of ≥126 mg/dl, random plasma glucose of ≥200 mg/dl, HbA1c of ≥6.5%, or receiving medical treatment for diabetes, which was defined in two ways: medical treatment of diabetes (3 companies) or anti-diabetic drug use (5 companies). Individuals without diabetes at baseline who met any of these conditions in the subsequent health checkups were considered to have incident type 2 diabetes.
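The outcome definition can be expressed as a small rule. The Python sketch below is a hedged illustration: the function names, the dictionary keys, and the handling of missing measurements (treated as "criterion not met") are assumptions, and the thresholds follow the cutoffs stated in the abstract (126 mg/dl, 200 mg/dl, and 6.5% or higher).

```python
def meets_diabetes_criteria(fpg=None, random_glucose=None, hba1c=None, treated=False):
    """True if any criterion is met. Glucose in mg/dl, HbA1c in % (NGSP).

    A missing measurement (None) is treated as not meeting that criterion,
    which is an assumption of this sketch.
    """
    return bool(
        (fpg is not None and fpg >= 126)
        or (random_glucose is not None and random_glucose >= 200)
        or (hba1c is not None and hba1c >= 6.5)
        or treated
    )


def is_incident_case(baseline, follow_ups):
    """Incident case: free of diabetes at baseline, criteria met at any later checkup."""
    if meets_diabetes_criteria(**baseline):
        return False
    return any(meets_diabetes_criteria(**visit) for visit in follow_ups)


# Hypothetical participant: normal at baseline, FPG reaches 128 mg/dl at a later checkup.
example_baseline = {"fpg": 98, "hba1c": 5.7}
example_follow_ups = [{"fpg": 104, "hba1c": 5.9}, {"fpg": 128, "hba1c": 6.3}]
example_is_incident = is_incident_case(example_baseline, example_follow_ups)
```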

Statistical analysis
Logistic regression analysis was performed to estimate the odds ratio (OR) and 95% confidence interval (CI) of type 2 diabetes for each category of risk factors. We analyzed data using a multiple regression model with backward elimination at a significance level of less than 0.05 to first determine the variables used for the non-invasive model (excluding dyslipidemia, FPG, and HbA1c). To develop a risk score for predicting 3-year incidence of diabetes, we assigned each category of risk factor one of the following point scores, corresponding to β coefficients of multivariate logistic regression, in accordance with the method by Guasch-Ferré et al [18] and Lindström et al [19]: 1 for β = 0.01-0.20, 2 for β = 0.21-0.80, 3 for β = 0.81-1.20, 4 for β = 1.21-2.20, and 5 for β >2.20. The reference category of each variable was given a score of 0. The risk score of incident diabetes was calculated as the sum of the individual scores. We then assessed the predictive performance of the risk score by drawing a receiver operating characteristic (ROC) curve, with calculation of the area under the ROC curve (AROC) and sensitivity, specificity, positive predictive value, and negative predictive value for various cutoffs. We also assigned point scores in accordance with methods in the Framingham [20] and Hisayama Study [10]. The score in the present study had a larger AROC than the score based on the Framingham Study's method. While the score based on the Hisayama Study's method showed better predictive performance than the present score in terms of AROC, the former had a much wider range than the latter. We therefore decided to use the present method to create a precise and simple risk score. We next incorporated either or both FPG and HbA1c into the non-invasive model, thereby creating invasive models including FPG, HbA1c, or both. We performed ROC analyses for these models as well as the non-invasive model. We compared discriminative ability (AROC) among the four models by using DeLong's method [21].
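The mapping from β coefficients to points can be sketched in Python as below. This is an illustration of the stated bins, not the authors' code: the function names are hypothetical, the β values in the example are invented, rounding β to two decimals before binning is an assumption suggested by the two-decimal bin edges, and giving 0 points to coefficients below 0.01 is likewise an assumption because the text does not cover that case.

```python
import math


def points_for_beta(beta):
    """Map a logistic regression coefficient to a point score using the bins in the text."""
    b = round(beta, 2)  # bins are stated to two decimals (assumption: round first)
    if b < 0.01:
        return 0  # not specified in the text; treated like a reference category (assumption)
    if b <= 0.20:
        return 1
    if b <= 0.80:
        return 2
    if b <= 1.20:
        return 3
    if b <= 2.20:
        return 4
    return 5


def risk_score(betas_of_present_categories):
    """Total score: sum of the points of the categories a participant falls in."""
    return sum(points_for_beta(beta) for beta in betas_of_present_categories)


# Hypothetical coefficients for one participant's categories (not taken from Table 3).
example_betas = {"age_category": 0.95, "bmi_category": 0.60, "current_smoker": 0.35}
example_total = risk_score(example_betas.values())

# An odds ratio converts to a coefficient via the natural logarithm.
example_points_from_or = points_for_beta(math.log(13.69))
```

Coarse bins of this kind trade a little discrimination for a score that can be added up by hand, which is the design goal stated in the text.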
In addition, we calculated net reclassification improvement (NRI) using three risk categories (<3%, 3-10%, and >10%) and integrated discrimination improvement (IDI). Finally, to assess the internal validity of the obtained risk scores, we applied the scoring system to the validation cohort and performed ROC analyses. We compared the predicted and observed incidence of diabetes in each decile of risk score and performed the Hosmer-Lemeshow test to assess model goodness-of-fit. Two-sided P values of less than 0.05 were regarded as statistically significant. All analyses were performed using Stata version 13.0 (StataCorp, College Station, TX, USA) and Statistical Analysis System (SAS) software version 9.1 (SAS Institute, Cary, NC, USA).
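A categorical NRI with the three risk categories named above can be computed as in the following Python sketch. It uses the standard definition (net upward movement among events plus net downward movement among non-events); the function names, the placement of the boundary values 3% and 10% in the middle category, and the toy data are assumptions, and the authors' software implementation may differ in detail.

```python
def risk_category(probability):
    """0 for <3%, 1 for 3-10%, 2 for >10% (boundary handling is an assumption)."""
    if probability < 0.03:
        return 0
    if probability <= 0.10:
        return 1
    return 2


def net_reclassification_improvement(old_probs, new_probs, outcomes):
    """Categorical NRI of a new model over an old one; outcomes are 1 (event) or 0."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = sum(1 for y in outcomes if y)
    n_nonevent = len(outcomes) - n_event
    for old, new, y in zip(old_probs, new_probs, outcomes):
        move = risk_category(new) - risk_category(old)
        if y:
            up_event += move > 0
            down_event += move < 0
        else:
            up_nonevent += move > 0
            down_nonevent += move < 0
    return (up_event - down_event) / n_event + (down_nonevent - up_nonevent) / n_nonevent


# Hypothetical predicted risks from an old and a new model for six participants.
toy_old = [0.02, 0.05, 0.12, 0.05, 0.02, 0.12]
toy_new = [0.05, 0.15, 0.12, 0.02, 0.02, 0.05]
toy_outcomes = [1, 1, 1, 0, 0, 0]
toy_nri = net_reclassification_improvement(toy_old, toy_new, toy_outcomes)
```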

Results
Compared to participants included in the present analysis, those excluded were younger and more likely to be men and current smokers and to have hypertension and dyslipidemia, but had lower FPG and HbA1c in both age groups, 30-39 years (excluding 35 years) and ≥40 years (including 35 years) (data not shown).
During the 3-year period, we identified 1122 and 565 incident cases of diabetes in the derivation (1049 men and 73 women) and validation (537 men and 28 women) cohorts, respectively. The characteristics of study participants in the derivation and validation cohorts are shown in Table 1. The means (standard deviation) of age and BMI were 45.5 (7.9) years and 23.3 (3.2) kg/m² in the derivation cohort and 45.5 (7.8) years and 23.3 (3.2) kg/m² in the validation cohort. The means of age, BMI, waist circumference, and FPG and HbA1c levels and the proportion of women, current smokers, hypertension, and dyslipidemia did not differ markedly between the derivation and validation cohorts.
The association between risk factors and type 2 diabetes risk is shown in Table 2. In the noninvasive variables-adjusted model, men had significantly higher risk of type 2 diabetes than women. In addition, older age, higher BMI, abdominal obesity, current smoking, and hypertension were associated with an increased risk of type 2 diabetes. All variables remained significant in the backward elimination analysis (P value <0.05). In the model including dyslipidemia, FPG, and HbA1c, the ORs of type 2 diabetes for men, older age, higher BMI, abdominal obesity, and hypertension were considerably attenuated, such that sex, abdominal obesity, and dyslipidemia were no longer associated with type 2 diabetes risk. Participants with FPG of ≥110 mg/dl or HbA1c of ≥6.0% had significantly higher risk of type 2 diabetes than those with FPG of <100 mg/dl or HbA1c of <5.6%; the multivariable-adjusted ORs (95% CI) of type 2 diabetes were 13.69 (11.15-16.81) for FPG of ≥110 mg/dl and 17.26 (13.41-22.21) for HbA1c of ≥6.0%.
All risk factors in the four models and the points derived for each category are shown in Table 3. The total score ranged from 0 to 16 in the non-invasive model, from 0 to 15 in the invasive model including either FPG or HbA1c, and from 0 to 20 in the model including both FPG and HbA1c. ROCs of each risk model in predicting type 2 diabetes are shown in Fig 1. The AROC for the non-invasive model was 0.717 (95% CI, 0.703-0.731), increasing to 0.893 (95% CI, 0.883-0.902) for the invasive model including both FPG and HbA1c (P value <0.001). In addition, the AROC for the invasive model including FPG (0.843) was significantly higher than that for the invasive model including HbA1c (0.827, P value <0.01) and was significantly lower than that for the invasive model including both FPG and HbA1c (0.893, P value <0.01). When AROCs were calculated by sex, they were larger in women than in men. The values (95% CI) were 0.746 (0.695-0.798) in women and 0.703 [...]. In the Hosmer-Lemeshow test, the χ² value was 19.3 and the P value was 0.01 for the non-invasive model, and the χ² value was 11.7 and the P value was 0.17 for the model including both FPG and HbA1c (Fig 2). The predictive performance for a range of cut-off points for the developed diabetes risk scores is shown in Table 4. In the non-invasive model, a score of 9 points or higher, at which the sum of sensitivity and specificity was maximized in the derivation cohort, had a sensitivity of 61.0% and specificity of 70.8%. In the model including FPG, a score of 7 points or higher, at which the sum of sensitivity and specificity was maximized, had a sensitivity of 87.6% and specificity of 68.4%. In the model including HbA1c, a score of 10 points or higher had a sensitivity of 80.7% and specificity of 72.4%. The model including both FPG and HbA1c with a score of 11 points or higher had a sensitivity of 84.2% and specificity of 80.3%. The 3-year predicted probability of incident type 2 diabetes by total points for each risk model is shown in Table 5.
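The cut-off selection described here (maximizing the sum of sensitivity and specificity) and the link between sensitivity, specificity, and predictive values can be sketched in Python. The function names and the toy scores are hypothetical; the last lines plug in figures quoted in the text (sensitivity 61.0%, specificity 70.8%, and 1122 cases among 24,950 derivation participants) purely as an arithmetic illustration, and the values reported in Table 4 remain authoritative.

```python
def sensitivity_specificity(scores, outcomes, cutoff):
    """Sensitivity and specificity when a score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, outcomes) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, outcomes) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, outcomes) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, outcomes) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, outcomes):
    """Cut-off with the largest sum of sensitivity and specificity."""
    return max(
        sorted(set(scores)),
        key=lambda c: sum(sensitivity_specificity(scores, outcomes, c)),
    )


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' rule."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical scores and outcomes for ten participants.
toy_scores = [2, 4, 6, 8, 9, 10, 11, 12, 13, 15]
toy_events = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
toy_cutoff = best_cutoff(toy_scores, toy_events)

# Arithmetic illustration with figures quoted in the text for the non-invasive model.
ppv, npv = predictive_values(0.610, 0.708, 1122 / 24950)
```

With a 3-year incidence of about 4.5%, the positive predictive value at this cut-off comes out near 9% under these inputs, which shows why a moderately specific score still flags many people who will not develop diabetes.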
In the non-invasive model, participants with scores of 0 to 8 (69.4% of total participants in the derivation cohort) had less than 5% risk of 3-year incident type 2 diabetes. The risk was 5 to <10% for a score of 9 to 11 (23.0% of total participants), 10 to <20% for a score of 12 to 14 (7.0%), and >20% for a score of ≥15 (0.6%). In the invasive model including both FPG and HbA1c, participants with a score of 0 to 10 (77.4% of total participants), 11 to 12 (10.5%), 13 to 14 (6.9%), and 15 to 18 (5.0%) had less than 5%, 5 to <10%, 10 to <20%, and 35 to <50% risk of incident type 2 diabetes, respectively. Those with a score of ≥19 (0.2% of total participants) had >50% risk. The ROCs of the four risk scores in the validation cohort were very similar to those in the derivation cohort (Fig 1). When AROCs were calculated by sex, values (95% CI) were 0.841 (0.756-0.927) in women and 0.708 (0.687-0.728) in men for the non-invasive model and 0.938 (0.896-0.981) in women and 0.872 (0.856-0.887) in men for the invasive model including both FPG and HbA1c. At the cut-off point for a non-invasive model score of 9 or higher, sensitivity (61.2%) and specificity (70.8%) were similar to those in the derivation cohort (Table 4).
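For practical use, the non-invasive score bands quoted above amount to a lookup table. The Python sketch below encodes only what the text states (score ranges, risk bands, and the share of the derivation cohort in each band); the names are hypothetical, and treating the top band as 15 to 16 relies on the abstract's threshold of ≥15 and the stated maximum score of 16.

```python
# (lowest score, highest score, 3-year risk band, % of derivation cohort)
NON_INVASIVE_RISK_BANDS = [
    (0, 8, "<5%", 69.4),
    (9, 11, "5 to <10%", 23.0),
    (12, 14, "10 to <20%", 7.0),
    (15, 16, ">20%", 0.6),
]


def non_invasive_risk_band(score):
    """Return the 3-year risk band for a non-invasive score between 0 and 16."""
    for low, high, band, _share in NON_INVASIVE_RISK_BANDS:
        if low <= score <= high:
            return band
    raise ValueError(f"score {score} is outside the 0-16 range of the non-invasive model")


# The cohort shares quoted in the text should account for all participants.
total_share = sum(share for _low, _high, _band, share in NON_INVASIVE_RISK_BANDS)
```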

Discussion
In this large-scale, multi-center study in a Japanese working population, we developed a risk score for predicting 3-year risk of type 2 diabetes. In the non-invasive model in which sex, age, BMI, abdominal obesity, smoking, and hypertension were included, the predictive ability for incidence of type 2 diabetes was reasonably good. Performance was further improved by adding FPG and HbA1c. Similar predictive ability was observed in the validation cohort. To our knowledge, this is the third largest study developing risk models and scores to predict incidence of type 2 diabetes.
The risk score based on the non-invasive model we developed showed relatively high predictive ability for type 2 diabetes. Noble et al [5] reviewed 94 risk models and scores and [...] [10]. In that study, AROC for the score in predicting incidence of type 2 diabetes (follow-up period, 14 years) was 0.700 (95% CI, 0.667-0.732). In the Toranomon Hospital Health Management Center Study 6 (TOPICS 6) among 7654 government employees aged 40-75 years [11], a non-invasive model including age, sex, family history of diabetes, BMI, and current smoking had an AROC of 0.708 (95% CI, 0.679-0.737) in predicting type 2 diabetes (follow-up period, 5 years). The risk score we developed in a large working population had comparable predictive ability but a more stable estimate (narrower confidence interval) than these previous Japanese studies. Given that 27% of the present study population were less than 40 years old, the present risk score could be applied to relatively young populations. Numerous risk scores for diabetes have been developed across diverse populations. Given that ethnic origin is strongly related to diabetes risk [2], however, a risk score derived from one population may not be applicable to others. In Japan, three diabetes risk scores have been developed: one among residents in a prefecture [14], another among residents in a rural town [10], and the other among government employees [11]. These scores require family history of diabetes, physical activity, and alcohol consumption, which are not routinely available or uniformly collected in general health examinations in Japan, and thus are of limited use for the wider population. The risk scores we developed here using a large, multi-company database of periodic health checkups are not only statistically robust but also useful in practice.
We used six risk factors, including age, sex, BMI, abdominal obesity, smoking habit, and hypertension to create a non-invasive risk score. The number of factors used in risk models has ranged from 3 to 14 (mean 7.8) in previous studies [5]. The most commonly incorporated factors in previous risk models were age, family history of diabetes, BMI, hypertension, waist circumference, and sex, followed by ethnicity, fasting glucose level, smoking status, and physical activity [22]. We did not include family history of diabetes in the current risk model because that information was collected in only some participating companies. Nevertheless, the predictive power of our risk model was no less than those of previous risk models which did include family history of diabetes. To examine the impact of including family history of diabetes, we repeated the analyses in two major companies where this information was available (n = 31,202) and confirmed that AROCs for the non-invasive model were little improved after incorporating that information (from 0.724 to 0.736). Given that adults with family history of diabetes tend to have higher BMI, waist circumference, blood pressure, and FPG level than those without [23][24][25][26], a lack of family history information can be largely compensated by incorporating the above-mentioned data in the prediction model.
In the present study, the predictive power was enhanced by adding either FPG or HbA1c to the non-invasive model and was further improved by adding both, a finding compatible with previous studies [10,11,15,18,27,28]. The degree of improvement was similar between the model including FPG and the model including HbA1c, though the difference was significant. In TOPICS 6, the AROCs for the non-invasive models were increased to 0.836, 0.837, and 0.887 after FPG, HbA1c, and both FPG and HbA1c were added, respectively [11], and the corresponding values were 0.867, 0.886, and 0.893 in the European Prospective Investigation into Cancer and Nutrition-Potsdam study conducted among 1962 men and women aged 35-65 years [28]. The AROCs for the model including FPG were 0.772 in the Hisayama Study [10], 0.756 in a Thai cohort of 2677 participants aged 35-55 years [27], 0.784 in the PREDIMED study among 1381 participants aged 55-80 years [18], and 0.848 in a Taiwanese study among 36,972 participants aged 35-74 years [15]. The predictive ability of the present invasive model (AROC: 0.843 for FPG model, 0.827 for HbA1c model, and 0.893 for FPG and HbA1c model) was comparable to or higher than values in these previous studies. Further, the current model including both FPG and HbA1c can identify individuals at very high risk (40-50%) of developing type 2 diabetes within 3 years (Table 5) and thus can serve as a tool for risk stratification.
Strengths of the present study include a large number of participants from several companies, making our estimates highly stable (narrow confidence interval). In addition, we confirmed the performance of the risk score in a validation cohort. However, our study also has some limitations. First, we relied on HbA1c and fasting and casual glucose and not the oral glucose tolerance test to define incident type 2 diabetes. However, performing this test in a large sample is not feasible. HbA1c does not require fasting and reflects long-term glycemic status. In addition, the International Expert Committee has recommended using HbA1c to diagnose diabetes [29]. Second, there were some differences in the characteristics between participants included and excluded from the present analysis. We thus could not rule out the possibility of selection bias as a result of these exclusions. Third, when we assessed the goodness-of-fit using the Hosmer-Lemeshow test, the P value for the non-invasive model was statistically significant, showing poor calibration. However, Hosmer-Lemeshow χ² values <20 are considered indicative of good calibration [30] (χ² = 19.3 and P value = 0.01 for the non-invasive model in the present study). Fourth, the follow-up period was 3 years, and thus the risk score developed may only be applicable to short-term prediction of diabetes. Nevertheless, the present score may serve for allocating resources to individuals who are at high risk of developing diabetes in the near future. Fifth, the J-ECOH Study is a multi-company study using existing data, and thus survey questions regarding lifestyle, history of disease, and drug use as well as the procedures for obtaining anthropometric and biochemical measurements were not uniform. However, all laboratories that performed biochemical analyses for the participating companies have participated in one or more external quality control programs and received the highest rank of evaluation.
Finally, most study subjects were workers in large companies; therefore, the present finding might not be applicable to workers in small- and medium-sized companies, those in other large companies with markedly different backgrounds, or members of the non-working population. Diabetes risk is probably higher in the general population, which includes vulnerable people, than in the working population, and thus the application of the present score to the general population would lead to underestimation of diabetes risk. However, the extent of underestimation, if any, may be small, given that a similar age-specific prevalence has been reported in a nationally representative sample [31].
In conclusion, we developed a simple risk model using age, sex, BMI, waist circumference, hypertension, and smoking status to predict 3-year incidence of type 2 diabetes in a large-scale multi-center study among Japanese workers. The predictive ability was reasonably high and well reproduced. Since this risk model uses non-invasive information, it may be useful in working populations, particularly among relatively young individuals who have less opportunity to undergo an examination of blood glucose levels than older individuals. Further, the risk model with FPG and HbA1c showed excellent predictive ability and thus can contribute to risk stratification, facilitating the development of efficient prevention programs against type 2 diabetes.