Development and validation of quality of life instruments for chronic diseases—Chronic gastritis version 2 (QLICD-CG V2.0)

Quality of life is an important outcome indicator for evaluating whether treatment is successful. Chronic gastritis leads to ongoing deterioration of subjectively perceived quality of life. Several generic measures exist, but they were not developed specifically to assess the problems of chronic gastritis. The Quality of Life Instruments for Chronic Diseases-Chronic Gastritis (QLICD-CG V2.0) questionnaire is a 39-item, multi-dimensional, self-report instrument that assesses chronic gastritis patients' perception of their health-related quality of life in four domains. The instrument was developed in China. The current study aimed to evaluate the psychometric properties of the QLICD-CG V2.0. A total of 194 patients with chronic gastritis were enrolled from 4 hospitals in China. The QLICD-CG V2.0 was administered to patients by trained research assistants, and demographic characteristics were recorded. Psychometric testing covered construct validity, convergent validity, discriminant validity, test-retest reliability, and responsiveness. The results showed good internal consistency (Cronbach's alpha ranging from 0.80 to 0.93) and acceptable floor and ceiling effects. CFA showed that the instrument structure had a reasonable fit (RMSEA = 0.063, 95% CI = [0.057, 0.079], CFI = 0.93, GFI = 0.95, SRMR = 0.028). Item-level convergent validity was considered appropriate: 38 of the 39 items correlated more strongly with their assigned scale than with a competing scale, the exception being GPS1. Known-groups comparisons showed that the QLICD-CG V2.0 discriminated well between subgroups defined by gender, marital status, and economic status, providing evidence of discriminative validity. Convergent validity testing against the SF-36 showed that QLICD-CG V2.0 domain scores correlated significantly with SF-36 dimension scores, with coefficients ranging from 0.21 to 0.58. Test-retest coefficients were satisfactory: a majority of intraclass correlation coefficients were above 0.70, except for the psychological domain (0.60) and the social support/security items (0.61), indicating that scores were reproducible. Responsiveness was tested on 157 patients. Significant differences between baseline and post-treatment responses were found on all QLICD-CG V2.0 domains, except for the appetite and sleep items, indicating robust sensitivity to change. The QLICD-CG V2.0 appears to be a valid and reliable instrument for measuring QOL in chronic gastritis patients.

Introduction
Chronic gastritis has received increasing attention in medical practice. It is a long-term inflammation of the gastric mucosa that can significantly impair patients' quality of life (QOL) [1]. Although progress in gastritis treatment has been remarkable, chronic gastritis still causes difficulties in patients' everyday lives, leading to ongoing deterioration of QOL [2]. Accurate assessment of patients' subjective experience is critical to determining the efficacy of treatment.
Health-related quality of life (HRQOL) reflects patients' feelings and functioning and the impact of their health condition beyond simple symptom assessment [3]. Some generic HRQOL measures have been developed and widely used across a range of diseases, such as the 36-item Short Form Health Survey (SF-36) [4] and the WHO Quality of Life-BREF (WHOQOL-BREF) [5]. Most studies of quality of life in chronic gastritis patients rely on such generic measures. These measures emphasize overall life satisfaction, social functioning, and general health perceptions, and they focus on general symptoms or function, e.g., pain. However, chronic gastritis patients also experience specific symptoms, such as bloating, heartburn, belching, or nausea [6][7][8], which contribute to the deterioration of their QOL. Generic measures do not capture the impact of all these symptoms and may not fully evaluate the range of QOL issues that these patients experience. Some disease-oriented questionnaires exist, such as the Gastrointestinal Quality of Life Index (GLQI) [9] and the EORTC QLQ-STO22 [10], but neither is a disease-specific measure for chronic gastritis. Disease-specific questionnaires are more efficient than generic questionnaires [11]. Thus, a more specific HRQOL measure, developed particularly to assess the problems of chronic gastritis, would be useful for assessing HRQOL and for evaluating whether treatment is successful.
The purpose of the current study was to evaluate the psychometric properties of a new QOL instrument for chronic gastritis patients, the Quality of Life Instruments for Chronic Diseases-Chronic Gastritis (QLICD-CG V2.0).

Materials and methods
The development of the QLICD-CG V2.0 followed a standardized procedure consisting of item development, pilot testing, and psychometric validation [12,13]. The ethics committee of Guangdong Medical University approved this study. Written informed consent was obtained from all participants prior to survey participation.

Item development debriefing
This study focused on the development of the specific module across several domains for chronic gastritis patients. The QLICD-CG V2.0 is a self-report measure with a total of 39 items, covering a general module (QLICD-GM, comprising three domains: 9 items in the physical domain, 11 items in the psychological domain, and 8 items in the social domain) and a specific module (11 items, comprising three disease-specific domains: epigastric pain, satiety, and psychological impact of chronic gastritis). Each item of the QLICD-CG V2.0 is scored on a 5-point Likert scale (possible score range: 1 to 5, from 1, no problem, to 5, extreme problem). The possible score range of the QLICD-CG V2.0 is 39-195 (28-140 for the general module and 11-55 for the specific module). In the QLICD-CG V2.0, higher scores represent better QOL. It takes about 20 minutes to complete the questionnaire.
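The score ranges quoted above follow directly from the item counts and the 1 to 5 response scale. The short Python sketch below reproduces that arithmetic; the dictionary keys and the helper name `raw_range` are illustrative choices, not part of the instrument, and the sketch covers raw sums only, not any rescaling or reverse-scoring the instrument may apply.

```python
# Item counts per part of the questionnaire, as stated in the text.
ITEMS = {"physical": 9, "psychological": 11, "social": 8, "specific_module": 11}
LIKERT_MIN, LIKERT_MAX = 1, 5  # 5-point Likert scale


def raw_range(n_items):
    """Return (minimum, maximum) raw sum score for n_items Likert items."""
    return n_items * LIKERT_MIN, n_items * LIKERT_MAX


general_items = ITEMS["physical"] + ITEMS["psychological"] + ITEMS["social"]
total_items = general_items + ITEMS["specific_module"]

general_range = raw_range(general_items)              # general module
specific_range = raw_range(ITEMS["specific_module"])  # specific module
total_range = raw_range(total_items)                  # whole questionnaire

print(general_range, specific_range, total_range)
```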
Items were generated through a multi-step process: physician consensus panels, semi-structured patient interviews, and several revisions made in response to patients' data and feedback. First, a pool of 17 candidate items was generated to reflect the construct of the specific module. Second, several semi-structured interviews focusing on the impact of the disease on QOL were conducted. The content derived from these interviews was examined together with a review of the relevant literature and discussed with 16 experts, including physicians and researchers in the clinical and psychometric fields. Third, a preliminary paper-and-pencil version of the QLICD-CG V2.0 was administered to 30 patients, and semi-structured interviews were performed to assess patients' interpretations of the questions. As shown in Table 1, data were gathered on the demographic and clinical characteristics of patients in the preliminary test.
Fourth, after piloting, a formal 11-item specific module of the QLICD-CG V2.0 was produced. A total of 174 chronic gastritis patients from four hospitals in China were enrolled in the formal test. The QLICD-CG V2.0 and QLICD-GM, together with a few questions on demographic and clinical features, were administered to patients by trained research assistants. All participants also completed the SF-36 at the same time.
The investigators described the study to the participants and obtained informed consent from those who agreed to participate and met the inclusion criteria. The inclusion criteria were age 18 years or older and capacity to consent [14]. Chronic gastritis was diagnosed by the physician, primarily through endoscopy and gastric biopsy, and classified according to the criteria proposed by the Chinese Society of Digestive Endoscopy [15]. To avoid biasing responses, the questionnaires were completed after a clinical examination confirming that the patients were in a stable phase and before any medical procedures. A stable phase was defined as the patient reporting no life events and no health changes.

Study design and population
This study was conducted at the Affiliated Hospital of Guangdong Medical University from July 2015 to May 2016. Outpatients with clinical symptoms of chronic gastritis were recruited as subjects. A total of 194 subjects were asked to sign informed consent, complete the paper-and-pencil questionnaires, and undergo gastroscopy to examine the condition of the gastric mucosa. All items were reported to be well understood. On the basis of the questionnaire results, 20 patients were excluded because they had not responded to more than 50% of the items, leaving 174 patients. According to the gastroscopic findings, the 174 subjects were divided into six groups by clinical subtype: superficial gastritis, superficial gastritis with erosion, flattened erosive gastritis, bile reflux gastritis, complex gastritis, and missing. The missing data rate for each item varied from 3.14% to 5.68%. Missing responses were imputed using the mean of the non-missing items. The inclusion criteria were as follows: first, the patient had been diagnosed with chronic gastritis; second, the patient had no medical history of tumor.
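The exclusion rule and the mean imputation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say whether the imputed value is the mean of the respondent's own answered items or of some other set, so the sketch assumes the respondent's own answered items; the function names and the use of `None` for a missing response are hypothetical.

```python
def keep_respondent(responses, max_missing_fraction=0.5):
    """Keep a respondent unless more than half of the items are unanswered."""
    missing = sum(r is None for r in responses)
    return missing / len(responses) <= max_missing_fraction


def impute_with_person_mean(responses):
    """Replace missing answers (None) with the mean of the answered items.

    Assumption: the mean is taken over the same respondent's non-missing
    items. Call this only for respondents that passed keep_respondent,
    so at least one answered item exists.
    """
    answered = [r for r in responses if r is not None]
    person_mean = sum(answered) / len(answered)
    return [person_mean if r is None else r for r in responses]


# Example: one respondent with a single missing item out of three.
example = [4, None, 2]
if keep_respondent(example):
    print(impute_with_person_mean(example))
```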

Clinical and demographic characteristics
As shown in Table 2, data were gathered on the demographic and clinical characteristics of patients in the formal test. The types of gastritis (superficial gastritis, atrophic gastritis, gastric atrophy, nonerosive gastritis and non-specific gastritis) were recorded on the basis of biopsy examination and gastroscopy.

Statistical analysis
Data processing and statistical analyses were performed using SPSS 18 and Mplus 7. Quantitative variables were expressed as means ± standard deviations. The significance level was set at p < 0.05. Reliability measures were of two types: 1. Internal consistency reliability was assessed using Cronbach's alpha coefficient calculated for each scale (a value ≥ 0.70 supported internal consistency reliability) [16]; 2. Test-retest reliability was assessed using the intraclass correlation coefficient (ICC). To quantify the reproducibility of scores, the QLICD-CG V2.0 was administered twice, one week apart, to 101 patients whose clinical condition was stable (defined as the patient reporting no life events and no health changes). Responsiveness was tested using the standardized effect size in 157 patients who received antibacterial and antiulcer treatment, through which their health status changed. The QLICD-CG V2.0 was administered a second time to these 157 patients within 6 months of the baseline visit to determine whether the instrument was sensitive to these changes in QOL.
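For readers unfamiliar with the internal consistency statistic, the sketch below computes Cronbach's alpha with the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the sum score), where k is the number of items in the scale. It is a minimal illustration in plain Python, not the SPSS procedure used in the study; the function name and the toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for one scale.

    rows: list of respondents, each a list of k item scores
          (complete data, no missing values).
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' sum scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (made up): four respondents answering a three-item scale.
toy = [[1, 2, 2], [2, 2, 3], [4, 3, 4], [5, 5, 4]]
alpha = cronbach_alpha(toy)
print(alpha >= 0.70)  # threshold used in the text
```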
Convergent and discriminant validity were examined by comparing item-dimension correlations. Convergent validity was assessed by correlating each item with the scale it was hypothesized to belong to (a correlation r ≥ 0.4 supported item internal consistency). Discriminant validity was supported whenever the correlation between an item and its hypothesized scale was higher than its correlations with the other scales. Floor and ceiling effects were assessed from the distribution of scores. A confirmatory factor analysis (CFA) was performed using structural equation modeling. The following indexes were

Score distribution
The QLICD-CG V2.0 total mean score was 127.29±27.20, with a range of 59.25-183.85. The general module mean score was 65.56±13.72, with a range of 27.68-92.86. The specific module mean score was 61.73±16.64, with a range of 31.57-90.99. As shown in Fig 1, the total score and module scores showed no floor or ceiling effect, as none of the patients obtained the minimum or the maximum score.

Construct validity
As shown in Table 3, Cronbach's alpha for the QLICD-CG V2.0 domains ranged from 0.80 to 0.93, indicating good internal consistency on each domain [19]. Construct validity was also investigated via known-groups comparisons for discriminative validity. Table 4 shows Pearson correlation coefficients between items and the QLICD-CG V2.0 domains. For 38 of the 39 items, the largest correlation coefficient was with the assigned domain; the exception was GPS1, which correlated at r = 0.60 with the physical domain and r = 0.50 with the psychological domain (difference = 0.10). We nevertheless kept GPS1 in the psychological domain, which seemed more reasonable, even though its loading was lower in the psychological domain than in the physical domain. Normally, a minimum dimension loading of 0.50 is recommended [20]. However, four items demonstrated low loadings on their assigned dimension (range 0.34-0.50; GPH3-Do you feel treatment cause sexual problem; GPS1-Can you focus on what you are doing; GPS3-Do you think life is fun; CG11-Do you feel annoyed for the restricted diet due to illness). Apart from GPS1, the other three items (GPH3, GPS3, CG11) did not cross-load on other dimensions (cross-loading being defined as loadings of ≥0.40 on more than one factor). These four items were discussed and judged to be important to chronic gastritis patients. Therefore, we retained them in the QLICD-CG V2.0.
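The item-level check reported in Table 4 can be illustrated with a small Python sketch: each item is correlated with every domain score, and the item passes if its correlation with the assigned domain is at least 0.4 (convergent criterion) and higher than its correlations with all other domains (discriminant criterion). The function names and the toy numbers are hypothetical, and the sketch assumes uncorrected item-domain correlations, since the text does not state whether a correction for overlap was applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def check_item(item, domain_scores, assigned, threshold=0.4):
    """Return (convergent_ok, discriminant_ok) for a single item.

    item:          item scores, one per respondent
    domain_scores: dict mapping domain name -> domain scores per respondent
    assigned:      name of the domain the item is hypothesized to belong to
    """
    r = {name: pearson(item, scores) for name, scores in domain_scores.items()}
    convergent_ok = r[assigned] >= threshold
    discriminant_ok = all(
        r[assigned] > value for name, value in r.items() if name != assigned
    )
    return convergent_ok, discriminant_ok


# Toy data (made up): five respondents, two domains.
item = [1, 2, 3, 4, 5]
domains = {"physical": [2, 4, 6, 8, 10], "psychological": [3, 1, 4, 1, 5]}
print(check_item(item, domains, assigned="physical"))
```

Under this rule an item such as GPS1, whose correlation is higher with a competing domain than with its assigned one, would meet the convergent criterion but fail the discriminant one.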
As shown in Table 5, the discriminant validity of the QLICD-CG V2.0 was assessed by comparing its domain scores across patient groups with different demographic and clinical characteristics. Student's t tests and one-way ANOVA were used to compare the mean domain scores. Females reported significantly lower scores in the psychological domain. Single patients reported significantly lower scores in the psychological domain, general module, and total score. Patients with lower income reported significantly lower scores in all domains except the physical domain. Table 6 shows Pearson's correlation coefficients between the domain scores of the QLICD-CG V2.0 and the SF-36. The results indicated positive correlations. Convergent validity was considered good when correlations between corresponding domains were higher than those between non-corresponding domains.

Test-retest reliability and responsiveness
Test-retest reliability was evaluated with the ICC. The QLICD-CG V2.0 was administered to 101 patients at baseline and one week after the baseline visit. As shown in Table 7, a majority of ICCs were above .70, except for the psychological domain (.60) and the social support/security items (.61). Responsiveness was assessed using the standardized effect size in 157 patients who received treatment within 6 months of the baseline visit. Standardized effect sizes were calculated using the formula: Effect size = (baseline QOL score − QOL score after treatment) / standard deviation of the change in QOL scores. The mean time between baseline and post-treatment assessments was 130.82 ± 28.65 days. Significant differences were found between baseline and post-treatment responses, except for the appetite and sleep items.
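The effect size formula above can be written as a short Python function. This is a sketch under two assumptions not stated in the text: the numerator is the mean of the individual baseline-minus-post-treatment differences, and the denominator is the sample standard deviation (n − 1) of those differences. The function name and the toy scores are hypothetical.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, after):
    """(baseline - post-treatment) divided by the SD of the change scores.

    baseline, after: paired QOL scores for the same patients, same order.
    """
    changes = [b - a for b, a in zip(baseline, after)]
    return mean(changes) / stdev(changes)


# Toy data (made up): four patients whose scores rise after treatment.
baseline_scores = [50, 60, 70, 80]
post_scores = [55, 68, 73, 90]
es = standardized_effect_size(baseline_scores, post_scores)
print(round(es, 2))
```

With the subtraction order given in the text, an improvement in QOL (higher scores after treatment) yields a negative value, so the magnitude rather than the sign indicates the size of the change.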

Discussion
Chronic gastritis may damage the stomach over many years, affecting patients' health and subjectively perceived quality of life. Patients with chronic gastritis face particular challenges, including physical impairment as well as changes in the psychological and social domains [21]. These challenges are reflected in the QLICD-CG V2.0 but are not captured by other instruments [9,10]. The impact of disease-specific impairment on chronic gastritis patients has received little attention in research and clinical practice, and the QLICD-CG V2.0 provides a tool to address this gap. This validation study of the QLICD-CG V2.0 supported a 4-domain structure in a population of Chinese patients with current chronic gastritis. The QLICD-CG V2.0 showed good construct validity, convergent validity, discriminant validity, test-retest reliability, and responsiveness. Convergent validity was supported by positive correlations between the domain scores of the QLICD-CG V2.0 and the SF-36. The results also suggest that the QLICD-CG V2.0 is suitable for longitudinal studies aiming to detect meaningful change in chronic gastritis patients. As a disease-specific instrument that focuses on particular symptoms, the QLICD-CG V2.0 is sensitive to small changes in QOL. The instrument could be used to monitor response to treatment, in research studies as well as in clinical practice. The QLICD-CG V2.0 has at least two notable features. First, the disease-specific module is of particular interest: it covers disease-specific impairment in addition to the physical, psychological, and social domains, an area that generic QOL instruments have left unexplored.
The development process of the QLICD-CG V2.0 could contribute to the recognition of the importance of the patient perspective in treatment and outcome assessment. Thus, the QLICD-CG V2.0 may add important value to patient recovery.
Second, the QLICD-CG V2.0 was validated on a broad group of chronic gastritis patients, which included superficial gastritis, atrophic gastritis, gastric atrophy, nonerosive gastritis and non-specific gastritis. This helps the instrument capture a wide range of aspects of chronic gastritis patients' QOL. Furthermore, the QLICD-CG V2.0 takes only about 20 minutes to complete, which makes it compatible with clinical practice.
The present study has some limitations. First, we did not test long-term responsiveness. It is important to investigate whether the QLICD-CG V2.0 can detect changes in the long run. Second, the sample may not be sufficiently representative. Our study enrolled patients in hospital, so the findings might not generalize to patients who do not attend hospital. Moreover, we were unable to discriminate between patients with different clinical subtypes, which might be due to the small sample size: there were 4 patients with bile reflux gastritis and only one patient with complex gastritis. Third, chronic gastritis has several known causes, and a specific cause is difficult to identify. It is therefore hard to explain the differences between demographic groups, and the generalizability of the findings is uncertain. Further validation of the QLICD-CG V2.0 in a larger sample is needed to yield more accurate results, and future research should explore possible explanations for the differences between demographic groups.

Conclusion
The QLICD-CG V2.0 is an instrument for assessing QOL among patients with chronic gastritis, and it shows good psychometric properties. To our knowledge, the QLICD-CG V2.0 is currently the only QOL instrument specific to chronic gastritis patients. It can be used in research studies as well as in clinical practice. However, its psychometric properties should be further examined and confirmed in other independent samples.