Before organizing mixed-mode data collection for the self-administered questionnaire of the Belgian Health Interview Survey, measurement effects between the paper-and-pencil and the web-based questionnaire were evaluated. A two-period cross-over study was organized with a sample of 149 employees of two Belgian research institutes (age range 22–62 years, 72% female). Measurement agreement was assessed for a diverse range of health indicators related to general health, mental and psychosocial health, health behaviors and prevention with kappa coefficients and intraclass correlation (ICC). The quality of the data collected by both modes was evaluated by quantifying the missing, ‘don’t know’ and inconsistent values and data-entry mistakes. Good to very good agreement was found for all categorical indicators, with kappa coefficients above 0.60, except for two mental and psychosocial health indicators, namely the presence of a sleeping disorder and of a depressive disorder (kappa ≥ 0.50). For the continuous indicators, acceptable to high agreement was observed, with ICC above 0.70. Inconsistent answers and data-entry mistakes occurred only in the paper-and-pencil mode. There were no fewer missing values in the web-based mode than in the paper-and-pencil mode. The study supports the idea that web-based modes provide, in general, responses equivalent to those of paper-and-pencil modes. However, health indicators based upon factual and objective items tend to have higher measurement agreement than indicators requiring an assessment of personal subjective feelings. A web-based mode greatly facilitates the data-entry process and guides the completion of a questionnaire. However, item non-response was not positively affected.
Citation: Braekman E, Berete F, Charafeddine R, Demarest S, Drieskens S, Gisle L, et al. (2018) Measurement agreement of the self-administered questionnaire of the Belgian Health Interview Survey: Paper-and-pencil versus web-based mode. PLoS ONE 13(5): e0197434. https://doi.org/10.1371/journal.pone.0197434
Editor: Arsham Alamian, East Tennessee State University, UNITED STATES
Received: December 18, 2017; Accepted: May 2, 2018; Published: May 21, 2018
Copyright: © 2018 Braekman et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the Supporting Information files.
Funding: Funded by Federaal Wetenschapsbeleid - Belspo https://www.belspo.be/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Population surveys have traditionally used paper-and-pencil self-administered questionnaires to collect information on sensitive questions. However, with the growth of internet use, web-based questionnaires have become an important alternative to paper-and-pencil questionnaires due to their many advantages [1;2]. For instance, the process of manual data-entry with its accompanying data-entry mistakes becomes unnecessary [3;4]. In addition, web-based questionnaires can produce higher data quality since automatic skipping and branching logic and warning messages in case of missing and implausible answers can be built in [3;4].
Web-based questionnaires cannot, however, be the sole mode of data collection for population surveys, as even in countries with high internet penetration, internet access and skills vary among demographic groups [5;6]. To overcome this limitation, mixed-mode data collection including a web-based and paper-and-pencil mode can be used. Mixing different modes in one survey can lead to mode effects by simultaneously creating selection and measurement effects. Selection effects can occur when respondents with different characteristics choose a different mode to complete the questionnaire. Measurement effects can occur if the mode influences how respondents understand the question, retrieve relevant information, make a judgment about the adequate response and finally choose the answer [8;9]. For instance, a web-based mode offers a greater opportunity to multitask since respondents are more likely to be engaged in several other activities while completing the questionnaire [10;11]. This might lead to “satisficing” behavior: respondents simply provide a satisfactory answer (e.g. answering don’t know or skipping the question) because an optimal response requires a substantial amount of cognitive effort [12;13]. In addition, a web-based mode may limit the ability of the respondents to re-read the questions at their own pace, in their preferred order and to synchronize the answers [14;15]. Furthermore, a web-based mode can generate more honest responses since respondents can be transported into another virtual world wherein they forget their immediate surroundings. In this way, it can create an illusion of privacy.
Mode effects have implications for the comparability of the data collected by different modes. Recent meta-analyses and review studies of the comparability of electronic and paper-and-pencil modes generally found evidence for the equivalence across the modes [14;17;18]. However, other studies found differences in the reporting of general health, mental health [19;20] and sensitive health behaviors [21;22]. In a mixed-mode design, it is not possible to disentangle selection effects from measurement effects. That is why, in the context of future mixed-mode data collection for the self-administered questionnaire of the Belgian Health Interview Survey (BHIS), a study with a repeated measures design was organized to test for measurement effects. More specifically, the aim of this study was to assess the measurement agreement between the newly developed web-based and the paper-and-pencil mode for several health indicators and to ascertain the extent to which the quality of the collected data varied between these modes.
Research design and study population
A two-period cross-over design was used, in which respondents completed the questionnaire in both modes with a certain time interval in between. Respondents were recruited on a voluntary basis from a pool of 730 employees of two Belgian research institutes. The research protocol was submitted to the directors of the participating institutes for approval. No ethics committee was involved as this was an internal pilot study. The employees were informed about the objectives of the study in an e-mail before giving their written consent for participation. No benefits or risks were derived from participating in this study. The answers of the participants were kept anonymous as each participant had a unique ID code and the link between the name and the ID code was not accessible to the researchers. This link was deleted after the end of the data collection. In total, 195 employees volunteered to participate. Half of the respondents were first assigned to the paper-and-pencil mode (paper first group) and the other half to the web-based mode (web first group). After two weeks the groups were switched: the paper first group received the web-based mode and vice versa. Only respondents who completed the questionnaire in both modes were included in the final sample of 149 respondents. In the end, a response rate of 20.4% (149/730) was achieved. The median number of days between completing the questionnaire in the two modes was 14 days (minimum 2 and maximum 40 days) (Fig 1).
The questionnaire was based on the self-administered paper-and-pencil questionnaire of the BHIS 2013 and could be completed in French or Dutch. The web-based questionnaire was developed to be as comparable as possible to the paper-and-pencil mode. Therefore, the questions were identical (similar wording and almost similar instructions) and the design was comparable (similar colors and lay-out). Still, the web-based mode was developed while applying the embedded features of this mode such as automatic skipping and branching. Furthermore, soft warnings were given in case of missing values for the first question of every module and for filter questions and in case respondents gave inconsistent or implausible answers. In addition, the web-based mode had a multipage design displaying only a few questions on every screen, which differs from the paper-and-pencil questionnaire that allows a comprehensive view of the whole questionnaire. Web respondents were, however, able to go back in the questionnaire to change answers given to previous questions. After completing the last questionnaire, respondents were asked if they had experienced a health change during the washout period.
The web-based questionnaire, developed using BlaiseIS 4.8 software, could be completed using a computer but not using a tablet or smartphone. Data from the web-based questionnaire were automatically saved in a database. Data collected with the paper-and-pencil questionnaire were entered manually using a program also developed with Blaise® software. A double data-entry was done in order to correct for data-entry mistakes. Table 1 provides an overview of the indicators selected to assess the measurement agreement. These indicators are organized in 4 topics: general health, mental and psychosocial health, health behaviors, and prevention.
Statistical analyses were performed using the statistical package SAS® 9.3. The significance level for all the analyses was set at 5%, with corresponding 95% confidence intervals (CI).
For categorical indicators, kappa coefficients were estimated. Simple kappa coefficients were calculated for binary and nominal indicators whereas linear weighted kappa coefficients were calculated for ordinal indicators. Weighted kappa coefficients take into account the greater disagreement between response categories that are further apart than for those that are closer together on an ordinal scale [27;28]. Linear weights were defined as wi = 1 - i/(c - 1), where i is the difference between the response categories in the web-based mode and paper-and-pencil mode and c is the total number of categories of the indicator. For the interpretation, we followed the cutoffs proposed by Landis & Koch: <0.00 = poor, 0.00–0.20 = slight, 0.21–0.40 = fair, 0.41–0.60 = moderate, 0.61–0.80 = good, 0.81–1.00 = very good agreement. In addition, percentages of exact (for binary, nominal and ordinal categorical indicators) and global agreement (only for ordinal categorical indicators) were calculated. Exact agreement was estimated as the percentage of respondents who have the same category in both modes. Global agreement was calculated as the percentage of responses that fell within one category in the positive and negative direction. The percentages of agreement depend on the number of categories; they are expected to be higher for indicators with only a few categories.
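The agreement measures described above can be illustrated with a minimal Python sketch. It is not the SAS code used in the study: the function names, the coding of categories as 0 to c-1, and the toy answers are all assumptions made for illustration. The weighted variant applies the linear weight 1 - i/(c - 1), where i is the absolute difference between the two categories.

```python
from collections import Counter


def kappa(web, paper, c, weighted=False):
    """Cohen's kappa for paired answers coded 0..c-1.

    weighted=False: simple kappa (only identical categories count as agreement).
    weighted=True: linear weighted kappa with weights 1 - |i - j| / (c - 1).
    """
    n = len(web)

    def weight(i, j):
        if weighted:
            return 1.0 - abs(i - j) / (c - 1)
        return 1.0 if i == j else 0.0

    pairs = Counter(zip(web, paper))   # observed cell counts
    row = Counter(web)                 # marginal counts, web-based mode
    col = Counter(paper)               # marginal counts, paper-and-pencil mode

    p_obs = sum(weight(i, j) * cnt / n for (i, j), cnt in pairs.items())
    p_exp = sum(weight(i, j) * row[i] * col[j] / n ** 2
                for i in range(c) for j in range(c))
    return (p_obs - p_exp) / (1.0 - p_exp)


def exact_agreement(web, paper):
    """Percentage of respondents with the same category in both modes."""
    return 100.0 * sum(a == b for a, b in zip(web, paper)) / len(web)


def global_agreement(web, paper):
    """Percentage of respondents whose two answers differ by at most one category."""
    return 100.0 * sum(abs(a - b) <= 1 for a, b in zip(web, paper)) / len(web)


# Toy data (invented): ten respondents, one ordinal indicator with 4 categories.
web_answers = [0, 0, 1, 1, 2, 2, 3, 3, 1, 2]
paper_answers = [0, 1, 1, 1, 2, 3, 3, 3, 1, 0]

print("simple kappa:", round(kappa(web_answers, paper_answers, 4), 3))
print("linear weighted kappa:", round(kappa(web_answers, paper_answers, 4, weighted=True), 3))
print("exact agreement (%):", exact_agreement(web_answers, paper_answers))
print("global agreement (%):", global_agreement(web_answers, paper_answers))
```

In this toy example the weighted kappa is higher than the simple kappa because most disagreements are only one category apart, which the linear weights treat as partial agreement.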
Measurement agreement for continuous indicators was assessed using the intraclass correlation coefficient (ICC). The ICC measures the correlation between a single rating on a continuous measure using the web-based mode and a continuous measure using the paper-and-pencil mode. A score above 0.80 is usually sought in mode comparison, with 0.70 considered as an acceptable value. ICC is based on mean-centered versions of the indicators and is insensitive to respondents’ tendency to provide consistently higher responses in one mode compared to the other [4;31]. For this reason, Wilcoxon signed rank tests were calculated to detect the presence of differences between both modes.
Kappa and ICC coefficients were calculated overall and by order group (web first or paper first group). In this paper the overall kappa and ICC coefficients are presented. However, in case of a difference between the order groups, the coefficients by order group are mentioned. Further, the kappa and ICC coefficients were calculated with and without respondents who said they experienced a health change (n = 11), but since this had almost no effect, it was decided to use the sample including all respondents.
The quality of the data was assessed by evaluating the missing, ‘don’t know’ and inconsistent values. The latter was defined as an answer that should not have been given according to the skipping and branching logic or as an answer that was inconsistent with other answers. ‘Don’t know’ is a non-substantive answer since it can be seen as a way of refusing to answer a question. The quantification of the values was done by counting the total number of these values separately for both modes of data collection. Furthermore, the mean number of missing, ‘don’t know’ and inconsistent values per questionnaire was calculated for both modes and the differences between the modes were evaluated by performing a Wilcoxon signed rank test.
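To make the point about mean-centering concrete, the following Python sketch computes a single-measure consistency ICC from a two-way decomposition of variance. The text does not state which ICC form was used, so the choice of the consistency form (often labelled ICC(3,1) after Shrout & Fleiss) is an assumption, as are the function name and the toy data. The sketch shows that adding a constant to every answer in one mode leaves this coefficient unchanged, which is why a separate test for systematic differences is needed.

```python
def icc_consistency(web, paper):
    """Single-measure consistency ICC for two modes (assumed form: ICC(3,1)).

    ICC = (BMS - EMS) / (BMS + (k - 1) * EMS), with
    BMS = between-respondent mean square, EMS = residual mean square, k = 2 modes.
    """
    n, k = len(web), 2
    rows = list(zip(web, paper))
    grand = (sum(web) + sum(paper)) / (n * k)

    # Sum of squares between respondents (row means around the grand mean).
    ss_subjects = k * sum((sum(r) / k - grand) ** 2 for r in rows)
    # Sum of squares between modes (column means around the grand mean).
    mode_means = [sum(web) / n, sum(paper) / n]
    ss_modes = n * sum((m - grand) ** 2 for m in mode_means)
    # Residual sum of squares is what remains of the total.
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_subjects - ss_modes

    bms = ss_subjects / (n - 1)
    ems = ss_error / ((n - 1) * (k - 1))
    return (bms - ems) / (bms + (k - 1) * ems)


# Toy data (invented): a continuous indicator reported by six respondents.
web_scores = [3, 5, 8, 10, 12, 7]
paper_scores = [4, 5, 7, 11, 12, 6]

# Same paper answers, but every respondent reports 2 units more.
paper_shifted = [x + 2 for x in paper_scores]

print("ICC:", round(icc_consistency(web_scores, paper_scores), 3))
print("ICC after constant shift:", round(icc_consistency(web_scores, paper_shifted), 3))
```

Both printed values are identical: the systematic shift is absorbed by the between-mode sum of squares and does not reach the residual term.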
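The counting step can be sketched as follows. The codes used for missing and ‘don’t know’ answers, the helper name `summarize`, and the toy questionnaires are hypothetical, and the denominator for the percentage (all items in all questionnaires) is an assumption, since the text does not specify it. The per-questionnaire counts are the paired values that a signed rank test would compare between modes.

```python
MISSING = None       # hypothetical code for an unanswered item
DONT_KNOW = "DK"     # hypothetical code for a 'don't know' answer


def summarize(questionnaires, is_flagged):
    """Count flagged answers in total and per questionnaire for one mode."""
    per_questionnaire = [sum(1 for answer in q if is_flagged(answer))
                         for q in questionnaires]
    total = sum(per_questionnaire)
    n_items = sum(len(q) for q in questionnaires)
    return {
        "total": total,
        "percent": 100.0 * total / n_items,             # assumed denominator: all items
        "mean_per_questionnaire": total / len(questionnaires),
        "per_questionnaire": per_questionnaire,         # paired input for a signed rank test
    }


# Toy data (invented): two respondents, four items, both modes.
web = [[1, None, "DK", 2], [None, None, 3, 1]]
paper = [[1, 2, "DK", 2], [None, "DK", 3, 1]]

missing_web = summarize(web, lambda a: a is MISSING)
missing_paper = summarize(paper, lambda a: a is MISSING)
dk_web = summarize(web, lambda a: a == DONT_KNOW)
dk_paper = summarize(paper, lambda a: a == DONT_KNOW)

print("missing, web:", missing_web)
print("missing, paper:", missing_paper)
print("don't know, web:", dk_web)
print("don't know, paper:", dk_paper)
```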
Additionally, paper-and-pencil surveys require manual data-entry and this may generate mistakes and hence, have a negative impact on the data quality. For this reason, a double data-entry was performed. In case inconsistencies were found, they were resolved by checking the paper-and-pencil questionnaire. The number of data-entry mistakes was assessed by counting the total number of data-entry mistakes per data encoder.
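A double data-entry check of this kind can be sketched in a few lines of Python. The data layout (one dictionary of item answers per encoder), the function names and the toy values are assumptions for illustration and do not describe the Blaise entry program used in the study. Items on which the two encoders disagree are flagged, the paper questionnaire is consulted for those items, and each encoder's mistakes are counted against the verified value.

```python
def find_discrepancies(entry_1, entry_2):
    """Return the item keys on which the two encoders entered different values."""
    return sorted(key for key in entry_1 if entry_1[key] != entry_2[key])


def mistakes_per_encoder(entry_1, entry_2, paper_values):
    """Count, per encoder, the disputed items that differ from the paper form.

    paper_values holds the value read from the paper questionnaire for every
    disputed item (the resolution step described in the text).
    """
    disputed = find_discrepancies(entry_1, entry_2)
    return {
        "encoder_1": sum(entry_1[key] != paper_values[key] for key in disputed),
        "encoder_2": sum(entry_2[key] != paper_values[key] for key in disputed),
    }


# Toy data (invented): one questionnaire with four items, entered twice.
entry_1 = {"q1": 1, "q2": 3, "q3": 2, "q4": 5}
entry_2 = {"q1": 1, "q2": 2, "q3": 2, "q4": 4}

disputed_items = find_discrepancies(entry_1, entry_2)
# Values checked on the paper form for the disputed items only.
paper_values = {"q2": 2, "q4": 5}

print("disputed items:", disputed_items)
print("mistakes:", mistakes_per_encoder(entry_1, entry_2, paper_values))
```

One limitation of this approach is that an item on which both encoders make the same mistake is never flagged, so the counts are a lower bound on the true number of entry errors.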
Characteristics of the respondents
About 72% of the respondents were female and 57% were younger than 40 years. The age range was 22 to 62 years. No gender or age differences between the order groups were detected (Table 2).
For two indicators a very good agreement was found, with a kappa coefficient of 0.92 (95% CI: 0.85–1.00) for chronic health problems and of 0.84 (95% CI: 0.69–0.99) for activity limitations (Table 3). For self-rated health there was somewhat lower but still good agreement (kappa = 0.74 (95% CI: 0.53–0.96)). About 97% of the respondents had the same response category in both modes for these indicators. The kappa coefficients calculated within each order group showed lower agreement in the web first group compared to the paper first group for self-rated health and for activity limitations. However, there was at least moderate agreement between both modes (kappa≥ 0.55).
Mental and psychosocial health.
For lifetime suicidal ideation a very good agreement was found (kappa = 0.86 (95% CI: 0.76–0.95)) (Table 3). Four other indicators showed good agreement with kappa coefficients varying between 0.61 (95% CI: 0.48–0.74) for mental distress and 0.78 (95% CI: 0.61–0.95) for the presence of an eating disorder. The presence of a depressive disorder (kappa = 0.52 (95% CI: 0.32–0.71)) and of a sleeping disorder (kappa = 0.50 (95% CI: 0.35–0.64)) exhibited only moderate agreement. 77.4% to 95.9% of the respondents had the same response category in both modes. For the ordinal categorical indicator quality of social support, all respondents reported the same response category or stayed within one response category in the positive or negative direction. The kappa coefficients calculated in each order group showed somewhat lower agreement for the presence of an eating disorder in the web first group compared to the paper first group and for the presence of a sleeping disorder and lifetime problematic alcohol consumption in the paper first group compared to the web first group. However the agreement was still at least moderate between both modes (kappa ≥ 0.57) except for the presence of a sleeping disorder (kappa = 0.36 (95% CI: 0.14–0.58)).
The continuous indicator vitality index had an ICC value of 0.79 (95% CI: 0.72–0.84) which indicates that the agreement was acceptable (Table 4). No significant difference between the two modes was observed. The ICC coefficients were similar when doing the analyses in each order group.
For all six categorical health behavior indicators very good agreement was found (Table 3). The kappa coefficients ranged between 1.00 (95% CI: 1.00–1.00) for lifetime cannabis use and 0.84 (95% CI: 0.76–0.91) for risky single occasion alcohol drinking. The percentages of exact agreement indicate that 83.9% to 100% of the respondents had the same response category in the web-based mode as in the paper-and-pencil mode. Concerning the two ordinal indicators alcohol drinking in the past 12 months and risky single occasion alcohol drinking, 100% and 98.6% of the respondents, respectively, gave the same response category or remained within one response category in the web-based and paper-and-pencil mode. When considering kappa coefficients calculated in each order group, equal results were obtained.
The ICC coefficients for the continuous indicators showed high agreement for the number of alcoholic drinks over the whole week (0.89 (95% CI: 0.83–0.93)) and for the age at starting drinking alcohol (0.91 (95% CI: 0.88–0.94)) (Table 4). No significant differences between the two modes were identified. The ICC coefficients were similar when we did the analyses for every order group.
For mammography in the past 2 years and ever being tested for HIV a very good agreement was found with kappa coefficients of, respectively, 0.95 (95% CI: 0.88–1.00) and 0.93 (95% CI: 0.87–0.99) (Table 3). For cervix smear test in the past 3 years somewhat lower but still good agreement was found (kappa = 0.80 (95% CI: 0.65–0.95)). 94.2% to 98.1% of the respondents had the same response category in both modes for the prevention indicators. The kappa coefficients indicated lower agreement for cervix smear test in the past 3 years in the web first group compared to the paper first group. However, the level of agreement was still good (kappa = 0.65 (95% CI: 0.37–0.93)).
Although the total number of missing values was low in both modes, it was higher in the web-based mode (228 (1.3%)) compared to the paper-and-pencil mode (104 (0.6%)) (Table 5). No significant differences were found in the mean number of missing values between the questionnaires in both modes. The total number of ‘don’t know’ values was somewhat higher in the paper-and-pencil mode (93 (3.2%)) compared to the web-based mode (82 (2.8%)) but no significant differences in the mean numbers were found. In the paper-and-pencil mode, there were 12 (1.3%) inconsistent values, while no such values were detected in the web-based mode because of the integrated controls and automatic skipping and branching logic. The two data encoders made 132 data-entry mistakes in total. Data encoder 1 made more mistakes (117 (0.7%)) than data encoder 2 (15 (0.1%)).
This study showed generally strong agreement between the web-based and the paper-and-pencil mode. For general health indicators good to very good agreement was observed. This is consistent with the findings of Hoebel et al., who found no differences in the prevalence rates for general health indicators between a web-based and paper-and-pencil health interview survey, and of Ritter et al., who found that respondents answered similarly in these modes for self-report general health instruments. All behavior indicators showed very good agreement. This is in agreement with the results of Vergnaud et al., who found high measurement agreement for variables related to tobacco use. Hoebel et al. also found no differences in the prevalence rates for tobacco use and alcohol consumption between these modes. For the three prevention indicators good to very good agreement was found. This is again in line with Hoebel et al., who found no differences between the web-based and paper-and-pencil mode for participation in influenza vaccination, which can be seen as a prevention indicator.
For mental and psychosocial health good to very good agreement was found for six indicators and moderate agreement was observed for two indicators, namely the presence of a sleeping disorder and of a depressive disorder. This is in line with a systematic review study that found generally high reliability between electronic and paper-and-pencil modes for psychiatric self-report instruments. The moderate agreement found for depressive and sleeping disorder could be related to the recall period of only one week of the SCL-90-R instrument. Since the washout period in this study was two weeks, it is possible that respondents experienced mood swings or sleeping variation between completing both questionnaires. The variation between health topics in measurement agreement could be due to the nature of the questions, as all indicators for which very good agreement was found are based upon factual and objective items whereas the indicators for which moderate agreement was found require assessing personal subjective feelings.
As expected, the web-based mode offered advantages regarding data quality. In the paper-and-pencil mode, respondents gave some answers that should not have been given according to the branching logic and answers that were inconsistent with other answers. Such problems did not occur in the web-based mode due to integrated controls and automatic branching and skipping logic. Furthermore, the process of manual data-entry and the accompanying mistakes were avoided. However, there were no fewer missing values in the web-based mode. On the contrary, slightly more missing values were generated, but this difference was not statistically significant. Other studies generally found fewer missing values in a web-based mode compared to a paper-and-pencil mode [3;4;18]. This difference might be explained by the fact that our respondents were allowed to skip questions. Studies that likewise did not enforce answers also found slightly more missing values in the web-based mode [35;36].
This study has some limitations. A convenience sample of the employees of two research institutes was used. These people are generally in good health, part of the working-age population, mainly highly educated and probably familiar with completing questionnaires in both modes. Consequently, it should be acknowledged that this sample excluded people who do not routinely access the internet. Due to these factors, the sample may not be representative of the general population. Nevertheless, web-based questionnaires in mixed-mode surveys are more likely to attract younger and highly educated people with internet access. This study tested measurement agreement for BHIS indicators, which are aggregated indicators based upon multiple questions/items of existing health instruments and that combine multiple response categories of questions. This might have masked potential differences between modes. A two-week washout period was used to prevent answers given the first time from being recalled and influencing the answers given the second time. However, since this study was organized during the holiday period, some variability in the washout period occurred (2–40 days). Nevertheless, other studies that tested measurement agreement reported comparable variability in washout periods [3;4;39], and a study that compared test-retest reliability of health status instruments using a two-day or two-week washout period found no time interval effect. Furthermore, respondents could indicate if they experienced a health change during the washout period, since this could have affected the agreement.
In conclusion, this study supports the idea that web-based modes provide, in general, responses equivalent to those of paper-and-pencil modes. A web-based mode greatly facilitates the data-entry process and guides the completion of a questionnaire; however, item non-response was not positively affected. Even with the limitation of having a sample with a majority of highly educated and internet-familiar people, the agreement between the two modes was substantial enough to conclude that mixed-mode data collection including a paper-and-pencil and web-based questionnaire could be undertaken without impacting the comparability of the estimates.
- 1. Ekman A, Litton JE. New times, new needs; e-epidemiology. European journal of epidemiology. 2007;22(5): 285–292. pmid:17505896
- 2. Van Gelder MM, Bretveld RW, Roeleveld N. Web-based questionnaires: the future in epidemiology? American Journal of Epidemiology. 2010;172(11): 1292–1298. pmid:20880962
- 3. Touvier M, Méjean C, Kesse-Guyot E, Pollet C, Malon Al, Castetbon K et al. Comparison between web-based and paper versions of a self-administered anthropometric questionnaire. European journal of epidemiology. 2010;25(5): 287–296. pmid:20191377
- 4. Vergnaud AC, Touvier M, Méjean C, Kesse-Guyot E, Pollet C, Malon A et al. Agreement between web-based and paper versions of a socio-demographic questionnaire in the NutriNet-Santé study. International journal of public health. 2011;56(4): 407–417. pmid:21538094
- 5. Choi NG, DiNitto DM. The digital divide among low-income homebound older adults: Internet use patterns, eHealth literacy, and attitudes toward computer/Internet use. Journal of Medical Internet Research. 2013;15(5).
- 6. De Leeuw ED, Hox JJ. Internet surveys as part of a mixed-mode design. In: Das M, Ester P, Kaczmirek L, editors. Social and behavioral research and the internet: Advances in applied methods and research strategies. New York, NY: Taylor & Francis; 2011. pp. 45–76.
- 7. Vannieuwenhuyze J, Loosveldt G, Molenberghs G. A method for evaluating mode effects in mixed-mode surveys. Public opinion quarterly. 2010;74(5): 1027–1045.
- 8. Hoebel J, von der Lippe E, Lange C, Ziese T. Mode differences in a mixed-mode health interview survey among adults. Arch Public Health. 2014;72(46).
- 9. Jäckle A, Roberts C, Lynn P. Assessing the effect of data collection mode on measurement. International Statistical Review. 2010;78(1): 3–20.
- 10. De Leeuw D. To mix or not to mix data collection modes in surveys. Journal of official statistics. 2005;21(2): 233–255.
- 11. Fang J, Prybutok V, Wen C. Shirking behavior and socially desirable responding in online surveys: A cross-cultural study comparing Chinese and American samples. Computers in human behavior. 2016;54: 310–317.
- 12. Heerwegh D, Loosveldt G. Face-to-face versus web surveying in a high-internet-coverage population: differences in response quality. Public opinion quarterly. 2008;72(5): 836–846.
- 13. Krosnick JA. Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied cognitive psychology. 1991;5(3): 213–236.
- 14. Gwaltney CJ, Shields AL, Shiffman S. Equivalence of electronic and paper-and-pencil administration of patient-reported outcome measures: a meta-analytic review. Value in Health. 2008;11(2): 322–333. pmid:18380645
- 15. Shim JM, Shin E, Johnson TP. Self-rated health assessed by web versus mail modes in a mixed mode survey: the digital divide effect and the genuine survey mode effect. Medical care. 2013;51(9): 774–781. pmid:23774510
- 16. Gnambs T, Kaspar K. Disclosure of sensitive behaviors across self-administered survey modes: a meta-analysis. Behavior research methods. 2015;47(4): 1237–1259. pmid:25410404
- 17. Alfonsson S, Maathz P, Hursti T. Interformat reliability of digital psychiatric self-report questionnaires: a systematic review. Journal of Medical Internet Research. 2014;16(12).
- 18. Campbell N, Ali F, Finlay AY, Salek SS. Equivalence of electronic and paper-based patient-reported outcome measures. Quality of Life Research. 2015;24(8): 1949–1961. pmid:25702266
- 19. Schmitz N, Hartkamp N, Brinschwitz C, Michalek S, Tress W. Comparison of the standard and the computerized versions of the Symptom Check List (SCL-90-R): a randomized trial. Acta Psychiatrica Scandinavica. 2000;102(2): 147–152. pmid:10937788
- 20. Vallejo MA, Mañanes G, Comeche MI, Díaz MI. Comparison between administration via Internet and paper-and-pencil administration of two clinical instruments: SCL-90-R and GHQ-28. Journal of Behavior Therapy and Experimental Psychiatry. 2008;39(3): 201–208. pmid:17573039
- 21. Booth-Kewley S, Larson GE, Miyoshi DK. Social desirability effects on computerized and paper-and-pencil questionnaires. Computers in human behavior. 2007;23(1): 463–477.
- 22. Wang YC, Lee CM, Lew-Ting CY, Hsiao CK, Chen DR, Chen WJ. Survey of substance use among high school students in Taipei: web-based questionnaire versus paper-and-pencil questionnaire. Journal of Adolescent Health. 2005;37(4): 289–295. pmid:16182139
- 23. Demarest S, Van der Heyden J, Charafeddine R, Drieskens S, Gisle L, Tafforeau J. Methodological basics and evolution of the Belgian Health Interview Survey 1997–2008. Arch Public Health. 2013;71(24).
- 24. Goldberg D. Manual of the general health questionnaire. Windsor: National Foundation for Educational Research Nelson, 1978.
- 25. Derogatis LR. SCL-90-R administration, scoring and procedures manual. Minneapolis: National Computer System, 1994.
- 26. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33: 159–174. pmid:843571
- 27. Cohen J. A coefficient of agreement for nominal scales. Educational and psychological measurement. 1960;20(1): 37–46.
- 28. Cox B, Oyen HV, Cambois E, Jagger C, Roy Sl, Robine JM et al. The reliability of the minimum European health module. International journal of public health. 2009;54(2): 55–60. pmid:19183846
- 29. Velikova G, Wright EP, Smith AB, Cull A, Gould A, Forman D et al. Automated collection of quality-of-life data: a comparison of paper and computer touch-screen questionnaires. Journal of clinical oncology. 1999;17(3): 998–1007. pmid:10071295
- 30. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychological bulletin. 1979;86(2): 420–428. pmid:18839484
- 31. Graham A, Papandonatos G. Reliability of internet-versus telephone-administered questionnaires in a diverse sample of smokers. Journal of Medical Internet Research. 2008;10(1).
- 32. Bech M, Kristensen MB. Differential response rates in postal and Web-based surveys in older respondents. Survey Research Methods. 2009;3(1): 1–6.
- 33. Ritter P, Lorig K, Laurent D, Matthews K. Internet versus mailed questionnaires: a randomized comparison. Journal of Medical Internet Research. 2004;6(3).
- 34. Wijndaele K, Matton L, Duvigneaud N, Lefevre J, Duquet W, Thomis M et al. Reliability, equivalence and respondent preference of computerized versus paper-and-pencil mental health questionnaires. Computers in human behavior. 2007;23(4): 1958–1970.
- 35. Bot AG, Menendez ME, Neuhaus V, Mudgal CS, Ring D. The comparison of paper-and web-based questionnaires in patients with hand and upper extremity illness. Hand. 2013;8(2): 210–214. pmid:24426921
- 36. Broering JM, Paciorek A, Carroll PR, Wilson LS, Litwin MS, Miaskowski C. Measurement equivalence using a mixed-mode approach to administer health-related quality of life instruments. Quality of Life Research. 2014;23(2): 495–508. pmid:23943258
- 37. Messer BL, Dillman DA. Surveying the general public over the internet using address-based sampling and mail contact procedures. Public opinion quarterly. 2011;75(3): 429–457.
- 38. Basnov M, Kongsved SM, Bech P, Hjollund NH. Reliability of short form-36 in an Internet-and a pen-and-paper version. Informatics for Health and Social Care. 2009;34(1): 53–58. pmid:19306199
- 39. Vallejo MA, Jordán CM, Díaz MI, Comeche MI, Ortega J. Psychological assessment via the internet: a reliability and validity study of online (vs paper-and-pencil) versions of the General Health Questionnaire-28 (GHQ-28) and the Symptoms Check-List-90-Revised (SCL-90-R). Journal of Medical Internet Research. 2007;9(1).
- 40. Marx RG, Menezes A, Horovitz L, Jones EC, Warren RF. A comparison of two time intervals for test-retest reliability of health status instruments. Journal of clinical epidemiology. 2003;56(8): 730–735. pmid:12954464