Perceptions of research integrity climate differ between academic ranks and disciplinary fields: Results from a survey among academic researchers in Amsterdam

Breaches of research integrity have shocked the academic community. Initially explanations were sought at the level of individual researchers but over time increased recognition emerged of the important role that the research integrity climate may play in influencing researchers’ (mis)behavior. In this study we aim to assess whether researchers from different academic ranks and disciplinary fields experience the research integrity climate differently. We sent an online questionnaire to academic researchers in Amsterdam using the Survey of Organizational Research Climate. Bonferroni corrected mean differences showed that junior researchers (PhD students, postdocs and assistant professors) perceive the research integrity climate more negatively than senior researchers (associate and full professors). Junior researchers note that their supervisors are less committed to talk about key research integrity principles compared to senior researchers (MD = -.39, CI = -.55, -.24). PhD students perceive more competition and suspicion among colleagues (MD = -.19, CI = -.35, -.05) than associate and full professors. We found that researchers from the natural sciences overall express a more positive perception of the research integrity climate. Researchers from social sciences as well as from the humanities perceive less fairness of their departments’ expectations in terms of publishing and acquiring funding compared to natural sciences and biomedical sciences (MD = -.44, CI = -.74, -.15; MD = -.36, CI = -.61, -.11). Results suggest that department leaders in the humanities and social sciences should do more to set fairer expectations for their researchers and that senior scientists should ensure junior researchers are socialized into research integrity practices and foster a climate in their group where suspicion among colleagues has no place.


Introduction
Recent breaches of research integrity in The Netherlands and worldwide have shocked the academic community [1][2][3][4]. Such events led to a new field of inquiry that aimed to better understand how common the problems are and what drives researchers to misbehave [5][6][7]. Initially, studies in this area mainly focused on research misconduct, in which there is generally an intent to deceive (fabrication, falsification, plagiarism). However, over time the focus broadened to the more frequent questionable research practices (QRPs). Accumulating empirical evidence has indicated QRPs are much more prevalent than formal research misconduct [8][9][10]. Consequently, QRPs probably have more impact at the aggregate level. Initially, explanations for research misconduct were sought at the level of individual researchers [11] but over time increased recognition emerged of the important role that structural and institutional factors such as research climate may play in influencing researchers' behavior [12][13][14][15][16]. This has shifted the focus to the organizational climate in research settings as a potential target for intervention [17,18].
Studying organizational climates implies investigating the environment researchers work in and how this climate can strengthen or erode research integrity [19,20]. The organizational climate here is defined as "the shared meaning organizational members attach to the events, policies, practices, and procedures they experience and the behaviors they see being rewarded, supported, and expected." (p. 115) [21,22]. Crain et al. [23] have documented that a favorable organizational research climate is positively associated with lower levels of self-reported questionable research practices. The Survey of Organizational Research Climate (henceforth: SOuRCe) is designed to measure the organizational research integrity climate in academic research settings [18,20,22,24].
The SOuRCe is embedded in two conceptual frameworks, the first being organizational justice theory [25]. In a nutshell: the fairer people regard decisions and decision-making processes in their organization, the more likely they are to trust their organization, abide by decisions made and refrain from questionable behavior [26,27]. When people perceive procedural or distributional injustice in their organization, they are more likely to behave in ways that, in their mind, compensate for the perceived unfairness [27]. Applied to research integrity, in a research climate where perceived injustice is high, researchers would be expected to be more likely to engage in intentional research misconduct (falsification, fabrication and plagiarism) or questionable research practices [27].
The second conceptual framework underpinning the SOuRCe stems from the Institute of Medicine report Integrity in Scientific Research: Creating an Environment That Promotes Responsible Conduct [28]. This report describes the research environment as an open systems model where different factors influence research integrity. The report specifies that the research integrity climate can both stimulate or diminish responsible research [18,28,29]. Some key factors herein that are reflected in the SOuRCe are ethical leadership, integrity policy familiarization and communication, and the degree to which these are known by people in the organization [18,28].
Previous research with the SOuRCe found that researchers in different phases of their career perceive the research integrity climate differently [22]. PhD students perceived the climate to be fairer compared to senior scientists in that scholarly integrity was valued (e.g. acknowledging work of others). Senior scientists perceived there to be more resources for conducting research responsibly (e.g. policies to deal with integrity breaches were well known) [22].
Wells et al. [22] also found large differences in SOuRCe scores for different organizational subunits. Some had scores twice as negative compared to others or compared to overall mean scores. This indicates that overall high mean scores on an institutional level offer departmental leaders little comfort [20] and research climate may vary significantly within institutions. One factor that accounts for these stark differences between subunits was disciplinary field [22].
Our study aims to determine how scientists experience the research integrity climate, stratified for academic rank and disciplinary field, in two university medical centers and two universities in Amsterdam. This is the first study that investigates research integrity climate in The Netherlands. Assessing research integrity climate will provide insight into which factors may hinder responsible research practices [26].
We hypothesized that we would observe significant variability in SOuRCe scale-scores based on (1) the disciplinary field in which academic researchers work and (2) the academic ranks of respondents. As our aim is descriptive in nature, we did not specify the direction of these differences.

Ethical considerations
The Scientific and Ethical Review board of the Faculty of Behavior & Movement Sciences (Vrije Universiteit Amsterdam) approved our study (Approval Number: VCWE-2017-017R1).

Participant selection and procedure
The institutions that participated in our study included two universities (Vrije Universiteit Amsterdam and University of Amsterdam) and two academic medical centers (Amsterdam Medical Centers). Upon securing endorsement from the deans and rectors of the participating institutions and finalizing a data sharing agreement, each institution provided a list of e-mail addresses of all researchers and PhD students. We distributed the electronic survey in May 2017 via email among all academic researchers. Researchers were eligible to participate if they were doing research at least one day per week (>0.2fte) on average. Our cross-sectional online survey contained three instruments (SOuRCe, the Publication Pressure Questionnaire [30] and a list of 60 major and minor research misbehaviors [9]). This article presents the SOuRCe results. The survey concluded with three demographic items about gender, academic rank and disciplinary field.
We used the online survey program Qualtrics (Qualtrics, Provo, UT, USA) to create and distribute the survey. Researchers first received an information e-mail explaining the purpose, goal and procedure of the study. After one week, we sent the official invitation with a unique link to the survey and a link to the non-response survey (see S1 Appendix). The invitation also included a link to our privacy policy and the protocol (see S2 Appendix and S1 Protocol), both available on the project's website (www.amsterdamresearchclimate.com). The survey started with an online informed consent form. After consenting, participants were asked to indicate whether they were doing research for at least one day per week (inclusion criterion). We sent three reminders to those who had not responded yet. All correspondence explicitly stated that the data would remain confidential and that participation was voluntary.

Instruments
We used the Survey of Organizational Research Climate [22][23][24][31]. The SOuRCe evaluates what factors play a role in the perceived research climate on a scale that ranges from 1 ("not at all") to 5 ("completely") [18,24]. It consists of 28 items forming 7 subscales that detail the organizational climate of integrity on a departmental and institutional level [29]. For an overview of the SOuRCe subscales, see Table 1.
SOuRCe subscale scores are calculated by taking the average of all valid non-missing items in that subscale. The respondent needs to validly answer at least half of the items in the subscale for the subscale score to be valid. Valid scores are all response options except for "No basis for judging OR not relevant to my field of work". All subscale scores can be interpreted using the same logic: the higher the score, the stronger the presence of that factor. Higher scores thus express a more favorable perception of the research integrity climate. While most SOuRCe items ask about the perceived presence of integrity supporting aspects of the local climate, the Integrity Inhibitors scale consists of items that ask about the perceived presence of factors that may inhibit research integrity. For analysis and reporting, the items contributing to this scale are reverse-coded so that the higher this subscale's score, the greater the lack of integrity inhibiting conditions [29].
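The scoring rule described above can be illustrated with a minimal sketch. It is not the authors' scoring script: the function name, the use of `None` for the "No basis for judging OR not relevant to my field of work" option, and the `6 - x` reverse-coding for a 1 to 5 scale are assumptions made for illustration.

```python
from typing import Optional, Sequence


def subscale_score(responses: Sequence[Optional[int]],
                   reverse: bool = False) -> Optional[float]:
    """Average the valid 1-5 answers of one subscale (hypothetical helper).

    None stands in for a missing answer or the "No basis for judging OR
    not relevant to my field of work" option. The score is only valid
    when at least half of the subscale's items were validly answered.
    """
    valid = [r for r in responses if r is not None and 1 <= r <= 5]
    if not responses or len(valid) * 2 < len(responses):
        return None  # fewer than half of the items answered validly
    if reverse:
        # Assumed reverse-coding for a 1-5 scale (Integrity Inhibitors items).
        valid = [6 - r for r in valid]
    return sum(valid) / len(valid)


# Three of four items valid: the score is the mean of the valid items.
print(subscale_score([4, 5, None, 3]))
# One of four items valid: no subscale score is computed.
print(subscale_score([4, None, None, None]))
```

Returning `None` rather than a partial mean keeps invalid subscale scores out of any later averaging by rank or field.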
The SOuRCe was designed for a biomedical research setting. To make the items more applicable to all disciplinary fields from our study population, we slightly altered the wording of three items in consultation with the design team of the SOuRCe (see S1 Table). We also extended the response option: "No basis for judging" to "No basis for judging OR not relevant to my field of work". Unfortunately, two of the original 28 SOuRCe items were inadvertently omitted from the final distribution of the questionnaire due to a programming error.

Statistical analyses
The intended statistical analyses were preregistered under the title 'Academic Research Climate Amsterdam' at the Open Science Framework. Briefly, for the univariate analyses, we computed overall mean subscale scores and stratified scores per academic rank and disciplinary field. For those subscales where academic rank or disciplinary field was significantly associated, we tested whether stratified scores differed significantly using post hoc Bonferroni corrected F tests. We then created association models with academic rank or disciplinary field as independent variable and subscale score as dependent variable. For the multivariate analyses, we corrected for potential confounders (e.g. gender) or added effect modifiers when inspecting the relations between disciplinary field or academic rank and the SOuRCe subscales.
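As a rough illustration of the Bonferroni step, the sketch below adjusts a set of pairwise p-values by the number of comparisons. The raw p-values are invented for illustration and are not results from this study; the choice of three pairwise comparisons simply follows from having three rank groups.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# The three rank groups used in the stratified analyses.
groups = [
    "PhD students",
    "postdocs & assistant professors",
    "associate & full professors",
]
pairs = list(combinations(groups, 2))  # 3 pairwise comparisons

raw_p = [0.004, 0.030, 0.200]  # hypothetical raw p-values, one per pair
adjusted = bonferroni(raw_p)
significant = [p < 0.05 for p in adjusted]

for (a, b), p_adj, sig in zip(pairs, adjusted, significant):
    print(f"{a} vs {b}: adjusted p = {p_adj:.3f}, significant = {sig}")
```

Multiplying the p-values by the number of tests is equivalent to comparing the raw p-values against a threshold of 0.05 divided by the number of tests.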

Results
We collected 7548 e-mail addresses from academic researchers in Amsterdam. When we sent out the information letter, 83 bounced immediately as undeliverable. Also, 109 researchers decided not to participate and asked to be unsubscribed from the Qualtrics database. In total, 2274 researchers opened the questionnaire (30%). Of those who opened the questionnaire, 1298 researchers (17% of the total sample, 57% of those who opened the questionnaire) answered enough questions to complete at least one SOuRCe subscale. See Fig 1. Only 2% filled in the ultra-brief non-response questionnaire.
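The percentages in this paragraph follow directly from the reported counts. The short check below uses all 7548 collected addresses as the denominator for the total sample, which is an assumption about how the percentages were derived; the variable and function names are illustrative.

```python
invited = 7548    # e-mail addresses collected
opened = 2274     # researchers who opened the questionnaire
completed = 1298  # completed at least one SOuRCe subscale


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


rates = {
    "opened / invited": pct(opened, invited),
    "completed / invited": pct(completed, invited),
    "completed / opened": pct(completed, opened),
}
for label, value in rates.items():
    print(f"{label}: {value}%")
```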

Differences between academic ranks
Overall mean subscale scores of the total sample are given in Figs 2 and 3 as a general reference to our stratified results. Investigating our first hypothesis (differences between academic ranks), for those subscales that were significantly associated with rank (Integrity Norms, Integrity Socialization, Integrity Inhibitors, Supervisor-Supervisee relations, Expectations and RCR Resources), we ran post-hoc Bonferroni corrected F tests. The purpose was to see whether PhD students, postdocs & assistant professors or associate & full professors perceived the climate differently on these subscales (see Table 2). PhD students as well as postdocs and assistant professors scored significantly lower than associate and full professors on 4 subscales (Expectations, Supervisor-Supervisee relations, Integrity Socialization and RCR Resources). PhD students (M = 3.73) also scored significantly lower than associate and full professors (M = 3.92) on Integrity Inhibitors. Postdocs and assistant professors (M = 3.67) scored significantly lower on Integrity Norms than did associate and full professors (M = 3.82), see Fig 2. Finally, we tested whether the relation between academic rank and the SOuRCe subscale scores was confounded or modified by other independent variables (i.e. gender or disciplinary field). The associations of rank with Expectations and with Integrity Norms were confounded by gender. Adding gender to these models made the associations between academic rank and SOuRCe subscale scores slightly weaker but the effect remained significant. We found effect modification by gender on RCR Resources only; these stratified results are given in Table 3. Therefore, Fig 2 and Table 2 display statistics corrected for confounding or report effect modification where applicable.
We have calculated the effect sizes for the significant differences and interpreted these using Cohen [32], see Table 4.
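For readers unfamiliar with Cohen's conventions, the sketch below computes a standardized mean difference (Cohen's d with a pooled standard deviation) and labels it with the conventional 0.2, 0.5 and 0.8 cut-offs. Whether Table 4 uses exactly this statistic is an assumption. The two means are the Integrity Inhibitors means reported above, while the standard deviations and group sizes are hypothetical placeholders, as is the "negligible" label for values below 0.2.

```python
from math import sqrt


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


def cohen_label(d):
    """Conventional cut-offs: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size < 0.2:
        return "negligible"  # label assumed here for values below 0.2
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Means from the text (PhD students vs associate and full professors);
# SDs of 0.70 and group sizes of 400 and 250 are hypothetical.
d = cohens_d(3.73, 3.92, 0.70, 0.70, 400, 250)
print(round(d, 2), cohen_label(d))
```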

Differences between disciplinary fields
Regarding our second hypothesis (differences between disciplinary fields), disciplinary field was associated with Regulatory Quality and Expectations, see Table 5. Humanities scored significantly lower on Regulatory Quality than biomedicine. Social sciences (M = 3.05) as well as humanities (M = 2.97) scored significantly lower on Departmental Expectations than both natural sciences and biomedical sciences, see Table 5 and Fig 3. The associations between disciplinary field and both Regulatory Quality as well as Expectations were confounded by rank, yet again the main effect of discipline remained significant. Therefore, Fig 3 and Table 5 display statistics corrected for confounding. We have calculated the effect sizes of each difference, see Table 4.

Discussion
We assessed the research integrity climate in Amsterdam using the SOuRCe. We hypothesized that we would observe significant variability in SOuRCe scale-scores based on (1) the disciplinary field in which academic researchers work and (2) the academic ranks of respondents. For the sake of brevity, we therefore discuss only the significant differences between academic ranks and disciplinary fields below.

Differences between academic ranks
Departmental Expectations were perceived more negatively by PhD students, postdocs and assistant professors. This could be because their career prospects often directly depend on fulfilling these expectations whereas more senior scientists are less directly dependent on meeting publication and funding requirements for retaining their job [33,34]. This result is similar to Martinson et al. (2006) who found mid- and early-career scientists to perceive higher amounts of organizational injustice compared to senior scientists as measured by asking scientists about the efforts they put into scientific work and rewards they receive in return [35][36][37].
We found PhDs as well as postdocs and assistant professors to score lower on Supervisor-Supervisee relations than associate and full professors. Martinson et al. [20] found the opposite effect in their study of researchers within the U.S. Department of Veterans Affairs Healthcare System in which the senior staff perceived this scale to be lower than more junior staff. In contrast, in a study of more traditional academic researchers in the U.S., Wells et al. [22] did not find notable differences on this scale by academic rank. The fact that junior researchers in our sample perceive their supervision as suboptimal could be alarming as poor mentoring is associated with the risk of emotional stress [38,39] and poor mentoring is viewed by some as one of the most impactful research misbehaviors [9].
Contrary to Wells et al. [22] who found U.S. junior researchers to report the highest levels of Integrity socialization, we found PhDs and postdocs to report lower levels of Integrity socialization than professors. Junior researchers are the ones who would have to be 'socialized' into research integrity whereas senior researchers in charge of this socialization process report higher levels. This discrepancy could indicate that senior researchers acknowledge the importance of research integrity when it comes to effective socialization of junior researchers into the department, yet we may conclude that in practice this socialization into research integrity does not get sufficient attention.
Communication about research integrity policies, part of the RCR Resources subscale, from the various bodies in academia is often addressed to the deans, department heads or principal investigators. This could explain why Wells et al. [22] found the same result as we have here: senior researchers score higher on RCR Resources than junior researchers. Being a senior researcher (associate & full professor) in an academic organization inevitably means that research integrity policies created at the top are more likely to land on your desk.
Interestingly, this effect depended on gender: female researchers perceived more RCR Resources, except for PhD students where male researchers perceived more resources to conduct their research responsibly. Perhaps female PhD students are also more likely to express their concern about the availability of resources for responsible conduct of research than their male counterparts. There is some evidence that women value procedural justice, the way in which resources are distributed, more than men do [40] but as no gender interactions in SOuRCe subscales have been reported, it seems premature to conclude that this applies here.
PhD students perceived the [lack of] Integrity Inhibitors to be lower than did associate and full professors. Mirroring the pattern for this subscale of Wells et al. [22], PhD students perceive a larger presence of such integrity inhibiting conditions (such as suspicion among colleagues or a hostile atmosphere) than more senior researchers. Associate and full professors may have gotten used to inhibitors such as publication pressure and regard these as less of a threat to integrity [22,41].
Finally, postdocs and assistant professors perceive Integrity Norms to be lower than associate and full professors, indicating a more negative attitude towards research integrity. Maybe postdocs and assistant professors witness less responsible research and more QRPs. This again parallels the three U.S. universities findings where postdocs scored lowest on more than half of the SOuRCe subscales [22]. Postdocs and assistant professors perceiving more questionable conduct of research also aligns with studies assessing the frequency of misbehavior, where mid-career scientists admitted to more research misbehaviors than did senior scientists [5].

Differences between disciplinary fields
Similar to Wells et al. [22], we found the humanities to score lowest on both Departmental Expectations and Regulatory Quality. The difference was to be expected as regulatory bodies play a more important role in fields where rules and regulation are pivotal (such as biomedicine). In areas like literature or philosophy, regulatory bodies are less important or nonexistent. Hence, researchers from the humanities might score lower because they do not encounter these regulating bodies.
The subscale Departmental Expectations measures the degree to which researchers perceive their department's expectations regarding publishing or obtaining funding as fair. As in Wells et al. [22], the natural sciences score highest and the humanities score lowest. One explanation could be that in areas like philosophy or law the traditional way of disseminating academic work is via books and national or specialist journals. Nowadays, when performance is measured, the focus is predominantly on publishing in (high-impact) international journals. This can cause dissatisfaction among researchers from the humanities, as their books and national contributions are not valued the same way by their department as other academic products such as journal publications.

Strengths of our study
Ours is the first publicly available study to investigate the research integrity climate in a European country. It is premature to compare our data to the U.S. studies available, as differences in research integrity climates found could be due to a range of factors (known and unknown) that neither of these studies has measured. Our data can provide a useful baseline measurement so that repeated administration of the SOuRCe could provide information on developments over time. With this knowledge we can better inform universities about interventions tailored to specific disciplines and ranks. This can be used to create a better climate for research integrity.
Furthermore, the SOuRCe subscales focus on observable characteristics in the local environment. This means that the SOuRCe provides direct feedback for academic leaders on what can be improved in the organizational structure for fostering research integrity. For example, we found that Integrity Socialization is perceived as low by junior researchers. This result might prompt institutions to investigate how socialization can be boosted and what means are necessary to embed research integrity socialization.

Study limitations
Although our completion rate of 18% is low, it is similar to other online surveys. This does not necessarily indicate response bias [42,43]. Response bias could occur when non-responders are dissimilar to responders. We tried to estimate this by asking non-responders to fill in a brief non-response questionnaire, but only 2% of non-responders did which we regard too little to base solid conclusions on. We thus tried to assess the representativeness of our respondents for the total population by comparing our demographics to publicly available data on researchers in Amsterdam. Because data on researchers in medical centers is not readily available, we decided to filter out all researchers who indicated working in biomedical sciences. When comparing the researchers in our sample from the two universities (excluding all researchers from the two medical centers) to the publicly available data on researchers at the two universities in Amsterdam, it appears that we had a reasonably representative sample taking part from the various ranks: 27% of researchers are full or associate professor (our sample: 21%), 40% are assistant professor or postdoc (our sample 38%) and 32% are PhD-student (our sample: 41%).
However, there may be a gender bias as more than half of the researchers in our sample were female (57%). In the Netherlands as a whole, females only account for 39% of academics (https://www.vsnu.nl/f_c_ontwikkeling_aandeel_vrouwen.html). In Amsterdam, that is 42%. This is most likely accounted for by an overrepresentation of female PhD students in our sample (68% versus the national 45% of PhD students in academia). This could be due to women's greater willingness to participate in surveys [44,45]. However, we accounted for this selectivity by correcting for gender where necessary. In the case of RCR Resources, gender modified the results. Hence, we report this effect separately for men and women (see Table 3). To conclude, this selectivity of the sample is unlikely to bias our results.
Also, to protect respondents' and institutions' privacy, we decided to only collect personal information about gender, academic rank and disciplinary field. This restricted our ability to obtain institutional-level, department-level and specific field of study level classifications, making it likely that we have missed meaningful variability between institutes or departments within our broad disciplinary categories. This way of collecting our data on relatively large group level only (academic rank and disciplinary field) also makes a more advanced multilevel model infeasible, so results from our multivariate association models (see Tables 2 and 5) should be interpreted with caution as the standard errors of observed associations may be under-estimated due to clustering [46]. We tried to estimate the impact of clustering using unpublished intraclass correlations (ICCs) from the data used by Wells et al. [22] for institute (they had three participating institutions, we have four). Applying the clustering correction affected the relation of rank with Integrity Norms and Integrity Inhibitors: rank was no longer significantly associated with these subscales. Other associations with rank remained significant despite the Variance Inflation Factor (VIF) correction, see S2 Table. Disciplinary field remained significantly associated with both Expectations and Regulatory Quality, see S3 Table.

Implications
The core finding that the research integrity climate is perceived differently by juniors and seniors as well as by researchers from different disciplinary fields stresses the need for tailored interventions. A one-size-fits-all approach to improve the academic research integrity climate will likely not yield the desired effect [23]. Interestingly, nowadays more attention is paid to proper research integrity education by means of tutorials, seminars and other courses. This does not align with the low score on Integrity Socialization and RCR Resources in our sample. However, integrity is not something someone learns from one course: responsible research has to become a habit, not an exception. There is ground to gain by integrating research integrity into daily practice, taking time to make every new researcher in the department familiar with research integrity. Furthermore, it can help to focus discussions about research integrity on the actual situation in the department: what standard procedures have been implemented to foster responsible research without having to compromise research integrity.
A rather alarming observation in our results is PhD students' perception of integrity inhibiting factors. The novices in academic research already have to cope with suspicion and competition among colleagues. Navigating a research integrity climate with such challenges asks for thoughtful guidance from senior researchers, which sadly seems to have no priority [9].
In conclusion, the research integrity climate is perceived differently by researchers from different disciplinary fields. Small fields like the humanities perceive their department's expectations as more negative compared to other disciplinary fields. The natural sciences overall seem to perceive the climate more positively.
Associate and full professors perceive a more positive research integrity climate than assistant professors, postdocs and PhD-students. This might be a key for improving the research integrity climate. Senior scientists should ensure that new researchers are socialized into research integrity practices and foster a climate in their group where suspicion among colleagues has no place.

Supporting information
S2 Table. Variance Inflation correction tests for association models with academic rank. Clustering refers to situations where there is non-independence of observations in the data, resulting in "design effects" or "intraclass-correlations" (ICCs) in the data. In our study, respondents are clustered (or nested) in departments, that are again nested within disciplines that are themselves nested within institutions, introducing dependence in the data on different levels. Inference on regression coefficients needs to take this dependence into account. Ignoring the clustering in the analyses yields estimates for the standard errors for the betas that are too small, and hence, will also result in p-values that are too small (and an increase of type I errors). For reasons of privacy, data concerning affiliation of the respondents was not available and for this reason a standard multilevel analysis correcting for clustering could not be performed. We therefore used a linear regression with a post-hoc correction of the SE's of the beta's using an estimate for the Variance Inflation Factor. By correcting the SE's, we get some indication of whether the associations we found are still there had we taken clustering into account [46]. The authors of the Wells et al. [22] study calculated their ICC institute (see ICC institute in the formulae below) for us. This way we could compute the Variance Inflation Factor as a correction for the sample size we used to estimate our SE's.
We re-calculated the t-statistics, using adjusted degrees of freedom and adjusted SE's, to assess whether the associations we found between rank or discipline and the SOuRCe subscales are also detected when clustering is taken into account. The calculations are according to the following formulas:
The left column describes the relevant SOuRCe subscale with the effective N, the second column the VIF value based on the ICC for Institute from the Wells et al. [22] study, then the adjusted t-scores and significance level of the corrected association is given. The final right column concludes what the impact of the VIF correction meant for the initially found association. (PDF)
S3 Table. Variance Inflation correction tests for association models with disciplinary field. Clustering refers to situations where there is non-independence of observations in the data, resulting in "design effects" or "intraclass-correlations" (ICCs) in the data. In our study, respondents are clustered (or nested) in departments, that are again nested within disciplines that are themselves nested within institutions, introducing dependence in the data on different levels. Inference on regression coefficients needs to take this dependence into account. Ignoring the clustering in the analyses yields estimates for the standard errors for the betas that are too small, and hence, will also result in p-values that are too small (and an increase of type I errors). For reasons of privacy, data concerning affiliation of the respondents was not available and for this reason a standard multilevel analysis correcting for clustering could not be performed. We therefore used a linear regression with a post-hoc correction of the SE's of the beta's using an estimate for the Variance Inflation Factor. By correcting the SE's, we get some indication of whether the associations we found are still there had we taken clustering into account [46]. The authors of the Wells et al. [22] study calculated their ICC institute (see ICC institute in the formulae below) for us. This way we could compute the Variance Inflation Factor as a correction for the sample size we used to estimate our SE's.
We re-calculated the t-statistics, using adjusted degrees of freedom and adjusted SE's, to assess whether the associations we found between rank or discipline and the SOuRCe subscales are also detected when clustering is taken into account. The calculations are according to the following formulas:
The left column describes the relevant SOuRCe subscale with the effective N, the second column the VIF value based on the ICC for Institute from the Wells et al. [22] study, then the adjusted t-scores and significance level of the corrected association is given. The final right column concludes what the impact of the VIF correction meant for the initially found association. (PDF)
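The formulas referred to in the S2 and S3 Table descriptions are not reproduced in this text. As a hedged illustration only, the sketch below applies the standard design-effect form of the Variance Inflation Factor, VIF = 1 + (m - 1) x ICC with m the average cluster size, and uses it to adjust the standard error, effective sample size, degrees of freedom and t-statistic. That this matches the authors' exact formulas is an assumption, and the coefficient, standard error, sample size and ICC below are hypothetical values; only the number of institutions (four) comes from the text.

```python
from math import sqrt


def variance_inflation_factor(icc: float, avg_cluster_size: float) -> float:
    """Standard design effect: 1 + (average cluster size - 1) * ICC."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def adjust_for_clustering(beta, se, n, n_clusters, icc, n_predictors=1):
    """Post-hoc correction of a regression coefficient's SE and t-statistic."""
    vif = variance_inflation_factor(icc, n / n_clusters)
    se_adj = se * sqrt(vif)            # inflated standard error
    n_eff = n / vif                    # effective sample size
    df_adj = n_eff - n_predictors - 1  # adjusted degrees of freedom
    return {
        "vif": vif,
        "se_adj": se_adj,
        "n_eff": n_eff,
        "df_adj": df_adj,
        "t_adj": beta / se_adj,
    }


# Hypothetical inputs: beta, SE, N and ICC are illustrative only;
# four clusters corresponds to the four participating institutions.
result = adjust_for_clustering(beta=0.20, se=0.05, n=1200, n_clusters=4, icc=0.01)
print({key: round(value, 3) for key, value in result.items()})
```

Even a small ICC inflates the standard error noticeably when clusters are large, which is why an association that is significant in the uncorrected model can lose significance after the correction.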