The German translation of the Oxford utilitarianism scale: Validation and the impact of the Covid-19 pandemic on the observations

Aistė Ambrasė; Malte Hendrickx; Melina Grahlow; Hong Yu Wong; Birgit Derntl

doi:10.1371/journal.pone.0335215

Abstract

The study of utilitarian inclinations is probably the most experimentally investigated aspect of morality. The Oxford Utilitarianism Scale has been developed to provide a self-report tool for reliable measurement of utilitarian views while addressing serious methodological issues with previous measures. In this study, we have translated and validated a German version of the Oxford Utilitarianism Scale (OUS-DE). The scale consists of two subscales: Impartial Beneficence (IB-DE) and Instrumental Harm (IH-DE). We conducted the validation procedure in a general German sample (N_S1 = 378, 243 women, M_age = 25.37) before the Covid-19 pandemic. A confirmatory factor analysis demonstrated a good fit of a two-factor model for OUS-DE, while internal consistency and construct reliability were acceptable. Both in the pre-pandemic and the post-pandemic sample (N_S2 = 348, 206 women, M_age = 24.61) we found a sex/gender difference, with women scoring significantly higher in the IB-DE subscale than men. We also found that the mean agreement with the IB-DE subscale decreased after the pandemic. In a separate third sample (N_S3 = 39, 19 women, M_age = 23.72), we observed an inverted U-shaped relationship between moral behavior related to quarantine requirements and the IH-DE subscale, as measured during the peak pandemic restrictions in late 2020. Repeated OUS-DE measurement in this sample showed stability in respondents’ utilitarian beliefs post-pandemic. In sum, OUS-DE is the first available measurement of utilitarian inclinations in German. The scale will enable further research on how utilitarian preconceptions affect behavior in German-speaking populations.

Citation: Ambrasė A, Hendrickx M, Grahlow M, Wong HY, Derntl B (2025) The German translation of the Oxford utilitarianism scale: Validation and the impact of the Covid-19 pandemic on the observations. PLoS One 20(10): e0335215. https://doi.org/10.1371/journal.pone.0335215

Editor: Ehsan Namaziandost, Islamic Azad University Ahvaz Branch, IRAN, ISLAMIC REPUBLIC OF

Received: March 13, 2025; Accepted: October 7, 2025; Published: October 27, 2025

Copyright: © 2025 Ambrasė et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Raw data were generated at the University of Tübingen. Derived data supporting the findings of this study are available on the Open Science Framework website: https://osf.io/ywe84/.

Funding: Participation expenses for this research were covered by a personal scholarship to AA awarded by Vilnius Lyceum Alumni Association UK (VLAAUK, Charity No. 1184536, https://licejausalumni.lt/vlaauk). BD, AA, and MG are supported by the DFG (DE2319, IRTG 2804, https://www.dfg.de/). MH is supported by Weinberg Institute for Cognitive Science Fellowship (https://lsa.umich.edu/weinberginstitute), a Mercatus Fellowship (https://www.mercatus.org/), and an Institute for Humane Studies Fellowship (https://www.theihs.org/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Utilitarianism is a moral theory that holds that the morally correct action is the one that brings about the most good for the greatest number of people [1–3]. As a consequentialist theory, it is often distinguished from other moral theories by its focus on the consequences of actions (as opposed to, e.g., intentions, rights, or virtues). Due to Utilitarianism’s commitment to intriguing departures from commonsense morality, agreement with utilitarian principles in the general population has been extensively studied in psychology and neuroscience [4–6].

Utilitarianism deviates from commonsense morality in two important ways. First, in contrast to the requirement of non-interference in commonsense morality, e.g., “not to harm”, “not to steal” [7–11], Utilitarianism controversially instructs one to use harm in an instrumental way in situations where the aggregate outcome would outweigh the cost. For example, it may be morally required to sacrifice a single bystander to divert a trolley racing toward a group of five [12].

Utilitarianism, like commonsense morality, posits a moral obligation to act beneficially towards those in need [13]. However, in a second radical departure from commonsense morality, Utilitarianism advocates for a radically impartial stance in beneficence. It rejects preferential consideration for individuals based on their relation to or distance from the moral agent [14]. For example, Utilitarianism would require choosing a charity based on objective needs rather than national preferences. The departures from commonsense morality with regard to instrumental harm (IH) and radically impartial beneficence (IB) suggest a 2-D model of utilitarian psychology [15].

While Utilitarianism has received more attention in psychology and neuroscience than any other moral theory, its measurement has typically relied on a caricaturized view of Utilitarianism. Agreement with Utilitarianism was operationalized only in terms of agreement of subjects to sacrifice the wellbeing and life of others in moral dilemmas. This led to a situation where both true moral utilitarians and antisocial individuals are classified as utilitarian [16–20]. Utilitarianism is not well captured by an exclusive focus on IH [21]. In fact, IB is considered a distinctive feature of all consequentialist theories, of which Utilitarianism is the most prominent [22].

As a self-report measure, the Oxford Utilitarianism Scale (OUS; [23]), aims to address this problem by representing both IB and IH, thus assessing personal agreement with the main principles of utilitarian moral theory. The questionnaire comprises two subscales: Impartial Beneficence (IB, 5 items) and Instrumental Harm (IH, 4 items) representing the two tenets that distinguish Utilitarianism from other moral views. The two subscales are independent and dissociable as they show no consistent association with each other. Agreement to both is necessary to infer that the moral agent is indeed guided by Utilitarianism and not another moral theory. The two subscales can also be used separately when investigating associations between the two constructs and individual differences as well as other moral inclinations [23].

In the original OUS validation study [23] and further studies on utilitarian morality [24–26] positive weak to moderate correlations between agreement with the original OUS or either of its subscales and agreement with utilitarian choice in moral dilemmas emerged, indicating content validity of the scale. This evidence was reported in sacrificial dilemmas [24,25], indicating that the measurement generally tracks utilitarian attitudes towards IH, and in impartiality dilemmas [26], showing that the whole IB subscale can capture the impartiality principle well.

Since its publication, OUS has proven to be a valuable research tool in assessing utilitarian inclinations in the general population as well as specific specialist groups such as healthcare professionals [27,28]. OUS has been translated and validated into Turkish [29], French [30], and Spanish [31]. Its structural validity was assessed in Chinese, French, German, Greek, Hungarian, Italian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, and Turkish languages [32]. Most translated and validated versions of OUS confirm the two-factor solution. However, in some languages the translations resulted in a different factor structure [32], indicating either that the translations should be revised and further validation procedures carried out to confirm the two-factor construct of Utilitarianism or that cultural influences should be considered.

Associations between utilitarian inclinations and other moral dispositional traits.

Its focus on beneficence and on minimizing suffering when harm is involved places Utilitarianism in the moral domain of Harm/Care. Moral domains are sets of morally relevant issues, such as care and harm, fairness, loyalty, authority, sanctity and purity, and they can be measured by the Moral Foundations Questionnaire [33,34]. A US study showed that extraordinary altruists highly endorse IB and are more concerned with Harm/Care than regular controls [35]. In a Polish study, agreement with Harm/Care mediated the significant relationship between religiosity and IB, suggesting that cultural and individual differences might affect people’s utilitarian inclinations in country-specific ways.

Concern for care also links Utilitarianism to empathy. In moral philosophy, it has been assumed that empathy, the understanding and reception of otherness, is necessary for care behaviors [36,37]. It has been shown that individuals who exhibit higher empathic concern endorse IB to a greater extent and IH to a lesser extent [15,23,30,38]. This pattern of associations between the two components of Utilitarianism has also been demonstrated in other-benefiting altruistic tasks and moral dilemmas where Utilitarianism requires IH [39–41]. However, the relationships of OUS and its subscale scores with empathy were reversed in the Turkish validation of the OUS [29], indicating again that more research is required. As religiosity fosters beneficent behaviors, a positive association between frequency of religious practice and the IB subscale has been found [23,42], while no associations were established between religiosity and behavior during sacrificial harm moral dilemmas [43].

Some authors have claimed that utilitarians are 1) more morally competent because utilitarian choices are based on deliberation processes rather than intuition or emotion [44–46] and that 2) their choices sometimes are driven by higher action-taking tendencies or lower aversion to harm, rather than a true endorsement of utilitarian principles [39,47–51]. OUS has not yet been compared to such constructs. Doing so might provide novel insight into utilitarian psychology.

Associations between utilitarian inclinations and sociodemographic variables.

The current state of research on OUS indicates that sociodemographic differences are associated with consistent individual differences in agreement with utilitarian views. Some authors claim that socialization, culture, and biological sex differences lead to different moral inclinations [50,52,53]. For example, some report that women more strongly internalize their moral beliefs, affecting self-evaluations and moral behavior [54,55]. In the original OUS validation study [23], authors found a trend that men endorsed IH to a greater extent than women, which accords with previous findings in sacrificial moral dilemmas [50,51,56–59] and translation studies of OUS [31]. A higher endorsement of IB has been also found in women [31]. This is similar to studies assessing a related construct, altruism, with women more inclined on average to choose altruistic options than men [60–62].

Moral inclinations also differ between different age groups, even in adults [63–67]. Agreement with IB was found to increase with age in China, Spain, and Chile [31,68]. This pattern is consistent with age effects in altruism [69]. However, in a Romanian sample, younger adolescents scored higher on IB than their older underaged counterparts, indicating that the age effect might not be linear [70]. Regarding IH, agreement to this subscale increased with age in China [68] but decreased in Spain and Chile [31]. A consistent age effect has been demonstrated in utilitarian choices during moral dilemmas, where older adults endorse IH to a lower degree than younger adults [66,71]. The heterogeneity of results on age effects in assessing utilitarian inclinations with OUS indicates that country-related age effects should be investigated.

Effects of Covid-19 pandemic on utilitarian inclinations.

During the Covid-19 pandemic, which started in March 2020 and ended in May 2023 [72], many countries adopted public pandemic policies based on utilitarian principles [73–75]. Germany’s strict pandemic regulations provide ample examples: group/society’s interests were prioritized over interests of individuals, cost/benefit analyses were based on the principle of the greater good in political decisions, and impartiality principles were applied in medical care [76,77]. In daily life, individuals were exposed to personal moral dilemmas, in which the utilitarian option was both personally and socially beneficial [78,79].

This increased exposure to situations where utilitarian choices brought about the largest benefit to society might have altered people’s perceptions of Utilitarianism. Previous research has shown that individuals revise their moral beliefs when their judgments run counter to majority opinion [80] or when imagining a situation in which a proposed contradictory action would be moral [81]. Furthermore, when moral situations are presented in a practical context, as was often the case during the pandemic, individuals tend to be more accepting of moral solutions, even when these solutions contradict their moral views [82]. Taken together, these findings suggest that investigating the effects of the Covid-19 pandemic on agreement with OUS and its subscales would provide a valuable addition to research on the stability of moral beliefs.

Associations between utilitarian inclinations and utilitarian behavior.

As OUS measures individuals’ self-reported moral beliefs, a question arises whether the measurement can predict real-life utilitarian behavior. Previous research reported that general moral values and beliefs are only weakly associated with actual moral behavior [83]. It has been shown that the association between moral inclinations and the corresponding moral behavior is mediated by co-occurrent moral emotions, moral courage, and surrounding context [84–87]. However, the OUS items are highly specific and criterion-matched to target the distinctive features of Utilitarianism, and methodological research shows that such measures have higher predictive validity than measurements targeting broader constructs [88]. In line with the specificity criterion, the Covid-19 pandemic presented a prime opportunity to study the degree to which OUS predicts moral behavior. During the pandemic, individuals were expected to follow quarantine rules, designed to minimize suffering and benefit a greater number of people while demanding personal sacrifices [78]. Therefore, it is possible that the OUS would have had stronger associations with behavior during the pandemic, as the pandemic provided a conceptually corresponding context.

Current studies

Here we report three studies which measured utilitarian inclinations using OUS. The primary aim of Study 1 was to provide a validated German translation of OUS questionnaire (OUS-DE). Our second aim was to investigate the potential influence of demographic variables, such as age, sex/gender, and religiosity, as well as effects of personality and character differences on agreement with OUS-DE, reported in Study 2. Additionally, as the circumstances allowed, in Study 2 we also intended to investigate the effects of the Covid-19 pandemic on the agreement with OUS-DE, comparing observations from two samples collected before and after the pandemic. Finally, in Study 3 we sought to examine the stability of utilitarian inclinations in the same sample with repeated measures during and after the pandemic as well as studying the relationship between utilitarian moral beliefs and self-reported moral behavior and judgement during the pandemic.

Study 1: Translation and validation of the German Oxford Utilitarianism Scale

The aim of this study was to validate the German translation of the OUS (OUS-DE). We have implemented a validation procedure similar to that done for the original OUS study [23]: we examined the factor structure with confirmatory factor analyses (CFA), assessed the internal consistency, and the split-half reliability of OUS-DE. To establish criterion-related validity of the scale, we implemented the same procedure as the original study by using moral dilemma scenarios related to sacrificial harm and greater good, and self-reported agreement with Utilitarianism.

We hypothesized that a two-factor OUS-DE would reach a goodness-of-fit comparable to that of the original English OUS, and that each of the subscales would show good to excellent model fit in CFA. We expected positive but weak correlations between the two subscales and a strong correlation between the score of each of the subscales and the overall OUS-DE score. OUS has already been translated and validated for use in Turkish [29], French [30] and Spanish [31], and the CFAs in those studies indicated an excellent two-factor model fit. However, the structural validity of a different German translation of OUS was not comparable to that of the original English scale, and the scale did not meet configural invariance, indicating that the factorial structure of that German translation was not stable across multiple groups [32]. This prompts the need for another translation and validation study.

Materials and methods

Sample description.

Five hundred five participants completed an online validation survey from 2019-05-14 to 2019-07-14. One hundred twenty-seven participants were excluded from further analyses: 21 participants reported a mother-tongue other than German, two participants failed two attention checks included in the survey, 20 participants completed the survey faster than the predetermined minimal time limit (< 20 minutes), one participant indicated non-binary gender, 70 participants self-reported a mental disorder diagnosis, and 13 participants were identified as outliers in univariate and multivariate outlier analyses (for more details, please see Results). Therefore, the final sample consisted of 378 participants (243 women, M_age= 25.37, SD = 6.52, for sociodemographic characteristics, please see Table 1).
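The exclusion counts in this paragraph can be tallied as a quick consistency check. The sketch below is illustrative only: the dictionary labels are hypothetical, while the counts are those reported above.

```python
# Exclusion counts as reported in the sample description.
# The dictionary keys are hypothetical labels chosen for readability.
exclusions = {
    "non_german_mother_tongue": 21,
    "failed_attention_checks": 2,
    "completed_too_fast": 20,
    "non_binary_gender": 1,
    "self_reported_mental_disorder": 70,
    "statistical_outliers": 13,
}

completed = 505                      # participants who completed the survey
excluded = sum(exclusions.values())  # total number of exclusions
final_n = completed - excluded       # size of the analysed sample

print(excluded, final_n)  # 127 378
```

The tally reproduces the 127 exclusions and the final sample of 378 participants.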

Table 1. Sociodemographic Characteristics of the final sample.

https://doi.org/10.1371/journal.pone.0335215.t001

Materials.

Oxford utilitarianism scale.

The Oxford Utilitarianism Scale (OUS) consists of two subscales, i.e., Impartial Beneficence (IB, 5 items) and Instrumental Harm (IH, 4 items), comprising 9 items in total [23]. The scale is scored on a Likert scale from 1 (strongly disagree) to 7 (strongly agree). Greater scores reflect a greater utilitarian inclination, that is, greater implicit agreement with the main principles of utilitarian moral theory. The IB subscale measures the positive dimension of Utilitarianism concerning duties of beneficence (3 items), treating immoral acts and failures to act morally (omissions) as equally morally wrong (1 item), and, importantly, moral impartiality (1 item). Moral impartiality is an important principle in utilitarian moral theory that requires an individual to treat other moral agents equally, regardless of their social closeness [23]. The IH subscale assesses the moral appropriateness of instrumental harm, i.e., harm to somebody used as a collateral to save or benefit a greater number of people (3 items) and short-term political oppression to ensure well-being of the citizens (1 item). Together, the two subscales measure general agreement with a utilitarian moral view and an overall agreement score can be calculated. On the other hand, the two subscales can be measured and their relationships with other measurements assessed separately, i.e., they are dissociable as measurements.
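A minimal sketch of how subscale and overall scores could be computed for one respondent is given below. It assumes mean scoring across items; the item keys (IB1 to IB5, IH1 to IH4) and the example responses are hypothetical, and taking the overall score as the mean of all nine items is an assumption rather than a documented scoring rule.

```python
from statistics import mean

# Hypothetical item keys; the item order in the questionnaire is not implied.
IB_ITEMS = ["IB1", "IB2", "IB3", "IB4", "IB5"]
IH_ITEMS = ["IH1", "IH2", "IH3", "IH4"]

def score_ous(responses):
    """Return mean IB, IH and overall scores for one respondent.

    `responses` maps each item key to a rating on the 1-7 Likert scale.
    """
    for key in IB_ITEMS + IH_ITEMS:
        if not 1 <= responses[key] <= 7:
            raise ValueError(f"{key} must be on the 1-7 scale")
    ib = mean(responses[k] for k in IB_ITEMS)
    ih = mean(responses[k] for k in IH_ITEMS)
    # Assumption: overall score is the mean over all nine items.
    overall = mean(responses[k] for k in IB_ITEMS + IH_ITEMS)
    return {"IB": ib, "IH": ih, "OUS": overall}

# Made-up responses for illustration only.
example = {"IB1": 5, "IB2": 4, "IB3": 3, "IB4": 6, "IB5": 2,
           "IH1": 2, "IH2": 3, "IH3": 1, "IH4": 2}
scores = score_ous(example)
print(scores)
```

Keeping the two subscale scores separate reflects the point made above that the subscales are dissociable and can be analysed independently of the overall score.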

Moral dilemma scenarios.

Six moral dilemma scenarios – three sacrificial harm moral dilemmas that capture participants’ behavior in applying instrumental harm for the greater good, and three greater good moral dilemmas that capture participants’ self-sacrificial and impartial behavior – originally used in the validation study by Kahane et al. [23] were used to measure utilitarian inclinations during moral decision-making. Study participants had to evaluate the moral appropriateness of the proposed solution to the sacrificial harm dilemmas by using a scale from 1 (not at all wrong) to 7 (absolutely wrong), with 1 indicating fully utilitarian judgment and 7 indicating fully non-utilitarian judgment. In greater good dilemmas the scoring was reversed, using a scale from 1 (absolutely wrong) to 7 (not at all wrong), with 1 indicating fully non-utilitarian judgment and 7 indicating fully utilitarian judgment.
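Because the two dilemma types run in opposite directions, it can help to make the direction explicit when computing mean scores. The following sketch uses hypothetical function names and made-up ratings; the optional reversal helper is shown only for illustration, since the analyses reported in this study appear to keep the raw direction of the sacrificial harm ratings.

```python
from statistics import mean

def dilemma_mean(ratings):
    """Mean of the three 1-7 ratings for one dilemma type."""
    if len(ratings) != 3 or any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("expected three ratings on the 1-7 scale")
    return mean(ratings)

def to_utilitarian_direction(sacrificial_mean):
    """Optional reversal (an illustration, not the study's procedure):
    maps 1..7 onto 7..1 so that higher values mean more utilitarian."""
    return 8 - sacrificial_mean

# Made-up ratings for one participant.
sacrificial = dilemma_mean([2, 3, 1])   # raw scale: low = more utilitarian
greater_good = dilemma_mean([6, 5, 7])  # raw scale: high = more utilitarian
aligned = to_utilitarian_direction(sacrificial)
print(sacrificial, greater_good, aligned)
```

Without the reversal, a positive relationship between utilitarian inclination and utilitarian judgment in sacrificial harm dilemmas shows up as a negative correlation.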

Self-reported utilitarianism.

Participants self-reported whether they considered their view to be utilitarian after reading a paragraph, explaining the theoretical moral commitments of utilitarian and deontological moral theories on a scale from 1 to 10 (low score for deontological view, high score for utilitarian view). The paragraph on utilitarian moral theory was provided by the authors of the original OUS validation [23].

Procedure.

The Ethics commission of the Medical Faculty at the University of Tübingen approved the online survey (“Validation of German translation of Oxford Utilitarianism Scale”, project number 839/2018BO2).

The OUS and the validation dilemma scenarios as well as Utilitarianism description for self-reported agreement were translated to German by one of the authors, who specializes in moral philosophy and is a native speaker of German. Two independent English native speakers (one from the field of philosophy and one with non-philosophical background) with a very good knowledge of the German language performed back-translation. Any inconsistencies in translation were assessed by the authors of the study. Overall, the German translation of the OUS (OUS-DE) provided a good understanding of the questionnaire items as only minor inconsistencies in wording in the back-translations were observed. Original and translated versions of the questionnaire are depicted in Table 2.

Table 2. English and German versions of OUS.

https://doi.org/10.1371/journal.pone.0335215.t002

The online study on the SoSciSurvey platform ([89], available at https://www.soscisurvey.de) was set up for the validation of the OUS-DE. Announcements for the online study were distributed by mailing lists on the University of Tübingen’s server as well as printed posters and flyers in the town of Tübingen. The online study ran from May to July 2019. The study was not preregistered.

In the online survey, respondents were first provided with information about the aims of the study, inclusion and exclusion criteria and had to provide written consent via digital button click after reading this information. They were also provided with data protection information for the anonymous survey format. After these procedures, the respondents filled out measurements in the following order: OUS-DE, six moral dilemma scenarios, self-reported agreement with Utilitarianism, Moral Competence Test (MCT, [90]), Moral Foundations Questionnaire (MFQ, [91]), the Saarbrücker Personality Questionnaire (SPF; [92]), the Harm Avoidance Scale (TCI-HA) from the Temperament and Character Inventory (TCI; [93]), and the Action Regulation Emotion Systems Scale (ARES; [94]). In this study only results from OUS-DE, moral dilemma scenarios and self-reported agreement with Utilitarianism will be reported. Original instructions from the questionnaires were included to let the respondents know how to fill out the measurements. Finally, they provided the following demographic information: sex/gender, age, education level, religious affiliation, religiosity, political ideology, mother tongue, existence of diagnosed mental disorders.

After completing the survey, respondents were also provided with an opportunity to win a retail voucher and could enter their e-Mail address into a competition. A voucher competition was used as part of the survey advertisement. Entries from the competition were recorded separately from the survey into a different repository and saved as an independent data file. No identification between survey answers and competition entries was possible as no other information than e-Mail addresses was recorded. The e-Mail addresses were used only for the purpose of voucher competition.

Data analysis.

Data analyses were performed with IBM SPSS Statistics 28.0.1.1, IBM SPSS Amos 28.0.0 (IBM, Chicago, USA), and R programming language [95] using the lavaan package [96].

Preliminary checks.

Data distribution for the OUS-DE was first checked for normality using descriptive statistics (skewness and kurtosis). Univariate outliers in OUS-DE responses were identified by Median Absolute Deviations (MAD) test (Tukey’s fence), calculated on the total OUS-DE score, while multivariate outliers were detected by the Mahalanobis Distance in linear regression.
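A median-absolute-deviation screen of the kind described above can be sketched as follows. The cut-off of 3 robust standard deviations and the consistency constant 1.4826 are common conventions assumed here, since the exact threshold is not stated; the total scores are made up. The multivariate Mahalanobis step is not covered by this sketch.

```python
from statistics import median

def mad_outliers(scores, threshold=3.0):
    """Return indices of scores whose robust z-score exceeds `threshold`.

    Assumptions: threshold=3.0 and the 1.4826 consistency constant
    (which makes the MAD comparable to a standard deviation under
    normality) are conventions, not values taken from the study.
    """
    med = median(scores)
    mad = median(abs(x - med) for x in scores)
    if mad == 0:
        return []  # no spread around the median, nothing can be flagged
    scale = 1.4826 * mad
    return [i for i, x in enumerate(scores) if abs(x - med) / scale > threshold]

# Made-up total OUS-DE scores; the last value is deliberately extreme.
totals = [30, 32, 31, 29, 33, 30, 31, 60]
flagged = mad_outliers(totals)
print(flagged)  # [7]
```

Using the median and MAD rather than the mean and standard deviation keeps the screen from being distorted by the very outliers it is meant to detect.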

Factorial validity.

Confirmatory factor analysis (CFA) was performed to evaluate the factorial structure of the OUS-DE using a maximum likelihood estimation. Both one-factor and two-factor models were tested, with the two latent factors allowed to intercorrelate as in the original OUS validation.

Model fit evaluation.

The goodness-of-fit of OUS-DE was assessed by several fit indices, selected based on the original OUS validation report [23] and methodological recommendations [97,98]. The model Chi-square (χ²) test assesses the difference between the observed and expected covariance matrices, where goodness of fit is indicated by a non-significant test result (p > .05). However, the χ² test is sensitive to sample size, and in our case, it could indicate a poor model fit merely because of the large sample size [99]. Therefore, we prioritized two incremental fit indices, the non-normed fit index (also called the Tucker-Lewis index, TLI) and the comparative fit index (CFI), as they adjust for sample size issues (for both: ≥ .90 = acceptable; ≥ .95 = excellent). Additionally, we used the root mean square error of approximation (RMSEA) to measure the discrepancy between the observed covariance matrix and the hypothesized covariance matrix (≤ .05 = good; .05–.08 = acceptable; [100]). Lastly, we used the standardized root mean square residual (SRMR) to measure the average of the standardized fitted residuals (≤ .08 = acceptable; [100,101]). Parsimony of the measurement models was assessed by the Akaike Information Criterion (AIC; [102]) and the Bayesian Information Criterion (BIC; [103]). Lower values in both indices show better parsimony in the measurement [97].
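The cut-offs listed in this paragraph can be written down as simple labelling rules. The sketch below is a convenience illustration with hypothetical function names; the label "poor" for values outside the stated ranges is an assumption, and the input values in the example are illustrative rather than results.

```python
def rate_incremental(value):
    """TLI or CFI: >= .95 excellent, >= .90 acceptable."""
    if value >= 0.95:
        return "excellent"
    if value >= 0.90:
        return "acceptable"
    return "poor"  # assumed label for values below the stated cut-offs

def rate_rmsea(value):
    """RMSEA: <= .05 good, above .05 up to .08 acceptable."""
    if value <= 0.05:
        return "good"
    if value <= 0.08:
        return "acceptable"
    return "poor"  # assumed label

def rate_srmr(value):
    """SRMR: <= .08 acceptable."""
    return "acceptable" if value <= 0.08 else "poor"  # "poor" is assumed

# Illustrative inputs only.
print(rate_incremental(0.96), rate_rmsea(0.06), rate_srmr(0.09))
```

Such rules are heuristics; a model that narrowly misses one threshold while meeting the others still calls for judgment rather than a mechanical verdict.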

Reliability.

Internal reliability of the measurement was assessed by split-half method (items split by odd and even positions), with Spearman-Brown and the Guttman split-half coefficients [104,105]. Internal consistency was evaluated using Cronbach’s alpha, where higher values (closer to 1) indicate stronger consistency, accounting for questionnaire length effects [106–108].
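The reliability coefficients named above can be sketched in a few lines. This is a minimal illustration under stated assumptions: items are split by odd and even position, the Spearman-Brown value uses the standard equal-length formula (a simplification for a nine-item scale, whose halves have five and four items), and the response matrix is made up.

```python
from math import sqrt
from statistics import mean, variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half(rows):
    """Return (Spearman-Brown, Guttman) coefficients for an odd-even split."""
    odd = [sum(r[0::2]) for r in rows]    # items in positions 1, 3, 5, ...
    even = [sum(r[1::2]) for r in rows]   # items in positions 2, 4, 6, ...
    r = pearson_r(odd, even)
    spearman_brown = 2 * r / (1 + r)      # equal-length formula (assumption)
    total = [a + b for a, b in zip(odd, even)]
    guttman = 2 * (1 - (variance(odd) + variance(even)) / variance(total))
    return spearman_brown, guttman

# Made-up responses: 5 respondents x 4 items on a 1-7 scale.
data = [[2, 3, 2, 4], [5, 5, 6, 5], [3, 4, 3, 3], [6, 7, 6, 6], [4, 3, 5, 4]]
alpha = cronbach_alpha(data)
sb, guttman = split_half(data)
print(alpha, sb, guttman)
```

All three coefficients are ratios of variances, so using sample or population variances gives the same result as long as the choice is consistent.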

Construct validity.

Pearson’s two-tailed correlations were calculated between the score of each OUS-DE subscale and the mean score of three corresponding moral scenarios (for more information see Materials of Study 1), as well as between the overall OUS-DE score and self-reported agreement with Utilitarianism.

Results

Data screening and outlier analysis.

Descriptive statistics were first used to assess the normality of the distribution of the overall mean OUS-DE score, the scores for each of the two subscales, and every item in the scale separately. The mean OUS-DE score was slightly negatively skewed (S = −0.111). IB-DE scores were negatively skewed (S = −0.261), and IH-DE scores were positively skewed (S = 0.292). When each item was considered separately, skewness ranged from −0.402 to 0.45. The kurtosis statistic showed positive kurtosis for the overall mean OUS-DE score (K = 0.503). Kurtosis was weakly negative for IB-DE (K = −0.177) and IH-DE (K = −0.390). When each item was considered separately, kurtosis ranged from −0.999 to −0.399. All reported skewness and kurtosis statistics fall into the acceptable range for large samples [109].

The data was then assessed for univariate and multivariate outliers (see Data analysis section of Study 1). MAD test was performed on the sum of OUS-DE item scores for each participant and the analysis identified one univariate outlier. Mahalanobis Distance in linear regression identified 12 multivariate outliers. These univariate and multivariate outliers were removed from further analyses. After the procedure, final sample consisted of 378 individuals (243 women, M_age = 25.37). Mean responses to OUS-DE and its subscales are depicted in Table 3.

Table 3. Mean responses to the Oxford Utilitarianism Scale and its subscales.

https://doi.org/10.1371/journal.pone.0335215.t003

Psychometric properties of OUS-DE.

Confirmatory factor analysis (CFA).

CFA was performed on a dataset with 378 cases; no data was missing. Two different models were assessed and compared: 1) a one factor solution and 2) a two fixed factor solution based on the original questionnaire subscales, with covariance drawn between the two factors. Analyses used maximum likelihood estimation (10 iterations). The results are shown in Table 4 and Table 5.

Table 4. Results of Confirmatory Factor Analyses (N = 378).

https://doi.org/10.1371/journal.pone.0335215.t004

Table 5. Standardized factor loadings for OUS-DE in the CFA (N = 378).

https://doi.org/10.1371/journal.pone.0335215.t005

The default one-factor model clearly showed an inadequate model fit on all fit indices: χ² = 564.0, df = 27, p < .001, TLI = 0.176, CFI = 0.382, SRMR = 0.186, RMSEA = 0.229, AIC = 12726.737, BIC = 12797.565. Similarly, standardized factor loadings indicated that IH-DE items fit poorly with the IB-DE items if one-factor solution is analyzed (see Table 5).

The two-factor model fit was better, though marginally below recommended cut-offs: χ² = 92.317, df = 26, p < .001, TLI = 0.894, CFI = 0.923, RMSEA = 0.082, SRMR = 0.0627. Modification indices suggested that the model could be improved if residual covariances were modelled between questionnaire items IB-2 and IB-3 (MI = 32.453, Standardized EPC = 0.54), as well as IB-1 and IB-5 (MI = 32.111, Standardized EPC = 0.408). Incorporating these adjustments yielded a modified two-factor model with adequate fit: χ² = 56.509, df = 24, p < .001, TLI = 0.944, CFI = 0.962, RMSEA = 0.06, SRMR = 0.0569. Lower AIC and BIC indices of the modified two-factor model also supported this modified model as the preferable solution. Standardized factor loadings (see Table 5) indicated that the two-factor solution for OUS-DE is appropriate (all loadings > .4).
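As a plausibility check, the RMSEA point estimates can be recomputed from the reported χ², degrees of freedom, and sample size. The sketch below assumes the convention that divides by N; some software divides by N − 1 instead, which can shift the third decimal. The function name is hypothetical.

```python
from math import sqrt

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * n)).

    Assumption: the denominator uses n; some programs use n - 1.
    """
    return sqrt(max(chi2 - df, 0.0) / (df * n))

N = 378  # final sample size
models = {
    "one-factor": (564.0, 27),
    "two-factor": (92.317, 26),
    "modified two-factor": (56.509, 24),
}
for name, (chi2, df) in models.items():
    print(f"{name}: RMSEA = {rmsea(chi2, df, N):.3f}")
# one-factor: RMSEA = 0.229
# two-factor: RMSEA = 0.082
# modified two-factor: RMSEA = 0.060
```

Under this convention the recomputed values agree with the reported RMSEA of 0.229, 0.082, and 0.06 for the three models.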

Results of CFA for the two-factor model remained robust when the sample was split according to sex/gender (for analysis descriptions and CFA results in tables see Supporting Information S1 File).

Split-half reliability.

Split-half reliability testing was performed on all nine items in the scale. The method of odd-even trials was selected. It estimated a Spearman-Brown Coefficient of 0.71 and a Guttman Split-Half Coefficient of 0.7.

Internal consistency.

Internal consistency estimation yielded a Cronbach’s alpha of α = 0.671 for the whole questionnaire, α = 0.760 for IB-DE, and α = 0.716 for IH-DE. It is important to note that short questionnaires result in lower alpha values [108]; therefore, overall internal consistency can be considered acceptable to good.

Construct validity.

Construct validity for this measurement was assessed by calculating Pearson’s correlations (two-tailed) between the overall OUS-DE score, IB-DE score, IH-DE score and self-reported Utilitarianism, as well as mean scores of two types of moral dilemmas: a set of dilemmas concerning the greater good and a set concerning sacrificial harm. As in the original OUS validation study [23], we assumed that if the OUS-DE truly measures utilitarian inclinations, moral appropriateness ratings in greater good dilemmas would be directly associated with IB-DE scores, and moral appropriateness ratings in sacrificial harm dilemmas would be directly associated with IH-DE scores.

The correlation between IB-DE and OUS-DE was strong (r = 0.778, p < .001). The correlation between IH-DE and OUS-DE was also strong (r = 0.662, p < .001).
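To make the two-tailed significance statements concrete, a Pearson correlation can be converted into a t statistic with n − 2 degrees of freedom. The sketch below is illustrative only: it assumes that the full Study 1 sample (N = 378, as given in the abstract) entered each correlation, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


N = 378  # assumption: all Study 1 participants contributed to the correlation

# Smallest positive correlation reported in this section (r = 0.231).
t_small = t_from_r(0.231, N)
print(f"t({N - 2}) = {t_small:.2f}")
```

Under this assumption even r = 0.231 corresponds to a t value of about 4.6, well above the two-tailed critical value of roughly 3.3 for p < .001 at this sample size.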

Greater good dilemmas.

Mean score of the greater good dilemmas positively correlated with OUS-DE score (r = 0.446, p < .001), IB-DE score (r = 0.544, p < .001) as well as with self-reported Utilitarianism (r = 0.392, p < .001), indicating consistent construct validity. Fig 1 and Table 6 illustrate these correlations.

Table 6. Correlations between OUS-DE, its subscales, moral dilemmas, and self-reported Utilitarianism.

https://doi.org/10.1371/journal.pone.0335215.t006

Fig 1. Construct validity analysis.

Correlations between responses to IB-DE subscale and greater good dilemmas (A) and between responses to IH-DE subscale and sacrificial harm dilemmas (B).

https://doi.org/10.1371/journal.pone.0335215.g001

Sacrificial harm dilemmas.

The mean score of the sacrificial harm dilemmas was moderately and negatively correlated with OUS-DE score (r = −0.501, p < .001), IH-DE score (r = −0.64, p < .001), and self-reported Utilitarianism (r = −0.204, p < .001), indicating consistent construct validity. The negative sign was expected because of the scoring of these dilemmas, in which lower scores indicate greater agreement with the utilitarian solution. Please see Fig 1 and Table 6 for an illustration of these correlations.

Self-reported utilitarianism.

Self-reported Utilitarianism correlated positively with overall OUS-DE (r = 0.399, p < .001) as well as with IB-DE (r = 0.338, p < .001) and IH-DE (r = 0.231, p < .001). Please see Table 6 for these correlations.

Discussion: study 1.

In this study, we translated and validated the German version of the Oxford Utilitarianism Scale (OUS-DE). Our results indicate that the OUS-DE is a reliable measure of utilitarian inclinations in young healthy adults, making it an adequate research tool to address utilitarian moral inclinations.

In a confirmatory factor analysis, a two-factor solution of OUS-DE achieved appropriate model fit. This supports previous findings from the original OUS validation study [23] as well as more recent validation studies of the Turkish [29], Spanish [31], and French [30] OUS versions. A study assessing structural validity for OUS in 15 languages found that a different German OUS translation required multiple modifications to achieve a two-factor solution [32]. In our study, two modifications of the two-factor model were also required to achieve an adequate model fit: in the IB-DE subscale, we needed to model residual covariances between items IB-2 and IB-3 as well as between IB-1 and IB-5. Items IB-2 (kidney donation) and IB-3 (leg sacrifice) measure the principle of beneficence, and in particular, one’s agreement with a positive duty to sacrifice a part of one’s body to save another person’s life in an urgent situation. Given the similarity of the two items, covariance between them could have been expected. Covariation between items IB-1 (impartiality) and IB-5 (donation) could also be expected, as both items refer to a positive duty to help and care for others. These covariations were not found in the original OUS validation study [23] or in other OUS translations. Further studies should investigate possible conceptual overlaps between these items.
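The effect of the two modifications on model degrees of freedom can be verified by counting parameters. The sketch below is a bookkeeping illustration under stated assumptions: simple structure (each item loads on exactly one factor), no mean structure, and factor variances fixed for identification. The helper name `cfa_df` is hypothetical.

```python
def cfa_df(n_items: int, n_factors: int, n_residual_covariances: int = 0) -> int:
    """Degrees of freedom of a simple-structure CFA without mean structure.

    Assumes factor variances are fixed to 1, so every loading is free.
    (Fixing one loading per factor instead gives the same count.)
    """
    moments = n_items * (n_items + 1) // 2              # unique variances/covariances
    loadings = n_items                                  # one free loading per item
    residual_variances = n_items
    factor_covariances = n_factors * (n_factors - 1) // 2
    free_parameters = (
        loadings + residual_variances + factor_covariances + n_residual_covariances
    )
    return moments - free_parameters


# Nine OUS-DE items on two factors (IB-DE and IH-DE).
print(cfa_df(9, 2))     # unmodified two-factor model
print(cfa_df(9, 2, 2))  # with residual covariances IB-2/IB-3 and IB-1/IB-5
```

Each freed residual covariance costs one degree of freedom, which matches the drop from df = 26 to df = 24 between the original and the modified two-factor model.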

It has recently been suggested that the IB subscale should be revised or separated into more fine-grained subscales to represent general beneficence, self-sacrifice, and the distinction between action and omission [110]. While covariations between items in this subscale indicated possible conceptual similarity among them, based on our results, we do not recommend a complete revision of the subscale. It is a central commitment of Utilitarianism to consider omissions to help and active harm as commensurable wrongs, as utilitarians consider consequences to be the sole arbiter of wrongness [2,3,22]. Divorcing this commitment from other aspects of Utilitarianism, including impartiality, would lead to a skewed representation of Utilitarianism in the resulting measurement.

Finally, the internal consistency of the scale indicated that the translated version of the questionnaire performs comparably to the original English version and other OUS translations. Construct validity of the translated version was supported by moderate bivariate correlations between OUS-DE and self-reported Utilitarianism, as well as moderate-to-strong correlations between OUS-DE subscales and thematically corresponding moral dilemmas. The latter correlations are comparable with the results of the original OUS validation (see Study 2 in [23]). Moderate-to-strong associations between OUS-DE scores and moral dilemma ratings indicate that they measure the same construct reliably and might be immune to response style or social desirability effects [111,112].

Study 2: effects of sociodemographic and personality variables on agreement with OUS-DE

The aim of this study was to investigate the stability of utilitarian inclinations in the general population and the origins of individual differences in self-reported Utilitarianism. To assess the stability of utilitarian inclinations, we collected responses to OUS-DE and its subscales in two different samples at two different time points: in 2019 and in 2023, when the last protective measures of the Covid-19 pandemic were phased out. Based on previous research [80–82], we assumed that increased exposure to moral situations that require utilitarian solutions would increase agreement with OUS-DE, indicating that external circumstances can influence moral beliefs in the general population.

Furthermore, we investigated individual differences in utilitarian inclinations related to sex/gender, age, religiosity, and psychometric measures that are conceptually related to one’s moral dispositions, such as moral competence, moral foundations, empathy, harm aversion, and approach/avoidance traits. We expected to find a significant sex/gender difference, with men reporting higher acceptance of IH-DE than women [50,51,56–59]. We also expected to find a positive relationship between age and IB-DE as well as a negative relationship between age and agreement with IH-DE [31,68]. Furthermore, we anticipated a positive relationship between religiosity and IB-DE [23,42]. Finally, positive relationships were expected between agreement with Utilitarianism and moral competence [44], empathy, and the Harm/Care foundation [15,23,30,35,38], while weak negative relationships were expected between agreement with Utilitarianism and harm aversion as well as the behavioral approach trait [39,47–51].