Development and validation of the Japanese Moral Foundations Dictionary

The Moral Foundations Dictionary (MFD) is a useful tool for applying the conceptual framework developed in Moral Foundations Theory and quantifying the moral meanings implicated in the linguistic information people convey. However, the applicability of the MFD is limited because it is available only in English. Translated versions of the MFD are therefore needed to study morality across various cultures, including non-Western cultures. The contribution of this paper is two-fold. First, we developed the first Japanese version of the MFD (referred to as the J-MFD) using a semi-automated method, which can serve as a reference when translating the MFD into other languages. Second, we tested the validity of the J-MFD by analyzing open-ended written texts about the situations that Japanese participants thought followed and violated the five moral foundations. We found that the J-MFD correctly categorized the Japanese participants’ descriptions into the corresponding moral foundations, and that the Moral Foundations Questionnaire (MFQ) scores correlated with the number of situations, of total words, and of J-MFD words in the participants’ descriptions for the Harm and Fairness foundations. The J-MFD can be used to study morality unique to the Japanese and to compare moral behavior across cultures.


Introduction
Currently, one of the most active research areas in social and behavioral sciences pertains to how and on what grounds ordinary people form moral judgments. A central message from this flourishing body of research is that people quickly decide whether a particular act is morally right or wrong; however, it takes them a relatively long time to provide a "why" explanation for their judgment [1]. This intuitionist model of moral judgment has produced voluminous empirical research as well as a comprehensive theoretical framework, now formulated as the Moral Foundations Theory (MFT).
The central principle of the MFT is that people inherit a limited number of conceptual templates used for their intuitive classification of observed acts that are potentially relevant to morality. Specifically, it is assumed that there are five major moral foundations: (1) "Care," which focuses on not harming others and protecting the vulnerable; (2) "Fairness," which assumes equivalent exchange without cheating to be good; (3) "Ingroup," which concerns a collective entity instead of individuals, such as family, nation, team, and military; (4) "Authority," which postulates respect for authority, resulting in maintaining the hierarchy; and (5) "Purity," which involves a feeling of disgust caused by the impure. The MFT emphasizes that moral foundations meet not only individuals' adaptive need to fit into their community in the "correct" ways but also a collective need for the community to increase its unity and win against other groups. This is how and why moral foundations are typically shared with a high level of consensus among the community members, according to the MFT. The consensual nature of moral foundations should manifest most visibly in linguistic communication, which can mobilize the community toward solidarity and sanctity. In this group process, moral foundations are assumed to show their political aspects and provide a base for mobilizing members toward their collective goals. To test this hypothesis concerning "political" consensuality, Haidt and colleagues analyzed morality-relevant discourse in daily contexts [2]. A tool developed for this purpose was the Moral Foundations Dictionary (MFD), which quantifies virtues and vices associated with each moral foundation expressed in written texts [2].
The MFD contains a list of words related to one or several moral foundations such as "killing", "justice", and "loyal," which correspond respectively to the Care, Fairness, and Ingroup foundations. The usefulness of the MFD, typically applied in combination with the LIWC software (Linguistic Inquiry and Word Count program [3]), has been demonstrated in empirical studies. For instance, in the research above, Graham et al. analyzed church sermons available online and found that liberal preachers were more likely than their conservative counterparts to use words relevant to Care, Fairness, and Ingroup but less likely to use words relevant to Authority and Purity. This is consistent with the general trends documented for these ideological camps on different measures [2].
Even though the MFD is useful for linguistic analyses of moral foundations, the dictionary is currently available only in English, and thus it is unknown to what extent we can generalize the findings to the linguistic communities outside of the English-speaking world. This is a problem because it is plausible that the contents, as well as the roles, of different moral foundations would vary across cultures. For instance, evidence shows that people living in Western cultures tend to emphasize Care and Fairness in their moral judgments, whereas non-Western people tend to rely more on Ingroup and Purity [4]. The use of the MFD translated into different languages might reveal similar differences in communication and discourses. An accumulating body of evidence concerning cultural differences would also be important in the context of criticism about the potential bias in morality research that leans toward the so-called WEIRD (Western, Educated, Industrialized, Rich, and Democratic) cultural samples [5][6][7]. Furthermore, social media platforms provide an excellent arena for analyzing human behavior in a natural setting, and some recent research has successfully applied natural language processing (NLP) to social media data to quantify people's moral behaviors [8,9]. Particularly notable are the findings by Dehghani et al., who identified a key role of Purity in social networking [10], and the work by Kaur and Sasahara [11] showing that although Care was the most dominant, Purity was the most distinct moral foundation in online conversations. Unfortunately, this evidence is also limited to English texts, because there is no publicly available dictionary that can be applied to texts written in languages other than English [12,13].
To overcome this limitation, we describe here how we developed a Japanese version of the MFD (J-MFD) using a semi-automated method. The J-MFD is publicly available online, and hence our methodology can serve as a useful model for further attempts to develop moral dictionaries in other languages.

Strategy for J-MFD development
The translation of a dictionary is beset with at least two difficulties. It is difficult to maintain consistency between translation outcomes via multiple translators because the accuracy of rendering from one language to another depends on a translator's linguistic ability. Moreover, the translated outcomes are subject to change and require constant updates because the actual use of the translated dictionary in text analysis may lead to a better translation and because language itself culturally evolves.
To resolve the first issue, we took advantage of computational methods and online linguistic resources and corpora. This collection of tools and data allowed us to produce as many translations as possible and ensure accurate translations while reducing human errors. To resolve the second issue, we released the J-MFD to the public so that researchers worldwide could freely use it and report issues, if any, on our website. These comments could be used for future updates.

Development of J-MFD
We translated the original MFD into our J-MFD via two online linguistic resources and two corpora with the aid of computational methods. The original MFD contains 324 English moral terms in 11 categories: a "Virtue" and a "Vice" (violation) category for each of the five moral foundations (i.e., Care, Fairness, Ingroup, Authority, and Purity), as well as a more general or abstract category of morality (i.e., Morality General) [2]. Care is henceforth denoted as Harm in accordance with the notation of the MFD. The moral terms consisted of 156 words (e.g., impair) and 168 word stems (e.g., justifi*, which covers justification, justifier, etc.). Some words were associated with multiple categories, such as "impair" (Harm Vice and Purity Vice); other words were associated with only a single category, such as "justifi*" (Fairness Virtue alone).
Our development followed five steps. First, our programs automatically collected all words that contained each of the word stems in the MFD by web scraping OneLook, an online dictionary metasearch engine (https://www.onelook.com) (Fig 1A→1C). For example, "justifi*" with the "Filter by commonness: Common words" option in OneLook returned 11 possible words for the word stem ("justifiable", "justifiableness", "justifiably", "justifies", etc.). This procedure identified 891 words from all of the word stems in the MFD.
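The stem expansion in this first step can be illustrated with a minimal Python sketch. It does not reproduce the OneLook scraping: the function name `expand_stems` and the toy vocabulary are assumptions made for illustration, and the vocabulary stands in for the words returned by a dictionary search.

```python
def expand_stems(entries, vocabulary):
    """Map each MFD entry to the vocabulary words it covers.

    An entry ending in '*' is treated as a word stem (prefix match);
    any other entry must match a vocabulary word exactly.
    """
    expanded = {}
    for entry in entries:
        if entry.endswith("*"):
            prefix = entry[:-1]
            matches = sorted(w for w in vocabulary if w.startswith(prefix))
        else:
            matches = [entry] if entry in vocabulary else []
        expanded[entry] = matches
    return expanded


# Toy vocabulary (hypothetical); a real run would use the search results.
vocabulary = {"justifiable", "justifiably", "justifies", "justice", "impair"}
result = expand_stems(["justifi*", "impair"], vocabulary)
print(result)
```

Note that "justice" is not collected for "justifi*" because it does not share the full prefix, which is the behavior a stem entry is meant to have.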
Second, we manually eliminated 58 words that were unrelated to morality (Fig 1C→1D). The remaining words comprised a list of moral words for translation.
Third, the remaining words were translated into Japanese via Weblio, an online dictionary and encyclopedia designed for Japanese speakers (https://ejje.weblio.jp). This process was also performed by web scraping, which allowed us to cover possible translation equivalents in Japanese (2044 words) (Fig 1D→1E).
Fourth, we took a frequency-based approach for word selection using two Japanese corpora: the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and the Tsukuba Web Corpus (TWC). The BCCWJ is a corpus of contemporary written Japanese that contains 104 million words randomly sampled from books, magazines, newspapers, business reports, blogs, Internet forums, textbooks, and legal documents [14] (https://pj.ninjal.ac.jp/corpus_center/bccwj/). The TWC is a large corpus that contains 1.1 billion Japanese words obtained from 3.5 million Japanese web pages (http://nlt.tsukuba.lagoinst.info). We adopted the top ten most frequent Japanese words for every word stem and the top five for every word in the BCCWJ and TWC, thereby filtering out rarely used words (Fig 1E→1F).
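The frequency-based selection can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not specify how counts from the two corpora were combined, so a single count table is assumed, and the function name `select_frequent`, the candidate list, and the counts are all hypothetical.

```python
def select_frequent(candidates, frequency, k):
    """Keep the k most frequent translation candidates.

    `frequency` maps a Japanese word to its corpus count; candidates
    that never occur in the corpus are dropped before ranking.
    """
    attested = [w for w in candidates if frequency.get(w, 0) > 0]
    # Sort by descending count, breaking ties by the word itself.
    ranked = sorted(attested, key=lambda w: (-frequency[w], w))
    return ranked[:k]


# Hypothetical candidates for one English term, with invented counts.
candidates = ["守る", "保護", "擁護", "庇う"]
toy_counts = {"守る": 900, "保護": 700, "擁護": 120}
selected = select_frequent(candidates, toy_counts, k=2)
print(selected)
```

In the procedure described above, k would be ten for a word stem and five for a word.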
Finally, we adjusted the category assignments for each Japanese word after removing words unrelated to morality and words that failed in back-translation using online dictionaries (Fig 1F→1G). More specifically, we merged Japanese words whenever possible, using word stem representation. For example, "違反する" (Ihan-suru, or "violate") is a verb and "違反" (Ihan, "violation") is its noun form; these words can be merged to "違反*" (n.b., "する" (suru, "do") is appended to a noun to make a compound verb). Similarly, an adjective "安全な" (Anzen-na, "safe") and its noun form "安全" (Anzen, "safety") can be merged to "安全*". After the merge procedures, we examined whether Japanese moral word candidates can be back-translated to corresponding English words using online dictionaries. As a result, 23 words that failed this test were removed, and we were left with 741 Japanese moral terms, for which we adjusted the moral categories. This adjustment was necessary because multiple words (or word stems) with different moral categories could be translated into the same single Japanese word (or word stem); hence, a single Japanese word could belong to multiple categories. Among these categories, the central one (or ones) needed to be selected based on native Japanese knowledge and the definition of moral categories. For instance, "safe*", "protect*", "shelter", "secur*", "defen*", "guard*", "preserve", and "obey*" all can be translated to "守る"; and "obey*" can fit in both Authority Virtue and Harm Virtue. Because the core meaning of "守る" is a Harm Virtue, we assigned it to Harm Virtue based on the judgment of three native Japanese speakers. Table 1 shows the number of words for each category and the total number of words in the J-MFD. As shown, the semi-automated procedures yielded more words in Ingroup Virtue and Authority Virtue than in others. This seems to reflect Japanese culture, in which group harmony and hierarchy are more appreciated than individual interests.
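The merge into word stem representation described above can be approximated with a short Python sketch. It is a deliberately naive illustration, not the procedure actually used: it assumes that stripping the suffixes "する" and "な" is enough to find a shared base, whereas a real merge would need manual checking. The function name `merge_to_stems` is hypothetical.

```python
def merge_to_stems(words):
    """Merge word forms that share a base into a single 'base*' entry.

    Assumption: only the suffixes 'する' (compound verb) and 'な'
    (adjective) are stripped. Words without a partner are kept as-is.
    """
    suffixes = ("する", "な")
    groups = {}
    for word in words:
        base = word
        for suffix in suffixes:
            if word.endswith(suffix) and len(word) > len(suffix):
                base = word[: -len(suffix)]
                break
        groups.setdefault(base, set()).add(word)
    merged = set()
    for base, forms in groups.items():
        if len(forms) > 1:
            merged.add(base + "*")  # several forms collapse to one stem
        else:
            merged.update(forms)    # a lone form stays unchanged
    return merged


merged = merge_to_stems(["違反する", "違反", "安全な", "安全", "守る"])
print(sorted(merged))
```

A rule this simple would also strip "な" from words that are not adjectives, which is one reason the merge is better treated as a semi-automated step with human review.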
Example words from the J-MFD are listed in Table 2. The J-MFD and a computer program for Japanese word segmentation are publicly available online (https://github.com/soramame0518/j-mfd/). Note that Japanese texts are continuous strings of words and are not punctuated by blank spaces; thus, word segmentation is required for Japanese texts before using the J-MFD.

Validation of the J-MFD
To validate our J-MFD, we compared the mean frequencies of the dictionary words for the five moral foundations that were included in the descriptions about moral issues reported by Japanese participants. More specifically, 386 Japanese participants (238 men and 148 women; M age = 35.22, SD = 12.30) were recruited online using the Internet crowdsourcing service Macromill (https://www.macromill.com). Participants read brief explanations of Haidt's five moral foundations (see Supporting Information) and listed as many situations as possible that they thought followed and violated the five moral foundations.
The situations resulted in 16,033 sentences in total after eliminating the responses that were incomprehensible with respect to their meaning or were not related to morality (e.g., "I can't understand the meaning of the question"). Each sentence in these descriptions of situations was segmented into words, and five pools of morally relevant words (i.e., Harm-related, Fairness-related, and so forth) were constructed. For each of these pools, we computed the frequency ratio of appearances of J-MFD words associated with each moral foundation. To take an example of the word pool produced from the Harm-related context (Virtue and Vice combined), we separately counted the numbers of times that the Harm-related words, Fairness-related words, and so forth contained in the J-MFD appeared in each participant's descriptions. To obtain ratio scores, we divided those word counts by the size of the pool and by the total number of dictionary words associated with each moral foundation. Fig 2 shows the mean frequency ratio for each foundation in the J-MFD obtained from the Harm-related word pool. A one-way ANOVA showed a main effect of moral foundations, indicating a significantly higher frequency of Harm words than that of words from the remaining foundations. We repeated the same analyses for the remaining pools (i.e., Fairness-, Ingroup-, Authority-, and Purity-related), and similarly found the highest frequency ratios in each pool for the corresponding moral foundation, with the main effects of foundation (F (4,1328) = 52.95, p < 0.001) (See S1-S4 Figs in Supporting Information). These results demonstrate the validity of our J-MFD.
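The frequency ratio described above can be written out as a small Python sketch. It assumes the text is already segmented into tokens, uses exact matching only (stem entries are ignored for brevity), and treats a single token list as the pool; the function name `frequency_ratios` and the toy dictionary are hypothetical and are not taken from the J-MFD.

```python
def frequency_ratios(pool_tokens, dictionary):
    """Compute a frequency ratio per moral foundation for one word pool.

    For each foundation, the number of matching tokens is divided by the
    pool size and by the number of dictionary words for that foundation.
    """
    pool_size = len(pool_tokens)
    ratios = {}
    for foundation, words in dictionary.items():
        hits = sum(1 for token in pool_tokens if token in words)
        ratios[foundation] = hits / (pool_size * len(words))
    return ratios


# Toy dictionary and pool (hypothetical entries, for illustration only).
toy_dictionary = {
    "Harm": {"守る", "傷つける"},
    "Fairness": {"公平", "平等", "正義", "不正"},
}
pool = ["守る", "傷つける", "守る", "人", "を", "公平", "に", "子供", "が", "いる"]
ratios = frequency_ratios(pool, toy_dictionary)
print(ratios)
```

Dividing by the number of dictionary words keeps a foundation with many entries from scoring higher merely because it has more chances to match.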

Relationships between moral descriptions and MFQ scores
We collected self-reported responses to the Moral Foundations Questionnaire (MFQ) [4] from the same Japanese sample to examine relationships between self-reported moral situations and MFQ scores in Japanese. This is important to show the applicability as well as the further validation of the J-MFD. To measure MFQ scores in Japanese, we used the 30-item version of the Japanese MFQ that was back-translated with the approval of the authors of the original MFQ (available at www.moralfoundations.org and in [15]). The Japanese version of the MFQ was found to have a five factor model as the MFT predicted [16] and has been used in other research on morality among the Japanese people (e.g., [17]).
Our assumption here was that people who have a high MFQ score on a certain moral foundation (e.g., Harm) may have a better-organized schema for the corresponding foundation. Thus, when asked to describe situations about the foundation, they could describe it more easily and appropriately than those who have a low MFQ score. This assumption can be tested by measuring for each of the five moral foundations, (1) how many situations they listed (Virtue and Vice combined), (2) how many words they used to describe the situations, and (3) how many foundation-related words they used to describe the situations. It should be noted that the task of mentally representing morality gets more specific in the order of (1), (2), and (3): to achieve high performance in (3), a more foundation-specific schema is required for choosing appropriate words related to the corresponding foundation. The measurement of (3) with the J-MFD is indispensable for examining the above-mentioned assumption. We investigated whether (1), (2), and (3) would correlate with MFQ scores for the corresponding moral foundations.
As for (1), the correlation of MFQ scores and the number of situations described by participants was significant for most of the foundations (Harm: r = .25, p < .01; Fairness: r = .16, p = .01; Authority: r = .14, p < .01; Purity: r = .22, p < .01) except the Ingroup foundation (r = .07, p = .19). As for (2), the correlation of MFQ scores and the number of total words included in participant-made situations was significant for all five foundations (Harm: r = .29, p < .01; Fairness: r = .20, p < .01; Ingroup: r = .12, p = .02; Authority: r = .15, p < .01; Purity: r = .22, p < .01). As for (3), the correlation of MFQ scores and the number of J-MFD words included in participant-made situations was significant for the Harm and Fairness foundations (r = .12, p = .02; and r = .11, p = .04, respectively), while there was no significant correlation for the other three foundations (Ingroup: r = -.02, p = .67; Authority: r = .04, p = .49; Purity: r = .06, p = .21). According to the results of (1)-(3), the correlation between self-reported moral descriptions by Japanese people and MFQ scores was consistent for the Harm and Fairness foundations but not for the other foundations. The implications of this finding are discussed in the next section.
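For readers who wish to run this kind of analysis on their own data, the sketch below computes a correlation coefficient in plain Python. It assumes Pearson's r, which the notation r suggests but the text does not state explicitly, and the scores and counts are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented data: MFQ scores and J-MFD word counts for four participants.
mfq_scores = [2.0, 3.0, 4.0, 5.0]
jmfd_counts = [1, 3, 2, 4]
r = pearson_r(mfq_scores, jmfd_counts)
print(round(r, 2))
```

The sketch returns only the coefficient; a significance test such as those reported above would require an additional step.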

Discussion
This work proposed a semi-automated method for translating the Moral Foundations Dictionary (MFD) and developed and validated its Japanese version (J-MFD). The J-MFD will be updated and revised through collaborative efforts by its users (https://github.com/soramame0518/j-mfd/). Our method is beneficial for developing other language versions of the MFD, which are needed because multilingual versions allow us to test the Moral Foundations Theory (MFT) in different languages and compare diverse cultures using the same basis of the MFT [1,18].
We showed that the J-MFD allows us to correctly categorize morally relevant situations in texts written in Japanese into the corresponding moral foundations, which serves as validation of the J-MFD. Furthermore, our correlation analyses showed that (1) the number of situations, (2) the number of words, and (3) the number of J-MFD words all consistently correlated with the MFQ score in the Harm and Fairness foundations, which implies the existence of a better-developed schema. People were able to describe Harm- and Fairness-related sentences easily (i.e., (1) and (2)) and accurately (i.e., (3)). However, such consistent patterns across (1) to (3) were not observed in the Ingroup, Authority, and Purity foundations, which suggests the lack of a specific schema for these foundations. A possible explanation for these results is that the Harm and Fairness foundations are more fundamental than the other foundations and are better quantified in the MFQ. In contrast, the Ingroup, Authority, and Purity foundations may be more culture-dependent, and the MFQ, as it stands, may measure these foundations inaccurately and may therefore need a modification specific to Japanese culture. These interpretations are consistent with prior research findings, which show that Harm and Fairness are central to moral judgment cross-culturally [19,20] while Ingroup, Authority, and Purity are susceptible to political ideology, ethnicity, culture, and religiosity [2,21-25]. Altogether, while our study shows that the J-MFD is a valid tool for morality research in Japanese culture, it also highlights the need for further research to scrutinize the causal relationship between MFQ scores and the use of morally relevant words in multilingual settings, not only in Japanese.
It is important to note the overrepresentation of male participants in our sample, which featured 238 males and 148 females. Previous research using internationally diverse samples has found that females were more likely to show higher scores on the Harm, Fairness, and Purity foundations than males, with males scoring higher than females on the Ingroup and Authority foundations in the MFQ [4]. Our results should therefore be interpreted with this imbalance in mind.
This work contributes to interdisciplinary collaborations in morality research across academic fields. First, the MFT can be tested using NLP with the J-MFD by analyzing the language that people use online [26]. In addition to the word-counting approach conducted in this study, NLP allows the J-MFD to be used for word co-occurrence analysis [27] and latent semantic analysis [8,11]. Second, the J-MFD can be used in combination with a Japanese emotion dictionary because specific emotions are often associated with moral judgment [28]. Third, morality matters in the business world as well; thus, it is important to investigate morality from the perspective of both leaders/employees in organizations and consumers [26,29].
Finally, the J-MFD allows future research to scrutinize theoretical frameworks about the standards of moral judgment in various fields of study; this would enrich cross-cultural research by facilitating comparison of morality-related texts in English and Japanese. We are aware that the J-MFD, like the original MFD, is not an inclusive dictionary; thus, we had to combine Virtue and Vice categories for words with low occurrence frequency. Both dictionaries might be improved by adding more relevant words [10,11]. The expansion of moral words in the J-MFD is critical to improving accuracy in measuring morality from written texts, and it is one of our key future goals.