Development and initial psychometric assessment of the race-related attitudes and multiculturalism scale in Australia

Aim The present study aims to develop the Race-related Attitudes and Multiculturalism Scale (RRAMS), as well as to perform an initial psychometric assessment of this instrument in a national sample of Australian adults. Methods The sample comprised 2,714 Australian adults who took part in the 2013 National Dental Telephone Interview Survey (NDTIS), which includes a telephone-based interview and a follow-up postal questionnaire. We used Exploratory Factor Analysis (EFA) to evaluate the RRAMS’ factorial structure (n = 271) and then proceeded with Confirmatory Factor Analysis (CFA) to confirm the proposed structure in an independent sample (n = 2,443). Measurement invariance was evaluated according to sex, age and educational attainment. Construct validity was assessed through known-groups comparisons. Internal consistency was assessed with McDonald’s ΩH and ordinal α. Multiple imputation by chained equations was adopted to handle missing data. Results EFA indicated that, after excluding 4 out of the 12 items, a two-factor structure provided a good fit to the data. This configural structure was then confirmed in an independent sample by means of CFA (χ2(19) = 341.070, p<0.001, CFI = 0.974, RMSEA = 0.083; 90% CI [0.076, 0.091]). Measurement invariance analyses suggested that the RRAMS items can be used to compare men/women, respondents with/without tertiary education and young/older participants. The “Anglo-centric/Assimilationist attitudes” (ΩH = 0.83, αORDINAL = 0.85) and “Inclusive/Pluralistic attitudes” subscales (ΩH = 0.77, αORDINAL = 0.79) showed adequate reliability. Men and participants with low education had higher Anglo-centric/assimilationist attitudes and lower inclusive/pluralistic attitudes, suggesting construct validity. Conclusions The RRAMS appears to be a valid and reliable measure to evaluate multiculturalist attitudes in the Australian context. 
The instrument may be useful in the assessment and monitoring of interventions aiming to promote multiculturalist inclusive attitudes and to increase social cohesion in Australia.

Introduction Racism emerges whenever social and individual values, norms and practices of a given group are considered superior to others'. Racism occurs with the particular aim of creating, maintaining or reinforcing power imbalances, as well as the corresponding inequalities in opportunities and resources along racial lines [1]. Similar to most contemporary societies, Australia is characterized by co-existing expressions of cultural diversity on the one hand, and negative impacts of racism on social cohesion on the other [1]. In Australia, the mental health costs directly attributable to racism have been estimated at 235,452 disability-adjusted life years lost, which is equivalent to an average $37.9 billion in productivity loss per annum, or 3% of the Australian annual Gross Domestic Product (GDP) over 2001-2011 [2]. Such a strong relationship is an indication that racism may erode the very social fabric of the Australian society by producing mental disorders and suffering, which unevenly impacts upon racially marginalized groups.
Social conceptions that shape intergroup relations form the common ground upon which intergroup attitudes and discriminatory behaviour take place [3]. From an empirical viewpoint, findings suggest that racist attitudes are associated with racist behaviours and racial-ethnic minorities' experiences of discrimination [4]. Positive attitudes towards diversity, however, are negatively associated with discriminatory behaviour [5]. In this study, we propose to explore attitudes in relation to multiculturalism, a construct of special relevance to the social, economic and political fabric of contemporary Australia [6]. We focus on multiculturalism as an ideology of acknowledging and celebrating ethnic and cultural differences, in which the need for preserving cultural identities is recognized [7]. It reflects a "sensibility and [a] disposition towards cultural differences among large sections of the population" [8]. Data from the 2016 Australian Census revealed that one in three Australians were born overseas, and a similar proportion of individuals speak a language other than English at home. Nevertheless, assimilationist attitudes (expectations of conformity to the dominant culture) often prevail, as opposed to multiculturalist perspectives that accept and praise racial and ethnic-cultural diversity [9]. Understanding attitudes to multiculturalism can help unveil the dynamics of racism and discrimination against minorities in the country, fostering public debate and policy formulation aimed at promoting positive intergroup relations [10].
Research on ethnic-racial intergroup attitudes draws from theories on ideological attitudes that explain group-based dominance and social cohesion [11][12][13]. Social Dominance Orientation (SDO), for example, reflects the degree to which respondents believe that hierarchy-based dominance between social groups is natural [14]. Discrimination against minorities, therefore, can be explained by the degree of endorsement of the notion that group-based hierarchies are natural and inevitable [14]. Endorsement of group-based dominance and out-group prejudice tends to increase among those who highly identify with the dominant group, as they represent a mechanism of maintaining the in-group status quo [12].
Research on ethnic-racial intergroup relations in contemporary societies has also explored the Right-wing Authoritarianism (RWA) concept [15][16][17]. RWA is characterized by the endorsement of social conservative values, morality, collective security, group-based social cohesion, and strict obedience to social authorities [15,17]. Those who endorse RWA values can be more sensitive to threats to social stability, being prone to conservative values as to increase their perception of control and collective security [18]. Perception of threat has been shown to mediate the association between group identification and attitudes towards multiculturalism [11]. Those that consider immigrants or ethnic-racial minorities as a threat to the control of resources or maintenance of the dominant social values tend to endorse more conservative/assimilationist attitudes towards multiculturalism [11,19].
Sustaining the dominant group's status quo can also be achieved by not acknowledging ethnic-racial inequalities in the population. The so-called colour-blind racial ideology denies the existence of racism and justifies racial inequalities as a result of personal decisions, meritocratic achievements, and market forces [20,21]. By denying racist practices and racial inequalities, it provides the discursive tools to downplay policy proposals aimed at promoting racial justice and therefore maintains the power imbalance between ethnic-racial groups [20]. Following this perspective, public denial of racism has been pointed to as an obstacle to a deeper commitment to multiculturalism in Australia [13,22]. Although the existence of racism is acknowledged, most Australians fail to recognise the existence of Anglo-privilege, a necessary step in reducing the imbalance in resource distribution and political representation among ethnic-racial groups [13].
Taken together, the results mentioned above point to the centrality of properly assessing the different facets of intergroup attitudes towards multiculturalism as to inform public debate and contribute to prevent and counteract discrimination. It is important to note that the majority of the available scales used to assess race-related attitudes have been developed and psychometrically examined among U.S. populations [7]. These tools may not be relevant or provide valid/reliable estimates of race-related attitudes in non-US contexts, though, given the considerable contextual dependency of racism. Historiographic and sociological accounts of racial dynamics usually emphasize Australian specificities in terms of colonization, past and contemporary immigration policies, and patterns of cultural diversity as key aspects.
Australia is a settler society that started with a policy of Anglo-celtic migration only. This was later expanded to include migrants from other European-backgrounds (e.g., Greeks, Italians), having only in the 1980s opened its borders to migrants of Asian and Middle-Eastern descent. These and other specificities (e.g., limited involvement in slave trade) cast serious doubts on the idea of simply adapting tools developed in a range of different countries to the Australian context. Just like other multiculturalist societies, including Canada and New Zealand, multiculturalism was debated at a national level as a state-policy in the 1970s. Backlashes from conservative sectors, nonetheless, contributed to prioritise an assimilationist perspective on the implementation of multiculturalism values in society. Australia has also historically dispossessed and oppressed the native Aboriginal Australians since British colonization with ongoing effects until present [23]. Our study does not focus on colonisation and racism faced by Aboriginal Australians as the unique features of these experiences can be diminished when considered under the umbrella of multiculturalism [24].
To the best of our knowledge, two measurement instruments that provide information on racial, ethnic, and cultural acceptance (i.e. race-related and multiculturalist attitudes) have been previously developed and assessed in Australia [7,25]. While the first has focused on intercultural understanding among teachers and students in schools [25], psychometric evaluation of the second was carried out in relatively young and convenience samples of primary and secondary school students (all younger than 15 years-old residing in Victoria) and community members (mean age of 23 years-old with 70% residing mainly in Victoria), which limits their applicability at a national level and among older age groups. Therefore, neither an integrated picture of attitudes towards multiculturalism across the country has yet been delineated, nor a range of strategies to advance racial equity based on this knowledge have been proposed.
The present study proposes the Race-related Attitudes and Multiculturalism Scale (RRAMS) as a measure of attitudes towards multiculturalism. The items were formulated to reflect social ideologies and collective beliefs that potentially influence ethnic-racial intergroup attitudes. The aim of this study was to verify its applicability to the Australian context by assessing the extent to which the RRAMS provides a valid and reliable measurement of multiculturalist attitudes in a sample of Australian adults across all states and territories. In particular, the internal validity of the RRAMS was assessed in terms of its configural structure (i.e., the number of underlying factors), metric properties (i.e., the magnitude of factor loadings), as well as measurement invariance (i.e., whether it allowed meaningful comparisons across sociodemographic characteristics). External validity of the RRAMS was then assessed in terms of its construct validity.

Study design and participants
This was an Australian population-based study, with data obtained from the 2013 National Dental Telephone Interview Survey (NDTIS), which includes a telephone-based interview and a follow-up postal questionnaire. The NDTIS has been carried out periodically by the University of Adelaide since 1994, and comprises a large national sample of Australian residents aged 5 years and over. The NDTIS is a random sample survey that collects information on the dental health and use of dental services of Australians in all states and territories. The survey also collects data on social determinants of oral health and wellbeing, which include detailed information on sociodemographic factors, such as household income, education, country of birth, remoteness of location and main language spoken at home. For the 2013 survey, an overlapping dual sampling frame design was adopted. The first sampling frame was created from the electronic product 'Australia on Disc 2012 Residential;' an annually updated electronic listing of people/households listed in the White Pages across Australia. Both landline and mobile telephone numbers were provided on records where applicable.
A stratified two-stage sampling design was used to select a sample of people from this sampling frame. Records listed on the frame were stratified by state/territory and region, where region was defined as Capital City/Rest of State. A systematic sample of records was selected from each stratum using specified sampling fractions [26]. To include households that were not listed in the White Pages, a second sampling frame comprising 20,000 randomly generated mobile telephone numbers was used. This sampling frame was supplied by Sampleworx and the mobile telephone numbers were created by appending randomly generated suffix numbers to all known Australian mobile prefix numbers. As the mobile numbers did not contain address information, the sampling frame could not be stratified by geographic region. A random sample of mobile numbers was selected from the frame and contacted to establish the main user of the mobile phone. This person was asked to participate in the telephone interview, provided that they were aged 18 years or over. All participants provided verbal consent to participate in the survey and datasets were de-identified to ensure anonymity [26].
Following the completion of the telephone interview survey, participants were invited to respond to the postal questionnaire component. Those who agreed were sent a covering letter with the questionnaire and reply-paid envelope enclosed. A reminder postcard was sent two weeks later, with, if necessary, two additional follow-up letters/questionnaires sent subsequent to the postcard. A total of 6,340 Australian adults aged 18+ years took part in the 2013 NDTIS, with 2,935 (46.3%) completing the follow-up postal questionnaire. Sample characteristics are displayed in Table 1. Two thirds of the sample were 45 to 98 years old and had Technical and Further Education (TAFE) or went to university. Women corresponded to 60.3% of the sample. The majority of participants were born in Australia (76.7%), 12.8% were originally from Europe and 10.5% from the other continents (Asia, Africa and the Americas).

Ethical approval
Ethical approval for the study was granted by the University of Adelaide's Human Research Ethics Committee (approval number HS-2013-036).
Phase 1: Item development. The RRAMS was developed by a group of researchers with expertise on the topics of racism, multiculturalism, and race-related attitudes in Australia. To ensure content validity [30], the scale was based on large surveys carried out in the country that were co-designed by the abovementioned group of researchers. These include the 2015-16 Challenging Racism Project [31] and the 2013 survey of Victorians' attitudes to race and cultural diversity [32]. The initial item development phase consisted of designing items that reflect the different social ideologies that encompass multiculturalism and race-related attitudes. Discussions among the panel of experts were held until consensus was reached that the items covered a range of theoretical perspectives underpinning the construct of interest. A second group of experts, not involved in the first development phase, was then consulted for feedback on the comprehensiveness and clarity of the items.
The final RRAMS was proposed as comprising two subscales. The first subscale included six items reflecting theories and social ideologies in agreement with "Anglo-centric/Assimilationist attitudes." It included items reflecting alignment with RWA (e.g., 'We need to stop spreading dangerous ideas and stick to the way things have always been done in Australia'), agreement with SDO (e.g., 'It is okay if some racial or ethnic groups have better opportunities in life than others'), endorsement of colour-blind racial ideology (e.g., 'We shouldn't talk about racial or ethnic differences'), zero-sum racist thinking (e.g., 'Racial or ethnic minority groups take away jobs from other Australians'), and endorsement of assimilationist ideology (e.g., 'People from racial or ethnic minority groups should behave more like mainstream Australians').
The second subscale comprised six items assessing agreement with "Inclusive/Pluralistic attitudes." It included low compliance with RWA (e.g., 'Some of the best people in our country are those who are challenging our government and ignoring the 'normal' way things are supposed to be done'), low SDO (e.g., 'We should do what we can to create equal conditions for different racial or ethnic groups'), acknowledgment of racism (e.g., 'People from racial or ethnic minority groups experience discrimination in Australia'), acknowledgment of white privilege (e.g., 'Australians from an Anglo background (that is, of British descent) enjoy an advantaged position in our society'), and endorsement of multiculturalism (e.g., 'People from racial or ethnic minority groups benefit Australian society'). Besides their theoretical relevance, these constructs have been found to be acceptable and appropriate for assessing population race-related attitudes in previous national studies in Australia [31,32]. Response options for each item were 'strongly disagree' (0), 'disagree' (1), 'neither agree nor disagree' (2), 'agree' (3) and 'strongly agree' (4).
Phase 2: Identification of a potential factorial structure. Since the RRAMS was conceptualized to measure agreement with both conformity to the dominant ethnoculture ("Anglo-centric/Assimilationist attitudes") and agreement with promotion of ethnic diversity ("Inclusive/Pluralistic attitudes"), an Exploratory Factor Analysis (EFA) was initially run to empirically test this assumption (i.e., that a two-factor solution would underlie the set of items). The factorial solution suggested by the EFA was then confirmed by means of a Confirmatory Factor Analysis (CFA) [33] in an independent sample to avoid capitalization on chance [34,35]. We randomly divided the NDTIS sample into one group for the EFA and another group for the CFA; see Table 1 for the distribution of each subsample according to sociodemographic characteristics. Considering that a sample size with at least 200 participants is sufficient for EFA under normal conditions (medium communalities and at least three items loading on each factor) [36] and CFA has higher sample requirements, 271 participants from the original survey were randomly selected for the EFA.
Factor retention relied on Scree Plot [37] criteria and Parallel Analysis (PA) [38]. In the PA, 1,000 random and resampled datasets with the same number of RRAMS items and respondents were generated. The rationale of the PA is that meaningful factors extracted in the EFA should explain more variance than factors extracted from random data.
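The retention rule behind parallel analysis can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the use of Pearson correlations on the full correlation matrix, the column-wise permutation scheme and the 95th-percentile threshold are all assumptions made for illustration.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlations between the columns of a list-of-rows dataset."""
    n, k = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(k)]
    means = [sum(c) / n for c in cols]
    devs = [[x - m for x in c] for c, m in zip(cols, means)]
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def eigenvalues(sym, sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in sym]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-18:  # off-diagonal mass is negligible: converged
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1))
                c = 1 / math.sqrt(t * t + 1)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                for k in range(n):  # rotate rows p and q
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return sorted((a[i][i] for i in range(n)), reverse=True)


def parallel_analysis(data, n_sims=200, percentile=0.95, seed=0):
    """Count leading factors whose observed eigenvalue exceeds the chosen
    percentile of eigenvalues obtained from permuted copies of the data."""
    rng = random.Random(seed)
    observed = eigenvalues(correlation_matrix(data))
    k = len(observed)
    cols = [[row[j] for row in data] for j in range(k)]
    sims = [[] for _ in range(k)]
    for _ in range(n_sims):
        # Permuting each item independently removes inter-item correlations
        # while keeping each item's response distribution.
        shuffled = [rng.sample(c, len(c)) for c in cols]
        rows = [list(r) for r in zip(*shuffled)]
        for i, ev in enumerate(eigenvalues(correlation_matrix(rows))):
            sims[i].append(ev)
    idx = min(int(percentile * n_sims), n_sims - 1)
    thresholds = [sorted(s)[idx] for s in sims]
    retained = 0
    for obs, thr in zip(observed, thresholds):
        if obs <= thr:
            break
        retained += 1
    return retained, observed, thresholds
```

With ordinal items, polychoric rather than Pearson correlations may be preferable; the sketch only conveys the comparison between observed and chance-level eigenvalues.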
Factor extraction was conducted with Maximum Likelihood [39] and oblique rotation ("direct oblimin") [40]. Items with non-salient factor loadings (<.40) were deleted. Additionally, 100 bootstrapped samples were used to generate 95% confidence intervals for the factor loadings [41].
Phase 3: Confirmation of the factorial structure in an independent sample. After a factorial structure was derived from the EFA, the instrument was assessed using CFA in an independent sample (n = 2,443). The estimation method was Weighted Least Squares [42], with a mean- and variance-adjusted (WLSMV) test statistic [43]. Missingness of individual item responses ranged from 0.9% to 2.2%, and this was handled with multiple imputation of 20 datasets using the fully conditional specification method [44]. We imputed information for individuals who responded to at least one item of the RRAMS (n = 2,714). Rubin's rules [45] were used to pool point estimates and standard errors (SE). To evaluate model fit, the scaled χ2 was used to test the hypothesis of exact fit. Additionally, we used approximate fit indices, namely the scaled Comparative Fit Index (CFI) and the scaled Root Mean Squared Error of Approximation (RMSEA); for simplicity, the term 'scaled' is omitted from now on. Values of CFI ≥ 0.96 and RMSEA ≤ 0.05 indicate good model fit [46], while 0.05 < RMSEA ≤ 0.10 indicates acceptable fit [35].
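To make the approximate-fit criterion concrete, the sketch below computes the RMSEA from a χ2 statistic, its degrees of freedom and the sample size, using the conventional single-group formula. This is an illustration under stated assumptions: the scaled RMSEA reported with WLSMV estimation differs in detail, some software divides by N rather than N - 1, and the cut-offs of 0.05 and 0.10 coded here are the conventional values assumed for the example. The function names are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Conventional RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def classify_rmsea(value, good=0.05, acceptable=0.10):
    """Label a RMSEA value using the assumed conventional cut-offs."""
    if value <= good:
        return "good"
    if value <= acceptable:
        return "acceptable"
    return "poor"


# Example with a chi-square of 341.070 on 19 degrees of freedom and n = 2,443.
value = rmsea(341.070, 19, 2443)
print(round(value, 3), classify_rmsea(value))  # 0.083 acceptable
```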
Since factorial structures derived from EFA do not necessarily imply good fitting CFA models (e.g. due to cross-loadings or residual correlations) [47], in case the factorial structure had a poor fit, model re-specifications were informed by standardized residuals, Modification Indices (MI) and the Standardized Expected Parameter Change (SEPC) [48]. Completely standardized solutions were reported throughout the paper.
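The pooling step in Phase 3 can be sketched as follows. This is a minimal illustration of Rubin's rules for a single scalar parameter, not the code used in the study; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, standard_errors):
    """Pool one parameter across m imputed datasets with Rubin's rules.

    Returns the pooled point estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in standard_errors])  # average sampling variance
    between = variance(estimates)                       # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical factor loading estimated in three imputed datasets.
estimate, se = pool_rubin([0.50, 0.52, 0.48], [0.10, 0.10, 0.10])
```

The between-imputation term inflates the pooled standard error, so that uncertainty due to missing data is carried into the final estimate.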
Phase 4: Analysis of measurement invariance. An initial Multigroup CFA [49] was conducted to check whether the same configural structure would hold for all sex-, age- and education-based groups, i.e., whether configural invariance could be confirmed with the data at hand. The χ2, CFI and RMSEA and their previously described cut-off points were used to evaluate configural invariance. The second level of measurement invariance, metric invariance, was assessed to ascertain whether factor loadings were similar across the same groups. The final test, scalar invariance, was used to determine whether item thresholds were equal across sex, age and education. Given that scalar models are nested within metric models, and metric models are nested within configural models, metric and scalar invariance were evaluated through a Likelihood Ratio Test (LRT), namely the Δχ2 [50]. The Δχ2 statistic was computed in each imputed dataset and pooled according to the recommendations of Li, Meng [51] (i.e. the D2 statistic). When the Δχ2 was statistically significant, the ΔCFI [52] was used to evaluate the magnitude of the difference. Models with ΔCFI ≤ -.002 indicated lack of invariance [53]. Whenever measurement invariance was not achieved, tests of partial invariance were conducted [54].
Phase 5: Reliability. Internal consistency was assessed with McDonald's ΩH and ordinal α [56]. McDonald's ΩH has two advantages over the traditional and widely used Cronbach's α: it assumes neither (1) tau-equivalence nor (2) a congeneric model without correlated errors (i.e. locally independent items) [57]. Furthermore, the ordinal α is reported given that Cronbach's α underestimates reliability in ordinal Likert scales. Adequate methods for calculating ordinal α confidence intervals are not available [58].
Phase 6: Item reduction analysis. In the item reduction analysis, we evaluated inter-item correlations, corrected item-total correlations (CITC) and item difficulties. Inter-item correlations indicate the extent to which all items on a scale are examining the same construct without redundancy. Thus, inter-item correlations should be moderate (i.e. items that measure the same construct but also have unique variances) and items with correlations lower than .20 were considered for deletion [59]. The next step was the evaluation of CITC. One important aspect in instrument development is achieving a good balance between a small number of items (lengthy questionnaires can induce lower response rates [60]) and adequate reliability. A recent study by Zijlmans, Tijmstra [61] showed that the CITC [62] performed better than other methods at identifying which items can be removed while maximizing reliability. Therefore, items with the lowest CITC should be the first to be considered for removal. The corrected item-total correlation needs to be calculated within subscales, since items can only be summed into a total score when they measure the same construct [63]. For this reason, CITCs were calculated after the factorial structure was established (i.e. we had no prior information about which item belonged to which subscale to calculate corrected total scores). Given the ordinal nature of the data, the inter-item correlations and CITCs were investigated with non-parametric Kendall's τ [64].
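As an illustration of the CITC computation described above, the following Python sketch correlates each item with the sum of the remaining items of its subscale using Kendall's τ. It assumes the tau-b variant (which corrects for ties, common in Likert-type data) and a complete response matrix; the function names are hypothetical and this is not the code used in the study.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equally long sequences (O(n^2) pair scan)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied on both variables: not counted
            if dx == 0:
                ties_x += 1           # tied on x only
            elif dy == 0:
                ties_y += 1           # tied on y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x) *
                      (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


def corrected_item_total(responses):
    """CITC per item: correlation of the item with the sum of the other items.

    `responses` is a list of rows, one row of item scores per respondent,
    restricted to the items of a single subscale.
    """
    n_items = len(responses[0])
    citc = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total score without item j
        citc.append(kendall_tau_b(item, rest))
    return citc
```

Items with the lowest values returned by `corrected_item_total` would be the first candidates for removal under the rule stated above.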
Finally, due to the limitations of classical difficulty indices such as the p-value (i.e. proportion of correct responses given the total score) [65], we evaluated item difficulty with the LI IRF , the location index based on the item-response function [66]. The LI IRF is calculated based on the item locations (β i ), which are a well-known reparameterization of item thresholds (τ i ) of adjacent i and i +1 response categories [67]. The LI IRF indicates the value of the latent trait in which respondents have an average score of half the maximum item score. For example, in a 5-point rating scale (items ranging from 0 = Strongly Disagree to 4 = Strongly Agree), the LI IRF indicates the level of inclusive/pluralistic attitudes required for participants to score on average 2 (2 = Neutral). In our study, the LI IRF was chosen over item thresholds (τ i ) to convey item difficulty because of two advantages: the interpretation of the LI IRF is (a) easier, since it is a single index compared to four thresholds per item; and (b) more substantive, since it is based on the latent trait ("Anglo-centric/Assimilationist attitudes" or "Inclusive/Pluralistic attitudes") rather than on the latent response variables [68]. Nonetheless, for the sake of completeness, we also reported the item thresholds (τ i ).
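The definition of the LI IRF given above (the latent trait value at which the expected item score equals half the maximum score) can be sketched numerically. The example below is an illustration under assumptions that are not taken from the study: a probit link, a completely standardized solution in which the residual variance of the latent response equals 1 - λ², a positive loading, and a simple bisection search. The function names and parameter values are hypothetical.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def expected_item_score(theta, loading, thresholds):
    """Expected score of an ordinal item at latent trait value `theta`.

    The expected score equals the sum over thresholds of P(score >= k | theta).
    """
    resid_sd = (1 - loading ** 2) ** 0.5
    return sum(_PHI((loading * theta - tau) / resid_sd) for tau in thresholds)


def location_index(loading, thresholds, lo=-10.0, hi=10.0, tol=1e-8):
    """Latent trait value at which the expected score is half the maximum score.

    Assumes a positive loading, so the expected score increases with theta.
    """
    target = len(thresholds) / 2  # e.g. 2 for an item scored 0 to 4
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_item_score(mid, loading, thresholds) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical 5-point item: symmetric thresholds give a location of zero.
li = location_index(0.7, [-1.5, -0.5, 0.5, 1.5])
```

A positive value indicates that an above-average level of the latent trait is needed to reach the scale midpoint, which corresponds to a "harder" item in the sense used here.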
Phase 7: Construct validity. To evaluate the RRAMS' construct validity, we investigated known-groups validity according to sex, education and age. Known-groups validity compares the levels of the constructs in different groups (e.g. men compared to women) and should be applied when it is known, theoretically or due to previous empirical research, that these groups differ on the variable of interest. Therefore, known-groups validity can inform whether the instrument is able to discriminate between two groups that are known to be different regarding the construct (e.g. individuals with more education have more inclusive attitudes). Investigation of known-groups validity is important in many instances, such as when there is no "gold-standard" method of measurement to which the instrument can be compared [69]. That is, since there is no "gold-standard" or established (based on robust psychometric evidence) instrument to measure race-related attitudes and multiculturalism in Australia, it is not possible to define what would constitute a good measure for the RRAMS to display convergent validity with. Furthermore, in our case, there is previous evidence of groups that are known to differ according to multiculturalism and race-related attitudes. For example, as multiculturalism can be perceived as identity-threatening by dominant group members [11,19], we expected men to have more conservative attitudes towards multiculturalism when compared to women [22,70]. The same pattern was expected for older participants (>45 years old) when compared to younger respondents [22,70,71]. Participants with a university degree, in turn, were expected to be more supportive of multiculturalism than those with lower educational attainment. 
This hypothesis is in accordance with previous findings showing that a sense of security (economic, personal, and cultural), higher education and younger age were associated with more positive attitudes towards multiculturalism and less exclusionary attitudes [22,70,71]. Therefore, sex, age and education were chosen as the exogenous variables for the evaluation of known-groups validity. To assess known-groups validity, latent mean differences were calculated by constraining the latent means in one of the groups (i.e. women and participants with higher education) to zero, so this group would function as a reference group. Considering that latent variances were constrained to one in the completely standardized solution, latent mean differences are interpreted as effect sizes analogous to Cohen's d [72,73]. Finally, we employed the Empirical Bayes model [74] to estimate factor scores, which were plotted using Kernel density [75] to inform not only the average but also the distribution of the latent trait according to groups.

Identification of a potential factorial structure
Investigation of the Scree Plot and PA indicated that 2 factors substantially explained more variance than factors extracted from randomly generated data (Fig 1).
It should be noted that, although the third factor accounted for more variance than the third factor extracted from the random datasets, the difference was trivial. For this reason, only two factors were retained. The next step consisted of the evaluation of factor loadings (Table 2). Results showed that Item 2 ("Some of the best people in our country are those who are challenging our government and ignoring the 'normal' way things are supposed to be done"), Item 3 ("It is okay if some racial or ethnic groups have better opportunities in life than others") and Item 6 ("We shouldn't talk about racial or ethnic differences") did not have substantial factor loadings (i.e., loadings above .40) and were therefore excluded. Of the items retained at this stage, Item 5 had the smallest factor loading (λ2 = 0.440; 95% CI [0.220, 0.610]). After deletion of these three items and EFA re-analysis, the two-factor solution achieved simple structure. This time, however, Item 5 did not achieve a substantial factor loading (λ2 = 0.390; 95% CI [0.180, 0.590]) (S1 Table); that is, the factors explained only 19% of the variance of item responses ("communality"), while 81% of the variance was explained by other sources ("uniqueness"), such as measurement error. For this reason, Item 5 was also excluded from the analysis.

Confirmation of the factorial structure in an independent sample
The two-factor model was then fitted and its fit examined (χ2(19) = 341.070, p<0.001, CFI = 0.974, RMSEA = 0.083; 90% CI [0.076, 0.091]). Since the null hypothesis of exact fit was rejected (χ2(19) = 341.070, p<0.001), we proceeded with indices of approximate fit. The CFI indicated a good fit to the data (>.960), while the RMSEA was acceptable (0.05 < RMSEA ≤ 0.10). Residual correlations are displayed in S2 Table. Considering the overall good fit of the model and that all items exhibited substantial factor loadings (Table 3), the two-factor model with 8 items was accepted. "Anglo-centric/Assimilationist attitudes" (e.g. "Racial or ethnic minority groups take away jobs from other Australians") was regarded as the first subscale, whereas the second subscale assessed agreement with "Inclusive/Pluralistic attitudes".

Analysis of measurement invariance
Next, measurement invariance by sex, education and age was evaluated (Table 4). Regarding sex, the LRT indicated that the metric model was not statistically different from the configural [...] better. When scalar invariance was evaluated, the pooled Δχ2 was negative for both education- and age-based groups. Although a negative Δχ2 is not interpretable (and, therefore, values were set to zero), these negative values can occur when the difference between models is small [76]. For this reason, the threshold constraints were regarded as tenable [77] and provided indirect support for scalar invariance.

Reliability
The

Item reduction analysis
Inter-item correlations ranged from 0.29 to 0.56 (Supplementary 3) and no correlations were lower than 0.20. The CITCs ranged from 0.39 to 0.58. Within the "Anglo-centric/Assimilationist attitudes" subscale, the easiest item was "We need to stop people spreading dangerous ideas and stick to the way things have always been done in Australia" (LI IRF = 0.00), while the hardest item was "Racial or ethnic minority groups take away jobs from other Australians" (LI IRF = 0.72) (Table 3). That is, with respect to Item 10, respondents needed to have 0.72 standard deviations more Anglo-centric/assimilationist attitudes than the average Australian to produce an expected score of 2 out of 4. Item 10 was the hardest item in the "Anglo-centric/Assimilationist attitudes" subscale since its endorsement required more Anglo-centric/assimilationist attitudes than the other items. Within the "Inclusive/Pluralistic attitudes" subscale, the easiest item was "We should do what we can to create equal conditions for different racial or ethnic groups" (LI IRF = -1.58), while the hardest item was "People from racial and ethnic minority groups experience discrimination in Australia" (LI IRF = -0.80). The hierarchy of item difficulties was identical when average item thresholds (τ̄) were inspected (S4 Table).

Construct validity
Examination

Discussion
The current study aimed to present the RRAMS as a measure of attitudes towards multiculturalism in Australia and to examine some of its psychometric properties using data from a nationwide sample. Results provided initial evidence that the two subscales of "Anglo-centric/Assimilationist attitudes" and "Inclusive/Pluralistic attitudes" are valid and reliable for the Australian population. In the initial stage of psychometric assessment, we identified poorly performing items, and these were excluded. One of these was Item 2 ("Some of the best people in our country are those who are challenging our government and ignoring the 'normal' way things are supposed to be done"), an item originally designed to reflect RWA in relation to multiculturalism. Despite its original purpose, Item 2 might not reflect the cultural and race-related topic in question. This is one possible explanation for why the responses to this item were not strongly influenced by respondents' Inclusive/Pluralistic attitudes towards multiculturalism (only 12% of the variance was explained by the supposedly corresponding factor). For instance, the wording "challenging our government" can be interpreted as referring to a general debate, not necessarily reflecting ethnic-racial differences in political representation and resource distribution. Future studies might test the item fit by emphasizing 'challenging our government' as pressuring for a political agenda that prioritizes reducing social inequalities among ethnic-racial groups and promoting a pluralistic society. Items 3 ("It is okay if some racial or ethnic groups have better opportunities in life than others") and 6 ("We shouldn't talk about racial or ethnic differences") also performed poorly and failed to capture assimilationist views. Item 3 was designed to reflect respondents' SDO. It was hypothesized that participants with high SDO, and thus assimilationist views of multiculturalism, would endorse the item.
Contrary to expectations, these respondents might have interpreted the phrasing 'some racial or ethnic groups' as a reference to ethnic-racial minorities. Conservatives might perceive affirmative action and social assistance policies as privileges and can endorse the notion that minorities 'have it easy.' Conservative attitudes such as RWA and SDO have been linked to social and economic conservatism, reflecting ideologies of competition and meritocracy [78]. The ambiguity left by the item wording can thus explain its failure to discriminate assimilationist attitudes. Item 6, in turn, might not have worked in its subscale because, again contrary to our hypothesis, respondents with high assimilationist views might be willing to discuss racial and ethnic differences with the intent of promoting assimilationist and racist views [79]. Therefore, the item performed poorly, as respondents in the different strata of assimilationist attitudes could be prone to endorse the item for different reasons. The last deleted item was Item 5 ("Australians from an Anglo background [that is, of British descent] enjoy an advantaged position in our society"). One possible explanation for the item's poor performance is that the recognition of privilege does not necessarily indicate inclusive/pluralistic attitudes. For example, a previous study in the Australian states of Queensland and New South Wales showed these as two independent dimensions [9]. The poor loading on the inclusive attitudes subscale suggests that respondents might not link acknowledgment of white privilege to notions of a pluralistic society. Taken together, these results potentially indicate that debates over multiculturalism in Australia need to promote awareness of the connection between Anglo-privilege and racism. Scholars advocate that challenging racism and privilege is a necessary step towards promoting the abandonment of assimilationist views in favour of more inclusive perspectives [9,13].
The subscales "Anglo-centric/Assimilationist attitudes" and "Inclusive/Pluralistic attitudes" achieved metric invariance and scalar invariance according to sex. Furthermore, the two subscales achieved metric invariance according to education and the results also (indirectly) supported scalar invariance. That is, "Anglo-centric/Assimilationist attitudes" and "Inclusive/Pluralistic attitudes" influenced the item responses the same way in each group (metric invariance) and the items were not more difficult for one group compared to another (scalar invariance). The RRAMS items can thus be used to compare men/women, participants with/without tertiary education and young/older participants, and the scores will reflect true differences regarding "Anglo-centric/Assimilationist attitudes" and "Inclusive/Pluralistic attitudes" rather than measurement bias [35].
After ensuring measurement invariance between subgroups, we compared the factor scores between men and women, participants with and without tertiary education, and participants up to and over 45 years of age. The strongest predictor of assimilationist and inclusive attitudes was education, while sex also influenced both constructs. Furthermore, older individuals were more likely to have higher assimilationist attitudes. The role of education in promoting inclusive/pluralistic attitudes has been previously established [22,70] and suggests education as an important target for future interventions aimed at promoting multiculturalism in Australia. The results also indicated that men and older individuals had stronger assimilationist attitudes in comparison with women and younger counterparts [71]. In general, the associations of the two subscales with sex, education, and age conformed to theoretical expectations and provide further evidence of the RRAMS' construct validity.
With regard to reliability, the "Anglo-centric/Assimilationist attitudes" and "Inclusive/Pluralistic attitudes" subscales showed adequate reliability (>.70) [80], since values between .70 and .80 are considered appropriate for research purposes [81]. If the RRAMS is used in the future in high-stakes scenarios (i.e. where decisions need to be made based on scale scores) [82], new items should be developed to increase reliability.
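To give a sense of how many additional items higher reliability might require, the sketch below applies the Spearman-Brown prophecy formula. This is an illustration rather than part of the analyses reported here: the formula assumes that new items are parallel to the existing ones, the target of .90 is an assumed high-stakes benchmark, and the starting value is the ΩH of 0.77 reported for the "Inclusive/Pluralistic attitudes" subscale.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when the scale is lengthened by `length_factor`.

    Assumes added items are parallel to the existing ones.
    """
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


def required_length_factor(reliability: float, target: float) -> float:
    """Length factor needed to reach `target` reliability (Spearman-Brown solved for k)."""
    return target * (1.0 - reliability) / (reliability * (1.0 - target))


current = 0.77  # reported for the Inclusive/Pluralistic attitudes subscale
target = 0.90   # assumed benchmark for high-stakes use (not taken from this study)

factor = required_length_factor(current, target)
print(round(factor, 2))
```

Under these assumptions, the subscale would need to be a little under three times its current length to reach the assumed target, which is consistent with the recommendation to develop new items before any high-stakes use.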
In the item reduction analysis, all items displayed moderate inter-item correlations and CITC, so no items needed to be removed. The item with the smallest CITC was Item 7 ("People from racial or ethnic minority groups benefit Australian society"), followed by Item 4 ("We should do what we can to create equal conditions for different racial or ethnic groups."). Since reliability was only modest, we considered that further shortening the scale would be more detrimental in terms of reliability and content validity than beneficial as a means of creating a briefer measure. In addition, with the exception of Item 1 ("We need to stop people spreading dangerous ideas and stick to the way things have always been done in Australia.") and Item 12 ("People from racial and ethnic minority groups should behave more like mainstream Australians."), item difficulties were spread across the latent trait. Once again, although Item 1 or Item 12 could potentially be removed due to similar difficulties, we believe removing additional items would be detrimental to content validity and the psychometric properties of the scale.
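For readers unfamiliar with the CITC, the following sketch shows how a corrected item-total correlation can be computed: each item is correlated with the sum of the remaining items, so that the item does not inflate its own total. The sketch assumes Pearson correlations on raw item scores, which may differ from the coefficients used in this study, and the response matrix is hypothetical.

```python
import math
from statistics import mean


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def corrected_item_total(responses: list[list[int]]) -> list[float]:
    """CITC per item: correlation of each item with the sum of the other items."""
    n_items = len(responses[0])
    citc = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total excluding item j
        citc.append(pearson(item, rest))
    return citc


# Hypothetical responses (rows = respondents, columns = items scored 0-4).
HYPOTHETICAL_RESPONSES = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 1, 3],
    [3, 2, 3, 2],
    [4, 3, 4, 4],
    [2, 3, 2, 2],
]

for index, value in enumerate(corrected_item_total(HYPOTHETICAL_RESPONSES), start=1):
    print(f"item {index}: CITC = {value:.2f}")
```

Items with a low CITC share little variance with the rest of their subscale, which is why the items with the smallest values were the ones considered, and ultimately retained, in the paragraph above.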
One limitation of the current study was that we were not able to evaluate convergent and discriminant validity. The RRAMS was originally administered in the 2013 NDTIS, a study that focused on collecting information on the use of dental services in Australia and did not include other psychosocial measures. For this reason, we considered known-groups validity to be the best strategy to investigate the RRAMS' construct validity. While the results from known-groups validity were in accordance with theoretical expectations (e.g. inclusive attitudes were more present in individuals with more education), future studies also need to investigate other forms of validity, such as convergent/discriminant and predictive validity.
For example, future studies should evaluate whether the scores from the "Inclusive/Pluralistic attitudes" subscale are positively correlated (i.e. convergent validity) with scores from other instruments evaluating multiculturalist and inclusive attitudes.
Our analyses did not account for sampling weights, meaning that our sample is not representative of the Australian population. It is important to highlight, however, that our study included Australians from all age groups and socioeconomic backgrounds across all states and territories of the country. Furthermore, to the best of our knowledge, this is the largest sample in which a measure of attitudes towards multiculturalism has been employed in Australia. Lack of representativeness and its implications for the validity of scientific findings are central to longstanding discussions in the literature [83]. Because the purpose of the current analysis was to assess the psychometric properties of the RRAMS, as opposed to purely describing prevalence estimates, we do not believe that the lack of representativeness of our sample limits the validity of the inferences made here. The fact that a study sample is representative of some larger population does not mean that the associations between variables in the sample will apply to every subgroup of the population. The overall association is simply an average value that has been balanced according to the distribution of people in these subgroups. If a sample is representative of the sex distribution in the target population, the results will not necessarily apply to both males and females, but only to a hypothetical participant that is "weighted" on sex. Subgroup analyses are necessary if one wishes to investigate relationships between variables by subgroups, which we have performed during the criterion validity assessment stage.
In conclusion, we successfully developed a comprehensive race-related attitudes and multiculturalism scale for the Australian context. We used robust, cutting-edge psychometric techniques and data from a large, nationwide survey. The small number of items (eight) means the instrument will likely be readily used by policy makers and in ensuing research. Future studies should assess the scaling properties of the instrument by using parametric and nonparametric Item Response Theory techniques. The instrument may, nevertheless, be useful for providing information on attitudes towards multiculturalism across the country and will hopefully contribute to a public debate aimed at promoting inclusive, multiculturalist attitudes with the potential to increase social cohesion in Australia.