Spirituality as a Scientific Construct: Testing Its Universality across Cultures and Languages

Using data obtained from 4004 participants across eight countries (Canada, India, Japan, Korea, Poland, Slovakia, Uganda, and the U.S.), the factorial reliability, validity and structural/measurement invariance of a 30-item version of the Expressions of Spirituality Inventory (ESI-R) were evaluated. The ESI-R measures a five-factor model of spirituality developed through the conjoint factor analysis of several extant measures of spiritual constructs. Exploratory factor analyses of pooled data provided evidence that the five ESI-R factors are reliable. Confirmatory analyses comparing four- and five-factor models revealed that the five-dimensional model demonstrates superior goodness-of-fit with all cultural samples and suggested that the ESI-R may be viewed as structurally invariant. Measurement invariance, however, was not supported, as manifested in significant differences in item and dimension scores and in significantly poorer fit when factor loadings were constrained to equality across all samples. Exploratory analyses with a second adjective measure of spirituality using American, Indian, and Ugandan samples identified three replicable factors which correlated with ESI-R dimensions in a manner supportive of convergent validity. The paper concludes with a discussion of the meaning of the findings and directions needed for future research.


Introduction
Interest in spirituality has grown considerably in a variety of scientific and health disciplines including, but not limited to, psychology, medicine, nursing, social work, counseling, sociology, and organizational management. As a manifestation of this interest, significant efforts have been put forth to generate conceptualizations of the construct which make it accessible to quantitative research. However, despite such efforts, there exists a fair amount of divergence and disagreement regarding what does and does not constitute the content domain of spirituality [1]. In fact, the main points of debate can be organized around four inter-related issues.
The first concerns the extent to which spirituality can be treated as separate from religion and religiousness and defined in a way which does not invoke theistic and metaphysical concepts [2][3][4][5][6][7][8][9][10]. The second relates to the degree of complexity of the construct; while it appears investigators in the area generally concur that spirituality is multidimensional, there is little consensus regarding the number and content of the dimensions to be included to sufficiently and thoroughly delineate it [5,11,12,13]. The third centers upon the emerging recognition that many definitions of spirituality, particularly those that try to operationalize it as separate from religion, may be contaminated with well-being concepts [2,14,15,16]. The last involves whether or not spirituality can be understood as a universal domain of functioning rather than as something that is expressed in unique and specific ways across age, sex, and, most importantly for the present study, culture [17][18][19][20].
In regard to the fourth issue, as the social, behavioral, and health sciences have started to more energetically embrace multiculturalism and have come to view religion as an important aspect of cultural difference [21], attention to establishing the manner in which spirituality is an emic or etic construct has risen. This is an important development in our view, not just for studies of spirituality and religion, but for all of science since there are indications that empirical findings may not generalize outside of the cultural environments in which they are investigated [22,23]. In this vein, a cursory survey of the available cross-cultural literature on spirituality provides a somewhat conflicted picture. On the one hand, investigations of quantitative tests such as the Spiritual Transcendence Scale (STS) [24], the Brief Multidimensional Measure of Religiousness/Spirituality (BMMRS) [25], the Daily Spiritual Experiences Scale [26], the Faith Maturity Scale [27], and the Religious Coping Questionnaire (RCOPE) [28] offer some indications that they demonstrate satisfactory reliability and reasonably good factorial, convergent, criterion and/or incremental validity with different cultural, ethnic, and religious groups [29][30][31][32][33][34]. However, at the same time, studies utilizing more qualitative modes of inquiry have tended to provide argumentation and evidence pointing to spirituality as being a culturally bound concept [35][36][37][38]. Given this state of affairs, it is difficult to discern whether or not the contradictory findings are the product of methodological biases or weaknesses or instead reflect something substantive about the nature of spirituality which may require a reconsideration of how it should be studied scientifically.
Notwithstanding the incongruence of findings across methodologies which deserves attention in its own right but is beyond the scope of this paper, a more critical inspection of the published psychometric research indicates that the evidence backing an etic view of spirituality is not without discrepancies and limitations. For example, as with all areas of quantitative research in psychology and the behavioral sciences, there are studies which produce contradictory results. To illustrate, with the Spiritual Transcendence Scale, which is perhaps the most extensively studied test with different cultural and ethnic groups, research done in Australia and the Czech Republic has failed to find strong support for Piedmont's [24] factor structure [39,40].
More generally, virtually all of the extant investigations show inadequacies in sampling and statistical analysis. In terms of the former, there is a lack of sampling across multiple cultures within a single study. Instead, the prevailing trend has been to use data drawn from one culture [1,32,33,41] and for the researchers to make broad generalizations about cross-cultural validity based upon the properties of the measurement tool within that culture. With respect to the latter, the available empirical studies have yet to employ stringent structural and measurement invariance testing involving direct evaluation of model fit using appropriate statistical procedures across tenable alternative models [42][43][44][45]. Ostensibly, if anyone is to claim that a measure and its associated concepts are universally valid, then there is a need not only to incorporate samples from a variety of cultures to better ensure that results are actually generalizable but also to be sure that the model of choice demonstrates superior fit to the data as compared to reasonable alternative models.
Another set of issues plaguing existing quantitative research concerns the measures themselves. Nearly all tests that have been the focus of study have been developed by American researchers from a predominantly Western Judeo-Christian perspective using American samples. As has been argued by some [38], concepts such as transcendence and faith that are the focus of many tests are often implicitly couched within Western philosophy and religious doctrine in a manner which makes their appropriateness for use in cross-cultural research dubious. Also, most instruments were devised either to measure fairly specific concepts or were designed for relatively specific purposes. For instance, the Brief Multidimensional Measure of Religiousness/Spirituality was constructed not with regard to what actually should or should not comprise spirituality but instead was created expressly for the purposes of identifying the types of spiritual and religious variables which appear to hold the most potential to uncover a relation with physical and psychological health. Though it may be presumed that empirical data supporting the validity of a specialized assessment tool across cultures provides some basis to think that tests of similar but more inclusive concepts will also show validity, this is really a matter of empirical verification and one which requires the utilization of instruments which are not delimited to representing only selected elements and features of spirituality but instead are more overtly aimed at inclusively identifying and incorporating all its major components and traits [46].
Taken together, it should be apparent that in order for any substantive advances in the cross-cultural study of spirituality to occur, effort must be made to address as many of the aforementioned problems as possible. With this in mind, the purpose of the present study was to investigate the validity and generalizability of spirituality as a quantitative construct across cultures and languages with attention given to these challenges.

But What is Spirituality? The Need for an Adequate Definition and Taxonomy
As a logical starting point, it struck us as important to first overview some of the general definitions of spirituality that are reflective of the better scholarship in the area so as to identify potential commonalities on which to base our own conceptual and methodological approach to the topic. Curiously, and despite the points of controversy cited earlier, there appears to be a fair amount of consistency, with most definitions placing emphasis on transcendence as a core feature. For instance, Elkins, Hedstrom, Hughes, Leaf, and Saunders [47] define spirituality as "a way of being and experiencing that comes about through awareness of a transcendent dimension and that is characterized by certain identifiable values in regard to self, others, nature, life, and whatever one considers to be the Ultimate" (p. 10). The Fetzer Institute/National Institute on Aging Working Group [25] characterize spirituality as being "concerned with the transcendent, addressing ultimate questions about life's meaning, with the assumption that there is more to life than what we see or fully understand" (p. 2). More recently, Pargament [48] has defined spirituality as the "sacred domain" which concerns "ideas of God, higher powers, divinity, and transcendent reality" (p. 32). As well, Koenig [49] sees spirituality as "distinguished from all other things-humanism, values, morals, and mental health-by its connection to the sacred, the transcendent. . . . Spirituality is intimately connected to the supernatural and religion, although it extends beyond religion" (pp. 116-117). As a final example, de Jager Meezenbroek and colleagues [2] define spirituality as "one's striving for an experience of connection with oneself, connectedness with others and nature and connectedness with the transcendent" (p. 338).
Based upon this small sampling of definitions which have appeared across four decades of research, one might conclude that the most efficient and straightforward way of construing spirituality is to define it as that aspect of human functioning, experience, and existence which concerns the transcendent. Admittedly, there is a certain appeal to such a parsimonious definition as it appears to reconcile areas of divergence across researchers. Unfortunately, it embodies a problem already alluded to, namely it utilizes the concept of transcendence and, more specifically, transcendent reality as the cornerstone for how to understand spirituality. As argued most poignantly by Helminiak [50][51][52], such a term is presumptively linked to religious and theological systems that themselves are concerned with what is essentially metaphysical/supernatural and do not fit well with the contemporary scientific worldview.
Though we very much appreciate Helminiak's position on the matter, we do not precisely share it as we consider it possible to use the term transcendent in a manner which minimizes its otherworldly and supernatural connotations, as would be the case if it were explicitly employed as an adjective to convey the phenomenological qualities of spirituality (particularly spiritual experience) or, alternatively, if it were framed as a psychological process responsible for shifts in identity and personological functioning (e.g., see [53]). Nonetheless, we maintain that the use of such terminology runs the risk of fostering misunderstanding around the veridical nature of spirituality itself, as there does not appear to be any way within science of establishing whether or not such terms actually represent something "real," which is not reducible to more recognized and accessible biopsychosocial processes and mechanisms, or only signify something that is just an aberration of psychological functioning. The extensive debates concerning the differentiation of spirituality and psychopathology (especially dissociation and psychosis) replete in the literature may be seen as a tangible product of the confusion caused by the attribution of certain experiential states to so-called transcendent dimensions which can accompany the use of such terms [54][55][56][57][58][59][60][61][62]. Consequently, while there is no doubt in our minds that transcendence and other related concepts belong to the domain of spirituality, such notions are probably not best used to delineate its cardinal nature definitionally, at least if the goal is to arrive at a scientifically defensible conceptualization of spirituality.
An alternative approach to defining spirituality is to place emphasis on it as a natural phenomenon that is most centrally experiential in nature but is also accompanied by and/or manifested in neurophysiological, cognitive, characterological, and behavioral expressions that can be viewed as antecedent to, concomitant with, or determined by experience. The experiential core of spirituality itself can be conceptualized as consisting of experiences that (a) have certain phenomenological qualities involving modifications in the operations of self and identity relative to normal modes of functioning which have brain based correlates [63][64][65], (b) have an impact on thought, behavior, lifestyle, and personality and (c) lead to lasting changes in how one understands self, other, and the universe [66].
Such a definitional approach is consistent with the work of several researchers and theoreticians; for instance, Grof [67,68] has argued that spirituality is a natural human developmental potential which involves shifts and transformations in conventional personality functioning that result in higher modes of health, integration, and well-being. Also, Hay and Socha [69] have proffered that spirituality is a natural human phenomenon which has both evolutionary and sociocultural significance. Finally, through the concept of self-transcendence, Cloninger [70] has incorporated spirituality into his biopsychosocial model of temperament and character.
A major advantage of defining spirituality in this manner is that it permits its inclusion within naturalistic science in a way which does not explicitly require the use of religious and theological ideas but, at the same time, does not completely deny the utilization of such ideas and systems of thought as hermeneutic tools for the interpretation of spiritual phenomena. It also opens up the possibility of exploration and investigation of practices such as prayer, meditation, and contemplation as vehicles for facilitating the activation and maintenance of spirituality in a manner that is not constrained to the confines of doctrinal or institutional religiosity. In other words, while not wholly synonymous with spirituality, religion can be regarded as a major agent for fostering the emergence and understanding of spiritual experience and its significance for self, other, and reality without stifling scientific inquiry [71].
Considering its benefits and especially how it helps to address the problems associated with the use of metaphysical concepts, we elected to use this approach for a definition of general spirituality. Succinctly stated, spirituality is a natural aspect of human functioning which relates to a special class of non-ordinary experiences and the beliefs, attitudes, and behaviors that cause, co-occur with, and/or result from such experiences. The experiences themselves are characterized as involving states and modes of consciousness which alter the functions and expressions of self and personality and impact the way in which we perceive and understand ourselves, others, and reality as a whole.
With this working definition of spirituality, it should be apparent that spirituality is complex, involving experiences with specific phenomenological features as well as cognitive and behavioral components. The question now becomes: what are the unique qualities and dimensions that make up this complex domain of functioning? As noted earlier, there appears to be fairly broad agreement that spirituality is a multi-faceted concept. However, the number and content of those facets vary across the available models and measures, with some proposing as few as two [72] and others as many as nine [47]. Such a state of affairs presents considerable difficulties for researchers as there is little by way of guidance as to which one would be most suitable to use as a comprehensive model for cross-cultural research. For the sake of the present study, we elected to use the model of MacDonald [12,73].

Expressions of Spirituality: Model and Measure
Motivated to address the problems with definition and measurement seen in the research, MacDonald [12] completed a series of conjoint exploratory factor analyses using a wide variety of instruments designed to assess spirituality and related concepts available in the literature with data obtained from two large samples of Canadian university students. His findings provided strong evidence supporting the existence of five robust dimensions which he argued could serve as a framework for organizing existing empirical findings on the relation of spirituality to other aspects of functioning (e.g., health, personality, social behavior) and provide direction for future research and theory development. The dimensions are Cognitive Orientation toward Spirituality (i.e., beliefs about the existence, validity, and relevance of spirituality for one's sense of identity and daily functioning), Experiential/Phenomenological Dimension (i.e., spiritual, mystical, religious, and transcendent experience and their phenomenological features including changed sense of self and perceptions of sacredness, divinity, holiness, and connectedness commonly tied to such experiences), Existential Well-Being (i.e., sense of meaning and purpose and perceived capacity to handle the existential adversities of life), Paranormal Beliefs (i.e., beliefs in the existence of paranormal phenomena and abilities) and Religiousness (i.e., intrinsic commitment to religious ideas, values, and practice for their own sake). Concurrent to developing the model, MacDonald also constructed a 100-item measure to operationalize the five dimensions. Named the Expressions of Spirituality Inventory (ESI), the instrument was found by MacDonald [12,73] to demonstrate satisfactory reliability and convergent, discriminant, criterion, and factorial validity.
Immediately subsequent to the publication of his initial findings, MacDonald [73] devised a shorter 32-item version of the test (ESI-Revised or ESI-R) using items from the parent scale selected on the basis of the uniqueness of content and item-to-scale reliability.
MacDonald's [12] model and measure have a variety of strengths which make them ideal for our purposes. First, the model appears to capture the common latent constructs that are tapped by available spirituality instruments in a way which makes it one of the most comprehensive approaches to spirituality presently available. As a concrete illustration of this, MacDonald [12] found in his analyses that all dimensions were comprised of strong loadings from two or more measures but none of the instruments employed in his analyses, including those that are themselves attempts at comprehensive models [47,74], loaded substantively on all five dimensions. With this in mind, it is important to acknowledge that some investigators have been critical about the inclusion of paranormal beliefs in his analyses and model [75], and that MacDonald himself has raised issues with existential well-being representing a discrete aspect of spirituality [15,76]. In response to such criticisms, particularly the latter one, it is important to remember that the dimensional model is based upon the latent constructs found within existing tests. In regard to the former criticism, MacDonald [12] has justified the incorporation of paranormal beliefs by pointing out that many faith systems accommodate beliefs in phenomena typically considered paranormal (e.g., miracles in the Judeo-Christian-Islamic traditions, siddhis in the Hindu tradition). More recently, independent research has suggested that spirituality may be best differentiated from other concepts through belief in supernatural spirits [77]. MacDonald found that such beliefs contribute to his paranormal beliefs dimension. Thus, it may be argued that paranormal beliefs have a place within a comprehensive model of spirituality.
A second asset of MacDonald's model relates to his efforts at addressing the problems linked to defining spirituality as separate from religion. In particular, he was cognizant of the issues associated with the relation of spirituality to religion and to the predominance of Western conceptualizations of spirituality in the psychological literature. In response, when planning his analyses, he made sure to include tests that measure constructs derived from Eastern cultural and faith systems [78,79] as well as instruments tapping religious variables most commonly associated with devout belief and practice thought to represent spirituality (i.e., intrinsic religiousness) [14,80,81]. The result was the identification of a dimension (i.e., Religiousness) which incorporates religious practice and beliefs in the existence of a higher power but that excludes doctrinal religiosity. Though his factor analytic findings suggest that the religiousness factor is more reflective of Western religious traditions (e.g., a measure of Eastern spirituality loaded negatively while a measure of Western spirituality loaded positively), statistical comparisons of people with various religious backgrounds on this dimension did not uncover any significant differences between denominational groups [12].
As a third strength, the model and/or instrument have been used in a variety of studies and have proven useful for theory development [71], test validation [82,83], and empirical investigations of the relation of spirituality to personality, social, and health variables [9,84,85,86,87,88,89]. As a product of this work, the model has also helped to clarify how spirituality manifests multifarious relations to functioning with some dimensions showing more positive associations (e.g., Existential Well-Being, Cognitive Orientation, and Religiousness), and others showing mixed (e.g., Experiential/Phenomenological Dimension) to negative relations (Paranormal Beliefs) [90,91]. Thus, each dimension appears to uniquely and incrementally contribute to our understanding of how spirituality impacts functioning.
A fourth advantage, which may appear on the surface to be a liability, concerns the pattern of intercorrelations between the dimensions. MacDonald [12] found significant correlations between various pairs of the five ESI dimensions with the association between Religiousness and Cognitive Orientation emerging as conspicuously high (e.g., when correlating factor scores for these dimensions, r = .63; when correlating ESI dimension scores, r = .73). Though his factor analytic work revealed that these dimensions emerged separately, it may be argued that such results were the product of the samples used and that a four dimensional model wherein Religiousness and Cognitive Orientation are combined would be more parsimonious. This provides a good basis on which to compare and test four and five factor models to determine the best fitting model.
Fifth and finally, the ESI is one of the few measures of spirituality to include items to assess response validity (i.e., honesty of responding) and, as importantly, face validity (i.e., the respondent's perception that the test is actually measuring spirituality). Since spirituality is a highly subjective phenomenon which could be said to be essentially ineffable [7], it would be unreasonable to expect any test to wholly and accurately capture it as it is directly known within a person's experience. With the inclusion of a face validity item, researchers are permitted the opportunity to directly evaluate the extent to which different test-takers see the test as measuring something that is similar to their own understanding of spirituality.

Study Design and Hypotheses
In order to provide a rigorous evaluation of the cross-cultural generalizability of spirituality as a psychometric construct, we adopted a complex approach to study design that attempted to address the various shortcomings of the available research. In particular, we aimed to examine the reliability, factorial validity, and configural and measurement invariance of the ESI-R across samples drawn from several countries in disparate geographic areas that included both individualistic and collectivistic cultures. To assess the impact of culture alone versus culture and language together, samples from cultures in which English is a dominant language received all measures in English while the ESI-R and other tools used were translated to the dominant language spoken for the remaining samples. To ensure that our findings were not merely the product of the ESI-R, we incorporated a second novel questionnaire of spirituality developed through a content analysis of narrative descriptions of a spiritual person (see method section) to determine if it produced a factor structure akin to that of the ESI-R and to serve as a convergent validation measure.
In terms of research expectations, several hypotheses were tested. In particular, for all cultural samples, it was expected that (a) the ESI-R would be perceived as measuring spirituality as per responses to the face validity item, (b) the ESI-R would demonstrate satisfactory reliability and similar patterns of intercorrelations between dimensions as well as associations with demographic variables (i.e., age and sex), (c) the ESI-R would show structural or configural invariance with the original five-factor model demonstrating superior goodness of fit to the data relative to a four-factor model in which the Cognitive Orientation toward Spirituality (COS) and Religiousness (REL) dimensions are combined into a single dimension, (d) the ESI-R would be found to demonstrate measurement invariance, and (e) the spirituality adjective measure would produce factors similar to the ESI-R which would also show a congruent pattern of intercorrelations with each other and associations with demographic variables.

Methods

Participants
Data used in the present study were obtained from 4325 university student volunteers across eight different countries including Canada (n = 938), India (n = 800), Japan (n = 205), Poland (n = 400), the Slovak Republic (n = 178), South Korea (n = 660), Uganda (n = 518), and the United States (n = 626). The Canadian data were originally used in MacDonald [12] but were included here to permit reanalysis and comparison to the other samples. The remaining data were gathered between 2000 and 2006. Data for the Polish sample were obtained as part of a study on religious orientation but were not used in that study [92]. Finally, the data of 247 participants of the American sample have been used in a study to examine the relation of spirituality to well-being measures [15].

Measures
Demographic Survey. A survey form was used which obtained basic demographic information (e.g., age, sex, religious affiliation).
Expressions of Spirituality Inventory-Revised (ESI-R) [73]. The ESI-R is a 32-item self-report questionnaire developed from a longer 100-item parent instrument [12,76] designed to operationalize a five-dimensional model of spirituality created through the conjoint factor analysis of 19 different tests selected due to their perceived representativeness of the content domain of spirituality. While the items for the 100-item ESI were included in the test on the basis of factor and reliability analyses, the selection of 30 of the items for the ESI-R was based upon item content uniqueness and reliability (e.g., corrected item-to-scale total correlations). The last two items are the same as the parent version of the test; item 31 is a face validity item ("This test appears to be measuring spirituality") and item 32 an honesty-of-responding item ("I have responded to all items honestly"). Each dimension is tapped by six items. The test employs a five-point scale ranging from 0 (Strongly Disagree) to 4 (Strongly Agree) which is used by respondents to rate the extent to which they agree with the content of the item. For the interested reader, the 30 items for the test appear in a table in the results section.
Spirituality Adjective List (SAL) [93]. The SAL is a 40-item self-report instrument that utilizes a five-point response scale ranging from 0 (Strongly Disagree) to 4 (Strongly Agree). The items were developed from a thematic content analysis of written narrative descriptions of a spiritual person obtained from 50 Canadian university students. In completing the analysis, the test authors first reviewed the written narratives and identified clear descriptors/adjectives. Thereafter, the authors devised items to embody adjectives that were found to be most commonly used across respondents. Items for the SAL in order of appearance on the questionnaire are as follows
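To make the structure of the instrument concrete, the following is a minimal scoring sketch rather than the published scoring procedure. It assumes that dimension scores are simple sums of the six 0-4 ratings (giving a 0-24 range) and that no items are reverse-keyed; neither point is stated in this section. The item-to-dimension key is hypothetical (consecutive blocks of six items), and the labels EPD, EWB, and PAR are illustrative abbreviations introduced here alongside COS and REL.

```python
from typing import Dict, List

# Illustrative dimension labels; only COS and REL are abbreviations used in the text.
DIMENSIONS = ["COS", "EPD", "EWB", "PAR", "REL"]

# HYPOTHETICAL key: consecutive blocks of six items per dimension.
# The actual ESI-R key is not given in this section and may differ.
HYPOTHETICAL_KEY: Dict[str, List[int]] = {
    name: list(range(i * 6 + 1, i * 6 + 7)) for i, name in enumerate(DIMENSIONS)
}


def score_esi_r(responses: Dict[int, int]) -> Dict[str, int]:
    """Sum the six 0-4 ratings assigned to each dimension (assumed range 0-24)."""
    for item in range(1, 33):
        rating = responses[item]
        if not 0 <= rating <= 4:
            raise ValueError(f"item {item}: rating {rating} is outside 0-4")
    scores = {
        dim: sum(responses[i] for i in items)
        for dim, items in HYPOTHETICAL_KEY.items()
    }
    # Items 31 and 32 are validity checks; they are reported separately, not summed.
    scores["face_validity"] = responses[31]
    scores["honesty"] = responses[32]
    return scores


if __name__ == "__main__":
    neutral = {item: 2 for item in range(1, 33)}  # every item rated at the midpoint
    print(score_esi_r(neutral))
```

Keeping items 31 and 32 out of the dimension totals mirrors their role in the text as checks on face validity and honesty of responding rather than as indicators of spirituality.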

Procedure
The questionnaires were administered to students at universities in their respective countries. In all cases, brief presentations about the study and the need for participants were made to classes by one of the researchers and/or a research assistant under the supervision of one of the researchers. Students who expressed interest in participating completed paper-and-pencil copies of the measures either during class time, during scheduled testing sessions, or were given hardcopies to complete and return to the researcher or research assistant. In Canada, the United States, India, and Uganda, the instrument was given in English. For all the remaining samples save Poland, the test was translated using the standard translation-back translation procedure. For the Polish sample, the test was translated using a committee approach wherein a number of people fluent in both English and Polish collaboratively worked to create the translation [92]. Along with the ESI-R, the SAL was also given to respondents in the Indian and Ugandan samples and to approximately 300 of the American participants.

Ethics Statement
All data gathering was completed in a manner consistent with standard ethical practices for questionnaire based psychometric research in place at the time of data collection. Approval was obtained prior to data collection either through established institutional research review committees/boards or through institutional officials when such committees did not exist. For the American data, approval was granted by the University of Detroit Mercy Institutional Review Board. For the Canadian data, approval was obtained from the University of Windsor Research Ethics Committee. For both samples, the first author (D.A.M.) was the primary researcher involved with data collection. For the Indian data, permission was obtained from chairpersons and administrative heads of the University of Mysore Faculties of Arts and Humanities, Commerce and Management, Education, Law, and Science and Technology. Data collection was completed by the fifth author (K.K.K.S.). For the Japanese data, approval was granted by the Academy of Counseling Japan and the Saybrook University Institutional Review Board. Data gathering was done by research assistants supervised by the second author (H.L. For the American, Canadian, Indian, Japanese, Korean, and Ugandan samples, written informed consent was obtained prior to the completion of the questionnaires. For the Polish and Slovakian samples, informed consent was communicated verbally. In these instances, only individuals who gave verbal consent were provided with hardcopies of the questionnaires to complete. Verbal consent was considered sufficient by the boards and/or officials at all involved institutions for questionnaire based psychometric research. For all samples, participation was voluntary and no personal identifying information was obtained.

Data Analysis
The approach to analyzing data for this study was multi-tiered and involved looking at questionnaire scores at both the item and scale level and with the samples combined and separated. First, to ascertain the extent to which the ESI-R demonstrated face validity, responses to item 31, which asked participants to rate the extent to which they viewed the test as measuring spirituality, were analyzed via ANOVA and examination of response frequencies. Next, descriptive statistics and reliabilities for the ESI-R items and dimension scores were calculated for all samples combined and then for each country sample separately. ESI-R scores at both item and scale levels were then examined as a function of country via ANOVA. These analyses were done in response to the recommendations of Byrne and Watkins [43] who suggested that examination of score differences across cultures should be included in any evaluation of measurement invariance. Since there is evidence that the ESI-R dimensions may differ as a function of age and sex of respondent [12,18], product-moment correlations were next calculated between the ESI-R dimensions and these two participant variables across all country samples. Thereafter, inter-correlations between the ESI-R dimensions were computed. Next, both exploratory (EFA) and confirmatory factor analyses (CFA) were completed in order to assess the structural consistency and factor and measurement invariance of the ESI-R dimensions. Both approaches were employed in this study in response to issues raised regarding the use of confirmatory factor analysis in the evaluation of personality inventories [94,95]. In consideration of the fact that the original ESI was developed using an EFA approach very similar to that employed for creating measures of the Five Factor Model of personality [12,76], these issues seemed to us to be applicable to this study.
The utilization of CFA based techniques to assess structural and measurement invariance was done in a manner consistent with the recommendations of experts in the area of Structural Equation Modeling and CFA [42,44,45,96]. Finally, the SAL was examined across American, Ugandan, and Indian samples using EFA to identify latent factors and to construct subscales based upon similar patterns of varimax rotated factor loadings. Reliabilities of the emergent subscales, subscale intercorrelations, and correlations with the ESI dimensions were then calculated.

Results
Prior to beginning any analyses, data were examined for completeness, accuracy, and evidence of response bias (e.g., perseverative responding, dishonest responding as indicated by a disagree response to item 32 on the ESI-R asking if questions were responded to honestly). Any cases demonstrating one or more of these problems were excluded from all analyses. This resulted in a total of 321 participants being removed from the study. Table 1 presents information on the basic demographic characteristics for the total combined samples and for each country sample separately including age, gender, and religious affiliation.

Face Validity of the ESI-R Across Country Samples
As an initial set of statistics, we focused on the response to item 31 that asks participants to rate the extent to which they perceive the ESI-R as measuring spirituality since it struck us as a good initial indicator of the extent to which spirituality in general and the instrument itself hold up across cultures. With all but three of the 4004 participants providing a response to this item, the mean response for all samples combined is 2.85 (SD = 1.06). Examination of frequencies for the pooled samples indicates that 70.4% of participants responded "agree" or "strongly agree" to the item. For a more detailed analysis, we examined differences across each country sample via a one-way ANOVA. Table 2 reports the descriptive statistics, response frequencies and ANOVA results. The ANOVA emerged significant (F(7, 3993) = 131.07, p<.001, η² = .19).
Post-hoc analyses (Scheffe's test) uncovered a variety of statistically significant pair-wise differences between country samples. Notwithstanding these statistical findings, examination of response frequencies indicates that the majority of respondents from the Canadian, American, Indian, and Ugandan samples responded with agree or strongly agree to the item. For the remaining samples, the majority of responses fall between neutral and strongly agree.
In order to get a sense of the extent to which responses to the face validity item are associated with scores on the ESI-R, correlations were calculated between ESI-R item 31 and all ESI-R dimensions and items (35 correlations in all) using pooled data (N = 4001). With the dimension scores, all coefficients were significant at p<.01 or lower and ranged in absolute value from |.04| for EWB to |.17| for REL. For items, most correlations were small; nine coefficients fell at r = |.10| or higher with the strongest correlation falling at r = |.23|. Correlations were also

Reliability Analyses of the ESI-R
Scale reliabilities were calculated for each ESI-R dimension for the total combined sample and each country sample separately. Tables 3 and 4 present the scale and item-level descriptive and reliability statistics for the total combined sample including Cronbach's alphas and corrected item-to-scale total correlations for all items. Table 5 presents the scale level descriptive statistics and reliability coefficients for each country sample separately along with the mean corrected item-to-scale total correlations for all items within each ESI-R dimension.
As can be seen in the tables, alphas and item-to-scale correlations are reasonably good for the total combined sample and each country sample with only two notable exceptions. For the Indian and Ugandan samples, Paranormal Beliefs produced unacceptably low alpha coefficients (α = .59 and .45, respectively).
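As a minimal sketch of the reliability statistics reported above, the following Python computes Cronbach's alpha and a corrected item-to-scale total correlation from a small response matrix. The function names and the toy data are illustrative assumptions and are not part of the study's materials; the study's own values come from its statistical software rather than from this code.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def corrected_item_total(rows, j):
    """Correlate item j with the sum of the remaining items."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson_r(item, rest)


# Hypothetical responses: 5 respondents x 3 items (not study data).
toy_rows = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [1, 2, 2], [3, 3, 4]]
alpha = cronbach_alpha(toy_rows)
item_totals = [corrected_item_total(toy_rows, j) for j in range(3)]
print(f"alpha = {alpha:.3f}")
print("corrected item-total r:", [round(r, 3) for r in item_totals])
```

The item-total correlation is "corrected" in the sense that each item is removed from the total before correlating, so the item does not inflate its own coefficient.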

Tests of Differences Across ESI-R Dimension Scores
A series of one-way Analyses of Variance (ANOVAs) were completed to see if significant differences exist on ESI-R dimension scores as a function of country (see Table 5). All five ANOVAs emerged significant (COS: F(7, 3996) = 97.03, p<.001; EPD: F(7, 3996) = 20.96, p<.001; EWB: F(7, 3996) = 15.74, p<.001; PAR: F(7, 3996) = 68.02, p<.001; REL: F(7, 3996) = 168.86, p<.001). Effect sizes as reflected in eta-squared are small for EPD and EWB and medium to large for the remaining dimensions. Post hoc analyses (Scheffe's test) revealed a large array of statistically significant (p<.05) pairwise differences for all five ANOVAs across the country samples. Only general trends in these findings will be described here. For COS, only two post hoc tests were not significant (between Canada and India, and Slovakia and Korea). Similarly, for REL, only two pairwise comparisons, between Japan and Korea, and Slovakia and Korea, emerged non-significant. For EPD, Uganda was found to produce significantly higher scores than all other countries. The Polish and Indian samples produced significantly higher scores than Canada, Korea, and Slovakia. Last, the American sample had a significantly higher score compared to the Korean sample. For EWB, the American sample generated a significantly higher score than all other countries. At the other extreme, the Polish, Japanese, and Slovakian samples did not produce any pairwise differences with any countries. Finally, for PAR, the Indian and Korean samples produced significantly lower scores compared to each other and all other countries. The Canadian sample generated a significantly higher score than all other countries save Japan and Slovakia.
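The effect sizes referred to above can be recovered from the reported F ratios, since for a one-way ANOVA eta-squared equals (F x df_between) / (F x df_between + df_within). The sketch below applies that identity to the five F values given in the text. The function names and the small/medium/large cutoffs (.01, .06, and .14, the conventional benchmarks usually attributed to Cohen) are assumptions added for illustration, and the recomputed values may differ slightly from those in Table 5 because of rounding.

```python
def eta_squared_from_f(f_ratio, df_between, df_within):
    """Eta-squared for a one-way ANOVA, recovered from F and its df."""
    scaled = f_ratio * df_between
    return scaled / (scaled + df_within)


def magnitude_label(eta_sq):
    """Assumed conventional benchmarks: .01 small, .06 medium, .14 large."""
    if eta_sq < 0.01:
        return "negligible"
    if eta_sq < 0.06:
        return "small"
    if eta_sq < 0.14:
        return "medium"
    return "large"


# F ratios as reported in the text, each with df = (7, 3996).
reported_f = {"COS": 97.03, "EPD": 20.96, "EWB": 15.74, "PAR": 68.02, "REL": 168.86}

for dimension, f_ratio in reported_f.items():
    eta_sq = eta_squared_from_f(f_ratio, 7, 3996)
    print(f"{dimension}: eta-squared = {eta_sq:.3f} ({magnitude_label(eta_sq)})")
```

Under these assumed cutoffs, EPD and EWB fall in the small range and the remaining three dimensions in the medium or large range, which agrees with the description in the text.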

Tests of Differences Across ESI-R Item Scores
One-way ANOVAs examining all 30 ESI-R item scores as a function of culture were calculated (see Table 6). Akin to what was found with the dimension scores, all 30 ANOVAs were significant with effect sizes ranging from small for all EPD items, small to medium for EWB items,

Associations of ESI-R Dimensions with Age and Sex
Product-moment correlations were computed between the ESI-R dimensions and age (in years) and sex (male coded 0 and female coded 1) for the total combined sample and each country sample separately (see Table 7). When considering the findings for the combined sample, while there are a number of statistically significant correlations, the coefficients are generally low in magnitude for both age and sex. Closer inspection of the findings across the country samples reveals some notable differences in the strength of associations. With age, the Korean sample followed by the Slovakian sample produced at least one correlation of moderate magnitude. With sex, the most substantive correlations were found with the American, Canadian, and Ugandan samples. Looking at the findings across the ESI-R dimensions, Religiousness and Cognitive Orientation toward Spirituality appear to be most strongly and significantly related to age and sex. Given the significant findings obtained with age and sex and the observed significant differences for all ESI-R dimensions as a function of culture, it was surmised that interaction effects between the three variables should be examined. In response, five 2 (sex) x 2 (age; two groups based on a pooled median split: 18-21 years versus 22 and older) x 8 (country samples) ANOVAs were computed wherein each ESI-R dimension served as the dependent variable. In all five ANOVAs, non-significant three-way interactions and non-significant two-way interactions between sex and age were produced. For Existential Well-Being, no two-way interactions were significant. For Cognitive Orientation toward Spirituality, significant interactions between age and country (F(7, 3896) = 9.01, p<.001) and sex and country (F(7, 3896) = 3.78, p<.001) were found. A similar pattern of significant two-way interactions was generated for the Experiential/Phenomenological Dimension (for age x country: F(7, 3896) = 4.84, p<.001; for sex x country: F(7, 3896) = 2.02, p<.05).
For Paranormal Beliefs, a significant interaction was obtained between sex and country (F(7, 3896) = 6.56, p<.001). Finally, for Religiousness, a significant interaction was found between age and country (F(7, 3896) = 5.71, p<.001). Effect sizes for all significant interactions were small (eta-squared ranged from .00 to .02).
When running the 2x2x8 ANOVAs, examination of cell frequencies revealed that the Slovakian and Japanese samples had far too few participants in some subgroups. Since unequal cell sizes can have a distorting effect on ANOVA results, we re-ran all five analyses excluding the Slovakian and Japanese samples to make sure that evidence of interaction effects was robust. For COS, significant two-way interactions were found with sex and country (F(5, 3553) = 4.99, p<.001) and age and country (F(5, 3553) = 11.89, p<.001). For EPD, significant two-way interactions emerged for sex and country (F(5, 3553) = 2.44, p<.05) and age and country (F(5, 3553) = 6.32, p<.001). For EWB, significant two-way interactions were found for sex and age (F(1, 3553) = 3.88, p<.05) and age and country (F(5, 3553) = 2.70, p<.05). For PAR, a significant two way interaction emerged with sex and country (F(5, 3553) = 9.01, p<.001). Finally, for REL, a significant two-way interaction was found between age and country (F(5, 3553) = 7.14, p<.05) and a significant three-way interaction emerged (F(5, 3553) = 2.24, p<.05). While these results are not identical to those found using all country samples, thus suggesting that asymmetry of cell sizes did have some impact on our main analyses, they still point to the existence of interaction effects.

Inter-correlations of ESI-R Dimensions
Product-moment correlations were next calculated between the ESI dimensions for pooled samples and for each country sample (see Table 8). Examination of these coefficients reveals a few conspicuous trends. In particular, the correlations between COS-REL, COS-EPD, EPD-REL, and EPD-PAR came out significant and of moderate to high strength in every set of analyses regardless of sample. Another observable trend concerns the three samples with the lowest numbers of Christian participants (i.e., Indian, Korean, and Japanese). Specifically, with these three samples, correlations of moderate strength were found between PAR-REL and PAR-COS. Lastly, though some statistically significant coefficients were obtained with EWB, the size of the coefficients for the total pooled sample and each country sample is consistently small.

Exploratory Factor Analyses (EFAs)
Since the original ESI was developed using EFA [12,76], we decided to first complete two principal axis factor analyses extracting and varimax rotating five factors in an effort to see if the factors were replicable with the ESI-R. In the first analysis, data from all participants were used (N = 4004). In the second, all data except those from the Canadian sample were used (n = 3072). This latter analysis was done in response to the fact that the Canadian sample was employed by MacDonald [12,76] to develop the ESI. We reasoned that the analysis would be a better indicator of the replicability of the factors. Rotated factor loadings for the two analyses can be seen in Table 9. For both analyses, elevated factor loadings (i.e., loadings of .30 or higher) were found for all ESI-R items in a manner wholly consistent with what MacDonald [12] reported. In particular, items comprising each of the ESI-R dimensions loaded strongly on separate factors with the exception of COS and REL which produced notable loadings on the first factor followed by COS generating strong loadings on a second separate factor. While not reported in this article for the sake of brevity, we also ran the analyses with oblique (oblimin) rotation. Pattern matrices, which provide information on the unique association of a variable to a factor, showed elevated loadings for all items on completely separate factors (i.e., COS and REL did not produce loadings of .30 or higher on the same factor).
To determine whether or not the ESI-R dimensions remained replicable across sexes, we completed principal axis factor analyses for all males (n = 1436) and females (n = 2558) separately. The analyses were set to extract five factors, and the obliquely (oblimin) rotated factor loadings were examined. For both sexes, the items comprising the five ESI-R dimensions loaded strongly on separate factors. We did a similar pair of principal axis factor analyses with age. Using a median split, we created a young (i.e., 21 years and younger) group (n = 1995) and an old group (i.e., 22 years and up) (n = 1943). The pattern matrices from both solutions showed loadings consistent with the ESI-R dimensional structure.
For the sake of thoroughness, the factorial stability of the ESI-R dimensions as a function of perceived face validity was also examined. More specifically, responses to ESI item 31 were used to create two groups. One group consisted of all participants who responded strongly disagree, disagree, and neutral to the item (n = 1184) and a second group was comprised of participants who responded agree or strongly agree (n = 2817). Principal axis factor analyses involving the extraction and oblique (oblimin) rotation of five factors produced solutions supportive of the five ESI-R dimensions; pattern matrices showed that all items loaded on clearly identifiable factors.

Confirmatory Factor Analyses (CFAs)
In order to evaluate the goodness-of-fit of the dimensional model underlying the ESI-R, a series of maximum-likelihood confirmatory factor analyses using Analysis of Moment Structures (AMOS) software were completed. Due to the fact that COS and REL are highly inter-correlated, it was considered worthwhile to also examine the goodness-of-fit of a four factor model wherein these two dimensions were combined into one to see which model (four versus five factors) resulted in better fit.
In total, 18 CFAs were done (9 four factor and 9 five factor), using data for the combined samples first, followed by separate analyses for each country sample. The standardized regression weights along with a variety of fit statistics for the combined sample analysis can be found in Table 10. The overall model fit statistics for the analyses for each country separately can be found in Tables 11 through 14.
In all analyses for both four and five factor models, inspection of parameter estimates indicated that all regression weights (i.e., factor loadings) came out statistically significant as did all error variances (p<.05 or lower). Alternatively, examination of covariances (i.e., the intercorrelations between ESI-R dimensions), revealed some differences across models and samples. For instance, all covariances emerged significant with the total combined sample in both four and five factor models. However for the Canadian sample, the four-factor model produced nonsignificant estimates for all covariances involving EWB and in the five-factor model, five covariances were nonsignificant (i.e., all involving EWB, and PAR-REL). For the American sample, two covariances were nonsignificant in the four-factor model (i.e., EPD-EWB, and combined COS/REL-PAR) and three in the five-factor model (i.e., PAR-REL, COS-PAR, and EPD-EWB).
For the four-factor model for the Polish sample, one covariance estimate was nonsignificant (i.e., between combined COS/REL-EWB) and two were nonsignificant in the five-factor model (i.e., EWB-REL, and COS-EWB). In the Slovakian sample, four emerged nonsignificant in the four-factor model (i.e., all with PAR and combined COS/REL-EWB), and six in the five-factor model (i.e., all with PAR, EWB-REL, and COS-EWB). In the Ugandan sample, two covariance estimates were not significant in the four-factor model (i.e., EPD-EWB and combined COS/REL-PAR) and three in the five-factor model (i.e., PAR-REL, COS-PAR, and EPD-EWB). In the Indian sample, the covariance estimate for the combined COS/REL and EWB was not significant in the four-factor model while in the five-factor model, two came out non-significant (i.e., EWB-REL and COS-EWB). In the Korean sample, two covariances were found to be nonsignificant in the four-factor model (i.e., EPD-EWB and combined COS/REL-EWB), and three in the five-factor model (i.e., EWB-REL, EPD-EWB, and COS-EWB). Lastly, for the Japanese sample, two covariance estimates were not significant in the four-factor model (i.e., EPD-EWB, and combined COS/REL-EWB) and three in the five-factor model (i.e., EWB-REL, EPD-EWB, and COS-EWB).
When comparing the overall model fit statistics between four and five factor models for the total combined sample and for each country sample, the correlated five factor model emerged superior as reflected in significant reduction in chi-square values (i.e., the five factor model produced significantly lower chi-square values than the four factor model). Based on this, the five factor model became the focus of our remaining analyses.
Notwithstanding the findings supporting the five-factor model, closer examination of the fit indices suggests that overall model fit was not wholly satisfactory across all analyses. On the positive side, fit statistics for the combined total sample, Canadians, Americans, and Indians provide reasonably good support for model fit (e.g., despite chi-square being significant and the chi-square/df ratio exceeding 3.0, GFI, AGFI, NFI, RFI, IFI, TLI, and CFI are close to or exceed .90; RMSEA is lower than .08 and SRMR is close to or lower than .05, [42,44,96]). The fit statistics for the Ugandan sample, while not as compelling, also appear to be at least somewhat adequate. On the other hand, for the remaining samples, all of which used translated versions of the ESI, fit statistics are less consistently supportive of good fit of the five factor model.
To identify possible causes for the poorer model fit (i.e., model misspecification), modification indices were examined for the Polish, Slovakian, Korean, and Japanese samples. While a number of modification indices were generated, none of them indicated that any part of the model for these four samples could be respecified in a manner that made rational sense. For instance, there was nothing pointing to correlated error variances suggesting possible systematic measurement error due to unintended overlap in item content [45]. Similarly, there were no modification indices which strongly supported the re-assignment of an ESI-R item from one dimension to another in a way that would be defendable from a conceptual point of view or would generalize beyond a single country sample.
Finally, to determine the extent to which parameter estimates for the five-factor model are stable and generalizable beyond the current samples, maximum likelihood bootstrap analyses were completed for the total combined sample and each country sample. Examination of 90% bias corrected confidence intervals for estimates generated from 1000 bootstrap samples revealed non-zero value ranges (i.e., the confidence interval did not contain zero; Byrne [42,96] notes that if zero is contained in the interval then one cannot reject the hypothesis that the parameter value for the population is zero) for all variances and regression weights for all samples. Confidence intervals for nonsignificant covariances as reported above, conversely, were found to include zero.
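The logic of checking whether a bootstrap confidence interval contains zero can be illustrated with a much simpler statistic than a structural equation model parameter. The sketch below is an illustration under stated assumptions rather than the AMOS procedure used in the study: it draws 1000 resamples of a toy data set and builds a 90% interval for a Pearson correlation. Note that it uses the plain percentile method, not the bias-corrected interval reported above, and that the data, seed, and function names are hypothetical.

```python
import random


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def percentile_bootstrap_ci(x, y, n_boot=1000, level=0.90, seed=0):
    """Percentile bootstrap interval for r (not bias-corrected)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample respondents (pairs) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[int((1 - tail) * n_boot) - 1]
    return lower, upper


def contains_zero(interval):
    lower, upper = interval
    return lower <= 0 <= upper


# Hypothetical, strongly related toy scores (not study data).
x_scores = list(range(30))
y_scores = [2 * v + (v % 5) for v in x_scores]
ci = percentile_bootstrap_ci(x_scores, y_scores)
print("90% CI:", ci, "contains zero:", contains_zero(ci))
```

An interval that excludes zero corresponds to rejecting the hypothesis that the population value is zero, which is the decision rule described in the text.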

Test of Measurement Invariance
As the most rigorous test of the ESI-R, we completed a series of CFAs wherein a freely estimated five factor model was compared to a model with parameter estimates constrained to equality and the change in goodness of fit evaluated simultaneously across each country sample. Based upon the previous CFAs done for each country separately, it was decided that we would test a model with only the factor loadings constrained as factor inter-correlations varied across samples and appeared likely to contribute to poor model fit. Table 15 presents the freely estimated standardized regression weights for the country samples along with essential fit statistics. Table 16 provides an overview of the model invariance testing analyses that were done. For the first analysis, the baseline model (i.e., the model with freely estimated loadings) was compared to the constrained model for all eight countries. For the baseline model, CFI and RMSEA values reflect adequate fit though the chi-square emerged significant. For the constrained model, chi-square remained significant and the CFI value falls below .90. Comparison of the change in chi-square across the two models indicates that the constrained model reflects a significantly poorer fit suggesting non-invariance.
While Byrne [42] recommends systematically modifying and testing the constraints in a model to identify elements that are invariant versus non-invariant across samples, we reasoned that such an approach was not practical in the case of our study as there are simply too many comparisons to be made with a 30 item test across eight samples. Instead, we adopted the approach of examining the same constrained model with different sets of country samples as we saw this as being more consistent with our hypotheses. In this vein, we compared our baseline model to a constrained model using the four samples which completed the ESI-R in English (i.e., American, Canadian, Indian, and Ugandan). The CFI and RMSEA for both models reflect adequate fit though the change in chi-square still emerged significant. We next used just the American, Canadian, and Indian samples with the same result (i.e., good CFI and RMSEA but significant change in chi-square). We did the same analyses comparing just Americans and Canadians, Americans and Indians, and Canadians and Indians, respectively. In all cases, the same pattern of findings was obtained; the baseline model and constrained model showed adequate CFI and RMSEA values but the change of chi-square came out significant with the constrained model always demonstrating poorer fit. We then completed analyses comparing the Polish and Slovakian samples, and the Korean and Japanese samples, respectively. With these, though the RMSEA was still satisfactory, the CFI was below .90 for both the baseline and constrained models. Regardless, in both instances, the constrained model was found to produce a significantly poorer fit.
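The model comparisons described above rest on the chi-square difference test for nested models: the difference in chi-square between the constrained and baseline models is referred to a chi-square distribution whose degrees of freedom equal the difference in model degrees of freedom. The following sketch shows the arithmetic. All fit values in it are hypothetical (the study's values are reported in Table 16), the function names are illustrative, and the p-value uses the Wilson-Hilferty normal approximation so that only the Python standard library is needed; an exact chi-square routine would normally be preferred.

```python
import math


def chi_square_sf_approx(x, df):
    """Approximate P(X > x) for a chi-square variable (Wilson-Hilferty)."""
    if x <= 0:
        return 1.0
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of N(0, 1)


def chi_square_difference(chi2_free, df_free, chi2_constrained, df_constrained):
    """Compare a constrained model against its freely estimated baseline."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    return delta_chi2, delta_df, chi_square_sf_approx(delta_chi2, delta_df)


# Hypothetical fit values for a two-group comparison (not from Table 16).
delta_chi2, delta_df, p_value = chi_square_difference(
    chi2_free=1000.0, df_free=790, chi2_constrained=1060.0, df_constrained=815
)
print(f"delta chi-square = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.4f}")
```

A small p-value here means that constraining the loadings to equality worsens fit by more than chance would predict, which is the pattern the text interprets as evidence of non-invariance.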

Analyses Involving the Spirituality Adjective List (SAL) and ESI-R
To further evaluate whether or not the findings reported thus far were the product of the type of test used (and by association the type of item and test development strategy), exploratory principal component analyses were used to examine the internal structure of the SAL with the American, Indian, and Ugandan samples. In all cases, the analyses were set to extract and orthogonally (varimax) rotate five components. To ascertain their association to the ESI dimensions, regression based component scores were calculated and used in a correlational analysis. The rotated component loading coefficients can be found in Table 17. The correlations of the component scores to the ESI-R dimensions are reported in Table 18. Examination of Table 17 reveals that for all three country samples, all five components house elevated loadings (i.e., loadings of .30 or greater) for at least four SAL items. Also, while there appear to be a large number of differences in the pattern of item loadings across the three countries, there are also some points of similarity which find corroboration in the correlations with the ESI-R as per Table 18. In particular, component one for the American sample and component two for both the Indian and Ugandan samples produce their highest correlation with ESI-R Existential Well-Being (r = .65, .61, and .27, respectively, all p<.001) and have common loadings from items 3, 4, 5, 11, 12, 17, 35, and 39. All of these items concern positive self-evaluation (e.g., understanding self, feeling happy and content, sense of completeness or wholeness, identification of self as moral). Component two from the American sample and component one from both the Indian and Ugandan samples produce their highest correlation with ESI-R Religiousness (r = .85, .77, and .72, all p<.001) followed by ESI-R Cognitive Orientation toward Spirituality (r = .67, .56, and .71, all p<.001) and share elevated loadings with items 8, 9, 10, 22, 23, 31, 37, and 40.
The content of these items revolves around putative religious beliefs and behavior (e.g., self identification as being religious and blessed, participation in activities such as prayer and church services, and beliefs in God or a higher power). Component five from all three analyses shows its strongest correlation with ESI-R Paranormal Beliefs (r = .60, .55, and .39, p<.001 for American, Indian, and Ugandan samples, respectively) and has strong loadings from items 13, 16, 24, and 27. All four of these items have content which has obvious ties to paranormal beliefs (e.g., belief in life after death, supernatural powers, and ghosts). Components three and four from all three solutions show much less similarity to each other in terms of the pattern of high item loadings and their correlations with the ESI-R dimensions. In terms of the correlations, for the Indian and Ugandan samples, though some statistically significant coefficients were obtained, the correlations are generally of small magnitude. For the American sample, no significant correlations were obtained between component four and any ESI-R dimension but, unlike the other samples, the correlations with component three were statistically significant and of medium magnitude for four of the five ESI-R dimensions (i.e., all but ESI Paranormal Beliefs). Taken as a whole, these findings suggest that three of the ESI-R dimensions (i.e., Religiousness, Existential Well-Being, and Paranormal Beliefs) find fairly good representation in the SAL components. ESI-R Cognitive Orientation toward Spirituality (COS) also seems to be implicated as it is highly correlated with the SAL component reflective of Religiousness in a manner similar to what is seen in the ESI-R. However, unlike the ESI-R where COS is a discrete dimension, this is not observed with the SAL. Only the ESI-R Experiential-Phenomenological Dimension does not find any approximate manifestation in the SAL items.
In consideration of these results, we elected to create three subscales with the SAL items common to all three country samples so as to see if they function adequately in terms of reliability and if they produce a similar array of associations as found with the ESI-R. Descriptive and reliability statistics, SAL subscale inter-correlations, and correlations with the ESI-R dimensions can be seen in Table 19.
Analyses show a pattern of findings that is generally consistent with what was found for the ESI-R though some deviations are noted. First, one-way ANOVAs uncovered significant findings for all three SAL subscales with effect sizes generally on par with what was seen with ESI-R dimensions assessing the same constructs (SAL Religiousness: F(2, 1419) = 54.84, p<.001, η² = .07; SAL Existential Well-Being: F(2, 1419) = 35.66, p<.001, η² = .05; SAL Paranormal Beliefs: F(2, 1419) = 160.53, p<.001, η² = .19). Post-hoc analyses (Scheffe test) showed that all country samples were significantly different from one another for all three SAL subscales. Second, reliability analyses indicate that the three SAL subscales produce mostly satisfactory inter-item consistency coefficients and good corrected item-to-scale total correlations. The only exception was the SAL Paranormal Beliefs subscale which generated a marginal alpha for the Ugandan sample. Third, in terms of associations with demographic variables, akin to the ESI-R, the SAL subscales produce a pattern of small correlations with age. With sex, SAL Religiousness produced significant and moderately sized correlations in all three country samples while the remaining two SAL subscales generated small coefficients. Fourth, for all three country samples, the SAL subscales produced correlations with the ESI-R dimensions supportive of convergent validity (e.g., scales of the same name from both measures produce their strongest associations with each other), though not discriminant validity (e.g., SAL subscales produce moderate to strong correlations with more than one ESI-R dimension in virtually all cases). The only results which appear to diverge from those with the ESI-R concern the intercorrelations of SAL subscales; for all three country samples, SAL Religiousness was found to produce moderately sized significant correlations with both SAL Existential Well-Being and SAL Paranormal Beliefs.
Also, the correlations between SAL Paranormal Beliefs and SAL Existential Well-being, while small in magnitude, are positive while they came out negative with the ESI-R.

Discussion
This investigation offers a wealth of information that has substantive ramifications for the cross-cultural study of spirituality. Related to our research expectations, results provide generally satisfactory support for the first three hypotheses, no support for our fourth expectation and mixed support for our fifth. To elaborate, consistent with the first hypothesis, the ESI-R appears to demonstrate reasonably good face validity with a notable number of participants responding "agree" or "strongly agree" to ESI-R item 31 (i.e., "this test appears to be measuring spirituality") and relatively few across the country samples responding "disagree" or "strongly disagree." Though the interpretation of this outcome is tempered by the significant ANOVA result which found a multitude of pairwise differences across country samples, the overall implication is that the ESI-R is seen by a substantial number of people with differing cultural and/or linguistic backgrounds as measuring something akin to what they consider as "spiritual." As per the second hypothesis, the ESI-R was found to produce satisfactory reliability coefficients and corrected item-to-scale correlations for the pooled sample and mostly adequate alphas and correlations for the country samples separately. In terms of inter-correlations between dimensions, all samples generated significant correlations of moderate to high magnitude between COS, REL and EPD, and EPD and PAR. A noteworthy and unexpected trend, however, was observed with the samples that had the lowest representations of Christians (i.e., Indian, Japanese, Korean). Specifically, significant associations were found between PAR and REL, and PAR and COS which were not observed with the remaining samples that were more Christian dominant.
Lastly, correlations of the ESI-R dimensions with demographic variables show similar trends across most country samples; with the exception of the Koreans, coefficients were of generally low magnitude and in a direction indicating that older age and being female were associated with higher scores (especially with COS and REL). In the case of the Korean sample, correlations tended to be of more moderate size with both age and sex.
Third, evidence of factor replicability, configural invariance, and superiority of a five factor over a four factor model was provided by the EFAs and CFAs with the pooled sample and with the CFAs done for each country separately. With the former analyses, ESI-R items loaded in a manner very similar to MacDonald [12] both with and without the Canadian sample included in the analysis. In the CFAs, while loadings were significant for all items in both the four and five factor models, the five factor model displayed a significantly better fit to the data as reflected in both the change in chi-square and virtually all other fit indices. For CFAs involving the country samples, item loadings were ubiquitously significant for all models tested, but the five factor model consistently demonstrated better goodness-of-fit. With that stated, inspection of numerous fit indices for the five factor model for each country separately suggests that the model demonstrates elements of misfit for the Korean, Japanese, Polish, Slovakian, and Ugandan samples. Nevertheless, modification indices were examined and there were no indications of how the model could be meaningfully respecified in a congruent manner across all samples, so the five factor model appears to be the most defensible.
The fourth hypothesis which predicted that the ESI-R would demonstrate measurement invariance was not corroborated by our findings. Tests comparing an equality constrained model to a freely estimated correlated five factor model consistently revealed that the constrained model had significantly poorer fit. Also, significant differences were found at the item and dimension level as a function of country as per ANOVA findings.
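The comparison between an equality constrained model and a freely estimated model described above is commonly carried out as a chi-square difference (likelihood ratio) test for nested models. The sketch below illustrates that general logic only. The fit values in the example are invented placeholders, not results from this study, and the function names are ours. It also assumes ordinary maximum likelihood chi-squares; robust (scaled) estimators would require a scaled difference test instead.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate, standard library only.

    Builds the regularized upper incomplete gamma function Q(df/2, x/2) from
    the recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1),
    starting from Q(1, y) = exp(-y) for even df or
    Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        # Work on the log scale so large statistics do not overflow.
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)


def chi2_difference_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    if delta_df <= 0 or delta_chi2 < 0:
        raise ValueError("the constrained model must be nested in the free model")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Hypothetical fit statistics, for illustration only (NOT from the paper).
delta_chi2, delta_df, p_value = chi2_difference_test(
    chi2_constrained=5200.0, df_constrained=3335,
    chi2_free=4800.0, df_free=3160,
)
print(f"delta chi2 = {delta_chi2:.1f}, delta df = {delta_df}, p = {p_value:.3g}")
```

A significant result means that constraining loadings to equality worsens fit beyond what sampling error would explain, which is the pattern the text reports as evidence against measurement invariance.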
The fifth research expectation, namely that the SAL would produce an internal structure which emulates the ESI-R dimensions, found some support as factors generally corresponding to ESI-R Religiousness, Paranormal Beliefs, and Existential Well-Being were found. ESI-R Cognitive Orientation toward Spirituality was also observed to generate notable associations with the same factor as Religiousness, a result which seems consistent with what we found with each and every country sample in this study. Though the ESI-R Experiential/Phenomenological Dimension was not clearly represented in the SAL factor structure, examination of SAL items reveals a complete absence of content related to subjective spiritual experiences, so this finding makes some sense. In addition, when three subscales were created for the SAL, they demonstrated a pattern of findings in terms of reliability and correlations with demographic variables mostly similar to what was seen for the ESI-R. One point of divergence between the SAL and ESI-R, however, concerned the inter-correlations of subscales. With the SAL, Religiousness was found to produce significant and moderately sized coefficients with both Paranormal Beliefs and Existential Well-Being for all three country samples. Such associations were not found with the ESI-R. All the same, many of the findings with the SAL seem to fall in line with our expectations.
So what do all of these findings tell us? In general, it appears that when defined and assessed quantitatively, spirituality may be viewed as a viable concept which empirically behaves in a similar manner across cultures. It also seems that spirituality is best treated as a multidimensional construct made up of related but unique components. While the number of these components was found to vary as a function of the inclusiveness of item content seen in measures employed in the present study, based upon the results using the Expressions of Spirituality Inventory-Revised, it may be argued that spirituality is comprised of at least five dimensions. At the same time, the results indicate quite clearly that spirituality is not a concept that "transcends" culture and holds a firm universality of meaning. Rather, it seems the opposite holds true; the specific meaning ascribed to spirituality appears to be intrinsically bound by culture and cannot be fully understood without consideration given to cultural factors. That is, while there are similarities, spirituality is not the same across cultures. The significance of our results seems quite apparent: nomothetic approaches to the study of spirituality are at best incomplete and at worst run the risk of misrepresenting the construct and any associations claimed to exist between it and other aspects of functioning. Accordingly, a concrete recommendation for future research is for investigators to be mindful of the role and influence of culture and to augment quantitative methods based solely on self-report questionnaires with other hard quantitative procedures (e.g., direct behavioral observation, neurophysiological measures; see [97] regarding the latter) and, as importantly, qualitative data gathering strategies which permit culture-specific content to be procured and concurrently analyzed.
We offer this suggestion not just for research using samples drawn from different nation states but also for studies using samples of different ethnicities obtained within more pluralistic societies (e.g., United States, Canada, United Kingdom). As well, we strongly suggest that any and all empirical findings generated with samples obtained from one culture be tested and replicated with samples taken from several other cultures prior to making any claims regarding generalizable scientific knowledge. In this vein, we encourage investigators throughout the world to challenge and expand upon our findings using samples from the same and different cultures. The manner in which we report our results in this paper (e.g., item and dimension descriptives, factor loadings for pooled and separate samples) was deliberately chosen to facilitate direct comparisons with other samples.
Our findings hold other important implications. First, while not demonstrating measurement invariance, the five dimensional model of MacDonald [12] did receive support for its configural invariance and its pattern of associations with the SAL was similar across cultures. Given that the dimensions have been found to be differentially related to psychological functioning [91], it seems reasonable to conjecture that such results may also be manifested in studies with different cultures. Though this is an empirical question which would be best answered by future cross-cultural research, when considering the current state of the science, it seems necessary if not prudent at the present time to discourage investigators and practitioners from characterizing the association of spirituality to functioning in solely positive terms [98] as it appears likely that any link found may be a product of how spirituality is defined and measured [99]. Future studies need to either be more inclusive in terms of what they are considering to be spirituality or acknowledge up front that they are only focusing on specific facets of the construct domain.
A second notable implication concerns the results involving the demographic variables. In particular, age and sex were found to be significantly correlated with at least one dimension or scale from both the ESI-R and the SAL for every cultural sample. In addition, significant interaction effects were obtained between culture and age and/or sex for all ESI-R dimensions save Existential Well-Being. Considering past research which has uncovered such associations with the original ESI and other measures [18,100,101], it seems reasonable to argue that spirituality may not only differ in precise meaning across cultures but also across age and sex and as a function of the interaction of all three of these variables (however, see [17]). Clearly, more studies are needed to fully substantiate this interpretative possibility. Nevertheless, it appears as though our findings further challenge assertions regarding the universality of spirituality and call attention to the need to better account for its diversity of experience and expression in research and application as a function of individual differences [102].
There is one other aspect of our results which deserves mention and it concerns Existential Well-Being. Though many operationalizations of spirituality have been criticized on the grounds that they are confounded with well-being [14] and available evidence suggests that ESI-R Existential Well-Being (EWB) may be best treated as something separate from spirituality [15], results in the present study give some reason to reflect more on the issue. In particular, while we observed that ESI-R EWB was modestly associated with the other ESI-R dimensions for all cultures, our findings with the SAL indicate that something akin to existential well-being comprises a replicable component which correlates moderately with other elements of spirituality, most notably Religiousness and Cognitive Orientation toward Spirituality. Since the SAL was developed based upon written narratives describing a spiritual person provided by a sample of Canadian university students and the data we analyzed came from three differing cultural samples (i.e., American, Indian, and Ugandan) of students, it seems as though existential well-being may not be a confound but rather an essential characteristic seen to be linked to higher levels of spirituality. If understood in this light, then the place of existential well-being within the content domain of spirituality may be reframed in terms of a directional relationship with the other dimensions. That is, existential well-being may be construed as an outcome variable concerning how one evaluates one's own functioning as a product of the other spiritual dimensions.
In fact, this is something that has already been proposed. MacDonald [71] proffered a directional model with the ESI/ESI-R dimensions wherein Religiousness and the Experiential/Phenomenological Dimension comprise core social and biological factors that have a co-deterministic influence on the emergence and incorporation of cognitive schema into a person's sense of self and reality (Cognitive Orientation and Paranormal Beliefs). In turn, he conjectured that all four of these dimensions have a bearing on how a person perceives and appraises his/her quality of life as manifested in existential well-being. Though not the focus of any empirical scrutiny to date, this bio-social-psychological model represents a promising development as it elevates the ESI/ESI-R dimensions beyond the mere description of spirituality to a level of organization, rigor, and explanatory power that fully integrates all of the dimensions in a compelling and scientifically testable way.

Limitations
Notwithstanding the many key findings, the present study may be seen to suffer from a variety of limitations that need to be kept in mind when critically considering the meaning and generalizability of the results. First, all participants across all cultures were university students. While it may be argued that the consistent use of students served as a basis to make more apt comparisons across samples (e.g., it increased internal validity), the fact of the matter is that students differ from the general adult population in almost any country in terms of age, experience, and socioeconomic status [23,103]. As a result, there is a strong need for research to be done across cultures with samples drawn from more diverse adult populations. Second, unevenness of sample sizes may be viewed as having a deleterious effect on the stability of our results, especially for those samples that are relatively small (i.e., Japanese and Slovakian). Future investigations replicating and extending our findings with larger samples are needed. Third, even though we included two measures of spirituality developed via different means in an effort to see if test construction strategy and test content had an impact on our results, both instruments are descriptive and not theory based. As well, our study lacked additional criterion measures that could have been used to better determine if the ESI-R and SAL produce similar patterns of associations across cultures. Behavioral rating variables such as frequency of attendance at religious events and/or frequency of engagement in private spiritual activities (e.g., prayer or meditation) and a broader array of health variables would be good to incorporate into any future cross-cultural studies as would a range of measures that are both descriptive and theory-driven.
Fourth, despite evidence generally supporting the structural invariance of MacDonald's [12] five dimensional model, our findings showed that the instrument demonstrated greater problems with goodness-of-fit with non-English language samples. While we attempted to ensure that translations were done adequately in terms of preservation of essential content and meaning of the items, we could have done more to better evaluate linguistic equivalence and cultural adaptedness prior to data gathering (e.g., we could have completed pilot testing of the translated ESI to identify potential problems with the translations) [104-106]. As a result, we cannot conclude with certainty whether the relatively poorer fit observed with non-English language samples was due to inadequacies with the translations or to bona fide cultural differences.

Conclusions
Spirituality is an area of human functioning that has garnered greater attention and legitimatization and will undoubtedly continue to be the focus of research for many years to come. Nevertheless, the philosophical and methodological challenges it presents to science are substantial and should not be ignored. The present study serves to elevate our awareness of these complexities by highlighting the importance of culture and language in how spirituality is conceptualized, operationalized and measured. It is our sincere hope that the findings from this investigation contribute to a more sensitive and socioculturally contextualized approach to theory development and inquiry.