Evaluating the psychometric quality of school connectedness measures: A systematic review

Introduction: There is a need to comprehensively examine and evaluate the quality of the psychometric properties of school connectedness measures to inform school-based assessment and intervention planning.
Objective: To systematically review the literature on the psychometric properties of self-report measures of school connectedness for students aged six to 14 years.
Methods: A systematic search of five electronic databases and gray literature was conducted. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) taxonomy of measurement properties was used to evaluate the quality of studies, and pre-set psychometric criteria were used to evaluate the overall quality of psychometric properties.
Results: The measures with the strongest psychometric properties were the School Climate Measure and the 35-item version of the Student Engagement Instrument, which explore eight and 12 (of 15) school connectedness components, respectively.
Conclusions: The overall quality of psychometric properties was limited, suggesting that the available school connectedness measures require further development and evaluation.


Introduction
The concept of school connectedness has received growing attention from researchers and educators in recent years due to its reported impact on health, social and academic outcomes [1][2][3]. Students who have a stronger sense of school connectedness are more likely to engage in socially appropriate behaviours, have higher levels of self-esteem, obtain better grades, display acceptable conduct at school and graduate than students with a lower sense of school connectedness [4][5][6][7]. Longitudinal research suggests that a stronger sense of school connectedness in early schooling reduces engagement in risk behaviours such as smoking, marijuana use, alcohol consumption and sexualised behaviour in later schooling [2,8,9,10]. Recent evidence also suggests that students with a lower sense of school connectedness are more likely to experience clinical anxiety and depression during their schooling and in later life [3,11].
School connectedness presents an attractive focus for educators, school psychologists and researchers as it is a subjective concept that is amenable to change through the provision of appropriate school-based supports [8,12]. School connectedness literature is being used widely to inform the development of school-based interventions, as well as to inform educational policy and reform [13,14]. The Australian Early Years Learning Framework [15] is an example of this; it is centred around the notion that for students to experience learning that is engaging and supportive of success in later life, they first need to have a sense of belonging to their school community. As such, there is a need for valid and reliable measures to assess the effectiveness of school-based interventions targeting school connectedness, in order to minimise the documented long-term impacts of reduced school connectedness on students' academic success and socio-emotional wellbeing. Furthermore, access to school connectedness measures with sound psychometric properties will assist in gaining further evidence to support the use of school-based interventions and in informing educational policy and reform.

School connectedness: Theoretical underpinnings and definition
Despite growing interest in the concept of school connectedness, there is considerable debate regarding its definition. Many terms have been used interchangeably in the literature to describe school connectedness, including school climate, belonging, bonding, membership and orientation to school [16,17]. As a result, the operationalisation and measurement of school connectedness have been challenging.
Theoretical models of school connectedness are most commonly embedded within psychology literature. Deci and Ryan's [18] self-determination theory is regularly referred to within school connectedness literature [19][20][21][22][23]. This theory proposes that for an individual to be motivated and to function optimally, a set of psychological needs (relatedness, competence and autonomy) must be supported [18]. Relatedness refers to a need to feel a sense of belonging with peers and teachers [18,24]. Competence is the need to feel capable of learning, and autonomy is the need to feel that you have choice and control at school [18,24]. These three innate psychological needs are often cited to account for human tendencies to "...engage in activities, to exercise capacities and to pursue connectedness in social groups" [24], all of which are foundational skills in developing students' sense of school connectedness. Self-determination theory suggests that students with a strong sense of relatedness or belonging to their peers, teacher and school community are in a better position to learn and more likely to perform better at school due to improved wellbeing and resilience. Furthermore, students who perceive their school environment to be fair, ordered and disciplined, and who feel in control of their academic outcomes at school, are more likely to engage and feel connected at school. Deci and Ryan's [18] self-determination theory illuminates the impact that affective, behavioural and cognitive factors have in supporting or hindering a student's sense of school connectedness.
Early research relating to school connectedness focused on its affective aspects [17,25]. Affective engagement, also referred to as psychological and emotional engagement, refers to a student's feelings towards his/her school, learning, teachers and peers [17,25,26]. Affective engagement is accurately captured in Goodenow's [27] definition of school connectedness, which is the "...extent to which a student feels personally accepted, respected, included and supported by others" [27] in the school environment. This definition, however, does not take into consideration behavioural and cognitive factors that can also impact a student's sense of school connectedness, which have been explored in more recent school connectedness literature. Behavioural engagement includes observable student actions of participation while at school and is investigated through student conduct, effort and participation [5,28,29]. Cognitive engagement, by contrast, includes students' perceptions and beliefs associated with school and learning [5,28,29]. That is, to feel connected to school the student must be actively involved in classroom and school activities, including school-organised extracurricular activities, and actively think about how they can involve themselves in the learning process at school. Wingspread's Declaration of School Connections [30], which describes school connectedness as a "...belief by students that adults in the school community care about students' learning and about them as individuals and can be represented by high academic expectations from teachers with support for learning, positive teacher-student interactions and feelings of safety" [30], more accurately captures behavioural and cognitive aspects of school connectedness.
Several reviews have focused on defining the meta-construct of school connectedness [7,25,31]. These reviews highlight that the construct of school connectedness has evolved over time, from a relatively simple construct focusing on students' general feelings towards school to a more complex multi-dimensional construct comprising not only students' feelings towards school, but also their perceptions and beliefs towards school and learning, and their involvement in classroom and playground activities and school events. Researchers in the field postulate that definitions of school connectedness should include the triad of indicators (i.e., affective, behavioural, and cognitive) and facilitators (i.e., personal and contextual factors) that influence connectedness [25]. Indicators "...convey a student's degree or level of connection with learning while facilitators are factors that influence the strength of the connection" [25]. Although this approach has been proposed, the authors of this study have not found a definition of school connectedness that fully encapsulates all of these components. Following an extensive review of the literature, the authors thematically categorised factors contributing towards students' sense of school connectedness under the affective, cognitive and behavioural domains illustrated in Table 1. For the purposes of this review, these domains and concepts will be subsumed under the broader construct of school connectedness. Collectively, the concepts in Table 1 are critical dimensions of students' experiences in school. Together, they are essential in promoting student development and overall academic success. These concepts are often targeted within individual and school-wide intervention strategies. As such, there is a need for measures that assess these school connectedness domains and constructs both cross-sectionally and longitudinally.

Measuring school connectedness
Not surprisingly, given the difficulties in defining school connectedness, the concept has been measured in various ways. The differences in the way the concept is measured are both theoretical and methodological. The theoretical background of the researcher often determines how school connectedness is measured. For example, Jimerson, Campos and Greif [31], with a background in psychology, identify and assess student motivation as an affective indicator of school connectedness, whereas Fredricks, Blumenfeld and Paris [7], with a background in educational psychology, identify it as a cognitive indicator. While motivation is an intrinsic process, it manifests itself extrinsically through student behaviour [32]. Therefore, the authors of this study have categorised student interest or motivation as a behavioural indicator of school connectedness (see Table 1).
The purpose of assessing school connectedness often determines how the construct is measured. Some measures have been developed specifically for the school context (e.g., What's Happening In This School [33]), whereas others extend their exploration to the home and community environment with subscales or items that refer to school (e.g., Adolescents Sense of Wellbeing Related to Stress [34]). Some measures have been developed specifically to assess students' sense of school connectedness in particular subjects such as maths, science or physical education (e.g., What's Happening In This Class (Singapore version) [35]). Some measures focus on assessing an individual student's sense of connectedness (e.g., Student Engagement Instrument [36]), whereas others aim to assess an individual's perception of connectedness at a classroom or school level (e.g., Classroom Environment Scale [37], Classroom Peer Context Questionnaire [38]). Schools conducting research into school connectedness will often tailor their measurement approach to their needs; for example, whether they want to gain an understanding of their school's sense of connectedness to inform funding allocation, or whether they want to identify individual at-risk students to inform the provision of school supports [39].
There is debate within the literature regarding whether self-report or proxy-report measures should be used when evaluating school connectedness [40]. Many would argue that the subjective nature of school connectedness makes it less amenable to third-party report [17,31]. For example, a teacher may observe a student playing with peers or engaging in the curriculum, but the student, for whatever reason, may not feel part of the school community. Self-report measures help to depict the student's personal perception of their experience at school. Teacher-report methods may be more suitable for capturing behavioural components of school connectedness that can be objectively observed, such as the student's level of effort or persistence at school [41]. As previously mentioned, students will experience a sense of connectedness when their needs for autonomy, competence and relatedness are met within the school environment [24]. The assumption is that students' feelings of being included and accepted at school, as well as the perception that they are making important contributions to the school community, help to create and maintain feelings of connectedness. Therefore, in order to gain an accurate depiction of students' sense of school connectedness, the use of student self-report measures is warranted and will be the focus of this review.
The differences in the way school connectedness is defined make it difficult to compare measures with each other in an attempt to identify the most valid and reliable tool to use in the school context. As children spend more time in schools than in any other place outside their homes, it is important to be able to validly and reliably assess student experiences within school so that appropriate supports can be provided [39]. Furthermore, it is important to be able to reliably measure this construct with students in early primary school, to prevent or minimise the documented long-term impacts of reduced school connectedness on student outcomes.
The COSMIN taxonomy has been successfully applied to more than 560 systematic reviews [42,43]. The COSMIN checklist is a standardised tool that can be used to critically appraise the methodological quality of studies reporting on the psychometric properties of measures [43]. The COSMIN checklist was chosen for this systematic review as it was developed following extensive international consultation and consensus among experts in the field of psychometrics and clinimetrics. The COSMIN was used in the current review to compare the psychometric properties of existing school connectedness measures, originally developed in English, that capture affective, cognitive and behavioural domains of school connectedness using self-report methods for students aged six to 14 years. It is expected that this systematic review will assist in the choice of instruments measuring school connectedness by providing an objective account of the strengths and weaknesses of self-report measures available for school-aged children.

Methods
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guided the methodology and writing of this systematic review. The PRISMA statement is a 27-item checklist that is deemed essential in the transparent reporting of systematic reviews [44]. A completed PRISMA checklist for the current review is accessible (see S1 Table).

Eligibility criteria
Research articles, published manuals and reports detailing the psychometric properties of self-report instruments designed to measure school connectedness of students aged six to 14 years were deemed eligible for inclusion in this review. To be included, abstracts and instruments needed to address all three school connectedness domains (i.e., behavioural, affective and cognitive); address at least five of 15 concepts within school connectedness domains (see Table 1); provide validity evidence for students aged six to 14 years; be specific to the school context; have psychometric properties published within the last 20 years; and be written in English. Psychometric properties published more than 20 years ago were deemed outdated. Measures were excluded if the full text of the article was not retrievable, or if they were specific to a subject area (e.g., maths or science) or a student population (e.g., students with craniofacial abnormalities). Measures that provided validity evidence for students requiring special education assistance were included in the review, as long as the sample also included typically developing students. Dissertations, conference papers and review papers were excluded as they are not peer reviewed, and the search yielded sufficient results.

Information sources
The first systematic literature search was performed on 13th June 2016 by two authors using the following five electronic databases: CINAHL, Embase, ERIC, Medline and PsycINFO. Subject headings and free text were used when searching each database. A gray literature search was also conducted using Google Scholar and PsycEXTRA between 21st and 27th July 2016 to identify additional measures. See S2 Table for a complete list of search terms used across all searches. A second literature search was conducted on 18th September 2016 using the title of each measure and its acronym in CINAHL, Embase, ERIC, Medline and PsycINFO to identify additional psychometric articles not identified in the first search. To be comprehensive, websites of publishers of assessments in education and social science, such as Pearson Education, ACER and Academic Therapy Publications, were also searched.

Study selection
Abstracts were reviewed using three dichotomous scales to determine (a) if the study involved students aged between 0 and 18 years (yes/no), (b) if the instrument measured school connectedness or related terms (e.g., group membership, learner engagement, school community relationship, student participation, school involvement) (yes/no) and (c) if the study reported on the psychometric properties of the measure (yes/no). Results from the three dichotomous scales were then combined to generate a single ordinal scale from 0 to 3, with 0 indicating that the abstract did not meet any criteria and 3 indicating that the abstract met all three criteria. A random sample of 40% of abstracts was generated using an electronic random allocator (www.random.org). Based on previous systematic reviews using COSMIN [45][46][47], this percentage was deemed sufficient to detect systematic error. The random sample was reviewed by the primary author and an independent rater to establish inter-rater reliability, which was deemed excellent: Weighted Kappa = 0.814 (95% CI: 0.791-0.836). Abstracts that did not meet any of the criteria or met only one of the criteria were excluded from the study. Abstracts that met two or three of the criteria were reviewed a second time and discussed by the primary author and independent rater to gain consensus and ensure that only studies meeting all eligibility criteria were included in full-text review. The primary author then rated the remaining abstracts and the 132 full-text articles meeting all three criteria. Articles were excluded if the full text did not meet criteria (see Fig 1). Scoring a random sample of abstracts first allowed the researcher to learn from the process and avoid systematic errors.
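The screening rule and the agreement statistic described above can be illustrated with a minimal sketch. The function names are hypothetical, and the review does not state which weighting scheme was used for the weighted kappa; linear weights are assumed here purely for illustration.

```python
def screening_score(aged_0_to_18, measures_connectedness, reports_psychometrics):
    """Combine the three yes/no screening criteria into an ordinal score (0 to 3)."""
    criteria = (aged_0_to_18, measures_connectedness, reports_psychometrics)
    return sum(1 for met in criteria if met)


def screening_decision(score):
    """Abstracts scoring 0 or 1 are excluded; 2 or 3 go to a second, joint review."""
    if score not in (0, 1, 2, 3):
        raise ValueError("score must be an integer from 0 to 3")
    return "exclude" if score <= 1 else "second review"


def weighted_kappa(rater_a, rater_b, n_categories=4):
    """Weighted kappa for two raters scoring the same items on 0..n_categories-1.

    ASSUMPTION: linear disagreement weights |i - j| / (k - 1); the source
    does not say whether linear or quadratic weights were applied.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    k = n_categories
    # Observed joint proportions of (rater A score, rater B score).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        return abs(i - j) / (k - 1)

    disagreement_observed = sum(
        weight(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    disagreement_expected = sum(
        weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if disagreement_expected == 0:
        # Both raters used a single identical category throughout.
        return 1.0
    return 1.0 - disagreement_observed / disagreement_expected
```

For example, an abstract meeting criteria (a) and (c) but not (b) scores 2 and is passed to the second review rather than excluded.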

Data collection process and data extraction
Information from articles was extracted under the following descriptive categories: purpose of the measure, number of subscales, total number of items, response options and time to complete, article reference and sample characteristics. The information extracted from articles was guided by the Cochrane Handbook for Systematic Reviews [48] Section 7.3a and the Systematic Reviews Centre for Reviews and Dissemination [49].

Methodological quality
The methodological quality of included studies was assessed using the COSMIN taxonomy of measurement properties and definitions for health-related patient-reported outcomes [43,50]. The COSMIN checklist is a standardised tool and consists of nine domains: internal consistency, reliability (including test-retest reliability, inter-rater reliability and intra-rater reliability), measurement error, content validity (including face validity), structural validity, hypotheses testing, cross-cultural validity, criterion validity and responsiveness [43]. Refer to Table 2 for the definitions of all psychometric properties as defined by the COSMIN statement [50]. Responsiveness was not evaluated as a psychometric property as it would have increased the size of the review substantially and was deemed outside the scope of this review. Criterion validity was also not evaluated due to the absence of a 'gold standard' measure of school connectedness. Cross-cultural validity was not evaluated as instruments included in the review were developed and published in English. Interpretability is not considered to be a psychometric property under the COSMIN framework and was therefore not described or evaluated in this review.
Each domain of the COSMIN checklist includes 5 to 18 items focusing on various aspects of study design and statistical analyses. A 4-point rating scale proposed by Terwee et al. [51] enables an overall methodological quality score, from poor to excellent, to be obtained for each measure. Terwee et al. [51] suggest taking the lowest rating of any item in the domain as the final quality rating; however, this makes it difficult to differentiate between subtle psychometric qualities of assessments. Therefore, a revised scoring system was applied and presented as a percentage: Poor (0-25.0%), Fair (25.1-50.0%), Good (50.1-75.0%) and Excellent (75.1-100.0%) [47]. As some COSMIN items only have an option to rate as good or excellent, the total score for each psychometric property was calculated using the formula detailed below, to accurately capture the quality of psychometric properties [43]:
After the studies were assessed for methodological quality, the quality of psychometric properties was evaluated using modified criteria by Terwee et al. [51] and Schellingerhout et al. [52]. A summary of the criteria used for rating the quality of internal consistency, content validity, structural validity and hypothesis testing is detailed in Table 3. Finally, each measurement property for all instruments was given an overall score using criteria set out by Schellingerhout et al. [52]. An overall quality rating was created by combining the study quality scores measured by COSMIN and the psychometric quality ratings as measured by Terwee et al. [51] and Schellingerhout et al. [52]. This method has been used successfully in previous psychometric reviews [45,53]. The COSMIN checklist [51] and the Terwee et al. [51] and Schellingerhout et al. [52] criteria accommodate studies that use both Classical Test Theory (CTT) and Item Response Theory (IRT) methodology.
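The scoring formula itself does not appear in this text. As a hedged illustration only, the sketch below assumes a min-max normalisation of the summed item ratings (an assumption, not confirmed by the source) and then maps the resulting percentage onto the four bands stated above. All function and parameter names are hypothetical.

```python
def percentage_score(obtained, minimum_possible, maximum_possible):
    """Rescale a summed COSMIN item score to a percentage.

    ASSUMPTION: min-max normalisation, so that domains whose items can only
    be rated good or excellent are not inflated by their higher minimum.
    The source does not reproduce its formula.
    """
    if maximum_possible <= minimum_possible:
        raise ValueError("maximum_possible must exceed minimum_possible")
    if not minimum_possible <= obtained <= maximum_possible:
        raise ValueError("obtained score lies outside the possible range")
    return 100.0 * (obtained - minimum_possible) / (maximum_possible - minimum_possible)


def quality_band(percentage):
    """Map a percentage onto the bands given in the text (Poor to Excellent)."""
    if not 0.0 <= percentage <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if percentage <= 25.0:
        return "Poor"
    if percentage <= 50.0:
        return "Fair"
    if percentage <= 75.0:
        return "Good"
    return "Excellent"
```

Under these assumptions, a domain with a possible range of 10 to 50 and an obtained score of 30 would yield 50.0%, which falls in the Fair band.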
To maximise consistency of ratings, the fifth author of this study, who has extensive experience in the area, provided training to the primary author and an independent rater on how to complete the COSMIN checklist and determine the quality of the psychometric properties. The first author scored all the papers. A random selection of 40% of COSMIN ratings and all psychometric quality ratings were scored by an independent rater. When ratings differed in category, both raters met until 100% consensus was achieved. The fifth author met with the two raters to resolve differences in ratings when a consensus could not be reached (Weighted Kappa: 0.886, 95% CI: 0.823-0.948).

Psychometric property | Definition (a)
Validity: the extent to which an instrument measures the construct/s it claims to measure.
Content validity | The degree to which the content of an instrument adequately reflects the construct to be measured.
Face validity (b) | The degree to which the items of an instrument appear to be an adequate reflection of the construct to be measured.
Construct validity | The extent to which the scores of an instrument are consistent with hypotheses, based on the assumption that the instrument is a valid measure of the construct being measured.
Structural validity (c) | The extent to which instrument scores adequately reflect the dimensionality of the construct to be measured.
Hypothesis testing (c) | Idem construct validity (i.e., as for construct validity).
Cross-cultural validity (c) | The degree to which the performance of items on a translated or culturally adapted instrument is an adequate reflection of the performance of the items in the original version of the instrument.
Criterion validity | The degree to which the scores of an instrument satisfactorily reflect a "gold standard".
Responsiveness | The capability of an HR-PRO instrument to detect change in the construct to be measured over time.
Interpretability (d) | The extent to which qualitative meaning can be given to an instrument's quantitative scores or score change.
Internal consistency | The level of correlation amongst items.
Reliability | The proportion of total variance in the measurements due to "true" differences amongst patients.
Measurement error | The error of a patient's score, systematic and random, not attributed to true changes in the construct measured.

Notes.
Scores: + = positive rating, ? = indeterminate rating, - = negative rating, ± = conflicting data, NR = not reported, NE = not evaluated (for studies of poor methodological quality according to the COSMIN rating, data are excluded from further evaluation).
(b) Doubtful design or method is assigned when a clear description of the design or methods of the study is lacking, the sample size is smaller than 50 subjects (should be at least 50 in every subgroup analysis), or there is any important methodological weakness in the design or execution of the study.
Hypothesis testing: all correlations should be statistically significant (if not, these hypotheses are not confirmed) AND these correlations should be at least moderate (r > 0.5).
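The hypothesis-testing criterion stated in the table notes (every correlation statistically significant and at least moderate, r > 0.5) can be expressed as a simple check. This is a sketch only: the significance threshold of 0.05 and the use of the absolute value of r are assumptions not stated in the source, and the function name is hypothetical.

```python
def hypotheses_confirmed(correlations, alpha=0.05, minimum_r=0.5):
    """Return True only if every (r, p) pair is significant AND at least moderate.

    ASSUMPTIONS: alpha = 0.05 and the magnitude |r| are illustrative choices;
    the source gives only "statistically significant" and "r > 0.5".
    """
    if not correlations:
        raise ValueError("at least one correlation is required")
    return all(p < alpha and abs(r) > minimum_r for r, p in correlations)
```

A study reporting r = 0.62 (p = 0.001) and r = 0.41 (p = 0.02) would not meet the criterion, because the second correlation is significant but below the moderate threshold.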

Data items, risk of bias and synthesis of results
All data items for each measure were obtained. Items that were not reported were recorded as 'NR'. Risk of bias was assessed at an individual study level using the COSMIN checklist. Studies that obtained a high rating were deemed to be at low risk of bias and studies that obtained a low rating were deemed to be at high risk of bias. Psychometric properties only received a 'positive' or 'negative' rating if clear and appropriate methodology was reported. If unclear or inappropriate methodology was used, an 'indeterminate' rating was recorded, providing further evidence for risk of bias. Ratings from individual studies and psychometric properties were then combined to create an overall rating for each psychometric property of each measure. Risk of bias is subsumed into the final results.

Systematic literature search
A total of 3,754 abstracts were retrieved from database searches, including duplicates. The total abstracts from subject heading and free text word searches across databases were: CINAHL = 656, Embase = 1,060, ERIC = 724, Medline = 789, PsycINFO = 525. Reference lists of included articles were searched for additional literature. A total of 1,763 duplicates were identified across the five databases and removed. After the removal of duplicate abstracts, a total of 1,991 articles were screened for inclusion in the review. Of these studies, 132 full-text articles on 87 measures were assessed for eligibility. Of these 87 measures, 15 met the inclusion criteria and 72 were excluded. Refer to S3 Table for an overview of the 72 excluded instruments and the reasons for exclusion. The references of two manuals were identified for two included instruments; however, because they were irretrievable they were not included in the review. Therefore, psychometric properties of 15 measures were obtained, which were assessed using 18 research articles and 1 research report.
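The counts reported in this paragraph are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Abstracts retrieved per database, as reported in the text.
database_hits = {
    "CINAHL": 656,
    "Embase": 1060,
    "ERIC": 724,
    "Medline": 789,
    "PsycINFO": 525,
}

total_retrieved = sum(database_hits.values())   # reported as 3,754
duplicates_removed = 1763
abstracts_screened = total_retrieved - duplicates_removed  # reported as 1,991

measures_assessed = 87
measures_included = 15
measures_excluded = measures_assessed - measures_included  # reported as 72

sources_used = 18 + 1  # 18 research articles plus 1 research report
```

Each derived value matches the figure stated in the paragraph: 3,754 retrieved, 1,991 screened after duplicate removal, and 72 excluded measures.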

Included school connectedness measures
Table 4 summarises characteristics of the 15 measures that met inclusion criteria and the articles reporting on their psychometric properties. All measures were developed and validated with typically developing students from a range of ethnic and socio-economic backgrounds in the United States, except for one, which was developed in New Zealand [54]. The majority of measures were developed with an adolescent sample (12 to 18 years), with only a small number of measures developed and validated with students under the age of 12 years [55,56]. Only three measures extended their samples to include students receiving special education services; however, these students made up less than 15% of the total sample [55,57,58,59]. The majority of studies had large sample sizes, with the median sample size being 1,642 (range of 77 to 47,488). All of the measures that met eligibility criteria were published after 1996. Of the 15 measures, 11 were published within the last 10 years (since 2006). All measures collected responses via pen-and-paper questionnaires and were conducted within the school setting. Some measures were administered verbally to students who identified as having English as their second language.
Table 5 summarises the domains of school connectedness measured by each instrument. The subdomains were categorised following a thematic synthesis by four members of the research team based on the definitions or descriptions of the scales and/or subscales in included studies. Subdomains were identified and subsumed under the most relevant domain: (1) affective (i.e., feelings of acceptance, belonging and inclusion; feelings of respect and being respected; value and importance of school; feelings of safety; sense of autonomy and independence; and academic self-efficacy), (2) cognitive (i.e., perceptions of teacher relationships and support; peer relationships and support; academic support; discipline, order and fairness; and the value parents place on school) and (3) behavioural (i.e., involvement, participation and engagement; effort and persistence; conduct; and interest and motivation). No single instrument measured all aspects of the affective, cognitive and behavioural domains of school connectedness. The instrument that measured the most aspects was the Student Engagement Instrument (i.e., the 35-item, 33-item and elementary versions) [36,55,57,60,61], which measured 12 of 15 affective, cognitive and behavioural components of school connectedness.

Psychometric properties
Table 6 summarises quality ratings of psychometric studies, and therefore risk of bias, as determined by COSMIN. All measures included in the review were found to have good to excellent study quality for internal consistency, structural validity and hypothesis testing, and poor to excellent study quality for content validity. Internal consistency and structural validity were the most frequently reported properties, having been described in 17 and 16 studies respectively. Content validity was described for eight measures and hypothesis testing for 10 measures. Five studies reporting on hypothesis testing described findings for more than one hypothesis. Of the 15 included instruments, six were revisions of earlier versions of measures of school connectedness (i.e., SEI-35 item [36], SEI-33 item [57,60,61], SEI-Elementary [55], Developmental Study Centre's School Climate Survey-Abbreviated Version [59], SPPCC-Adapted [54], SCM-Adapted [69]). These measures were evaluated separately as their item pool and response format had been changed. For 11 measures only single studies were identified. The SEI (33-item version) [57,60,61] and the SCM [67,68] had the most studies, reporting on psychometric properties in three research articles. Thirteen measures reported on two or more of six psychometric properties (average 3; range 1-4). The PSES [62] and the Developmental Study Centre's School Climate Survey (Full Version) [56] were the only measures to report on one psychometric property. Many measures had no published information relating to content validity, including the PSES [62], SESQ [13], SEI-33 item version [57,60,61], Developmental Study Centre's School Climate Survey (Full Version and Abbreviated Version) [56,59], SBI-R and SCM (Revised Version). The only study that was excluded from further analysis in the review was by Voelkl [65], for receiving a poor COSMIN rating for content validity. Refer to Table 7 for a summary of the quality of psychometric properties of included measures based on Terwee et al. [51] and Schellingerhout et al. [52]. Refer to Table 8 for a summary of the overall psychometric quality ratings per psychometric property for each measure as evaluated against the Schellingerhout et al. [52] criteria. A description of the criteria used to rate overall psychometric quality can be found in the notes section of Table 8.

Notes (Table 4).
* Purpose of measures: descriptive (i.e., describes current status, problems, needs and/or circumstances); discriminative (i.e., distinguishes between individuals or groups on a characteristic or underlying dimension); predictive (i.e., classifies individuals into pre-defined categories of interest); evaluative (i.e., detects magnitude of change over time within one person or a group of people after intervention). Refer to S1 File for further information about excluded publications and reasons for exclusion. https://doi.org/10.1371/journal.pone.0203373.t004

Discussion
There is no universally accepted definition of school connectedness; however, the construct is referred to regularly within the literature and is a key area in informing educational policy and reform [39]. The reliable and valid measurement of school connectedness is important to researchers and educators, as it enables the provision of appropriate school-based supports to minimise the documented long-term implications of reduced school connectedness for students' academic success and socio-emotional wellbeing. This systematic review provides a comprehensive summary of the quality of psychometric properties of self-report school connectedness measures available for students aged 6 to 14 years using the COSMIN taxonomy of measurement properties.

Quality of the studies using the COSMIN taxonomy
Construct validity, within the COSMIN taxonomy, comprises structural validity, hypothesis testing and content validity [43]. To confidently select and use measures in research it is important to understand "...how well [the] measure assesses what it claims to measure and how well it holds its meaning across varied contexts and sample groups" [45]. Construct validity supersedes all other psychometric properties in measurement development, as good reliability is irrelevant if the construct that an instrument measures is not well established. Many instruments are currently being used to assess school connectedness or related terms. Interestingly, however, the majority of studies in this review failed to adequately define or conceptualise the construct of school connectedness. Rather, studies focused on describing the methodology they used to develop the measure, including the statistical analyses used to test psychometric properties.
A lack of conceptualisation of school connectedness has made it difficult to: (a) adequately compare measures in this review; (b) determine if included measures fully operationalise the construct of school connectedness; and (c) determine whether students' sense of school connectedness has changed, or whether change is due to the evolving nature of the construct and the way it is understood currently by researchers and educators in the field. As illustrated in Table 5, none of the measures included in this review fully captures all aspects of school connectedness and, in addition, the quality of descriptions was lacking. The majority of studies included in this review failed to explicitly state the intended purpose of the measure; that is, whether the instrument was originally intended as an outcome measure to evaluate changes over time following the implementation of school-based supports or purely as a diagnostic tool to identify whether school-based supports are required. Without this information, researchers and educators may make inappropriate choices and misinterpret assessment findings, leading to errors in clinical judgement. Future research should focus on developing a universal definition of school connectedness and further validating included measures.
Test-retest, inter-rater and intra-rater reliability and measurement error were not reported for any measures included in this review. Given that psychological constructs, such as school connectedness, are relatively stable over time, it is important to utilise measures that have low error and are able to detect minor changes over time. Preliminary reliability testing is necessary to evaluate an instrument's responsiveness. Without this information, it is difficult to make evidence-based, informed choices when selecting measures in research. That said, some measures included in the review, such as the SSES [39], have been used in research to evaluate changes in school connectedness over time. Although responsiveness was not evaluated in this review, researchers and educators should exercise caution when using included measures due to a lack of information on their reliability. Some studies included in the review reported verbal administration of measures to students who identified as using English as their second language. This method of administration places a high demand on students' expressive and receptive language skills as well as their verbal comprehension and memory recall, resulting in a potential for error in the recorded true scores. Minor changes in question wording, question order or response format can result in different findings [40]. This method of questionnaire administration may have impacted the quality of findings in these studies. Furthermore, it is important to consider the inherent bias that exists with self-report measures. Student responses may be affected by their perception of support within their school: "...they may take into account social norms when responding, which may result in social desirability bias" [40]. Methods do exist to reduce this problem, such as assuring students of confidentiality and anonymity; however, this can increase students' suspicions about the sensitivity of the topic [40]. Many studies included in the review failed to explicitly state how measures were administered and/or did not report on efforts to minimise the impact of social desirability bias on data quality.
Although the focus of this review was to evaluate the psychometric properties of school connectedness measures for students aged 6 to 14 years, the samples of included studies largely comprised older students up to the age of 18 years. Students under the age of 12 years represented approximately 25% of samples in included studies. This calls into question the utility and appropriateness of these measures with younger student populations. When examining included measures in more detail, it was noted that many measures had lengthy item pools. For example, the Developmental Study Centre's School Climate Survey (Full Version) [56] and the SESQ [13] included 100 and 109 items respectively. Not only would these measures be time-consuming, they would also require a great deal of concentration for a young student to complete. It is important to be able to validly and reliably assess students' sense of school connectedness in early primary school in order to identify and support at-risk students and prevent the documented long-term implications of a lack of school connectedness for student outcomes. Future research should focus on validating included measures with younger students to ensure measures are age appropriate and can be reliably and validly used in this population.

Overall quality of psychometric properties
The overall quality of measurement properties critiqued in this study varied widely. The school connectedness self-report measures with the strongest psychometric properties were the SCM [67][68][69] and the 35-item version of the SEI [36]. The SCM [67][68][69] addressed eight of 15 school connectedness components (see Table 5) and reported on four of six psychometric properties (see Table 6); scoring strong positive ratings for content validity and hypothesis testing, a moderate positive rating for internal consistency and a conflicting rating for structural validity. The 35-item version of the SEI [36] reported on four of six psychometric properties; scoring strong positive ratings for internal consistency and content validity and indeterminate ratings for structural validity and hypothesis testing. Interestingly, however, the SEI [36] addressed the most (i.e., 12 of 15) school connectedness components of any measure included in the review; suggesting that the SEI [36] not only has promising psychometrics but encompasses a broader range of school connectedness components. The school connectedness measure with the poorest psychometric properties was the SPPCC [54], reporting on three of six psychometric properties; scoring strong negative ratings for internal consistency and structural validity, and conflicting results for content validity. Across all measures and measurement properties there were a number of conflicting ratings (14%), many indeterminate ratings (41%), and missing data (36%); suggesting more research is required to determine the psychometric qualities of these measures.
An in-depth discussion about the statistical frameworks used in included articles is outside the scope of this review; however, it is worth drawing readers' attention to the fact that none of the measures included in this review were tested at an item level using IRT. All measures were tested using CTT. A major limitation of CTT is its relatively weak theoretical assumptions and circular dependency; that is, "(a) the person statistic (i.e., observed score) is (item) sample dependent and (b) the item statistics are (examinee) sample dependent; which poses some difficulties in CTT's application in some measurement situations" [70]. IRT was developed to address the main limitations of CTT. However, IRT has its own limitations in that it is a complex model requiring much larger samples of participants compared to CTT [71]. Even with the need for larger samples, the benefits of IRT outweigh the singular use of CTT [70,71]. IRT assists in determining whether (a) a measure has any redundant items; (b) items are functioning sufficiently to adequately capture the construct of interest; and (c) the response format is operating appropriately [70]. Future research should test included measures using IRT to gain a more in-depth understanding of how measures function at an item level.
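The sample dependence of CTT item statistics quoted above can be illustrated with a minimal sketch. It assumes a dichotomous Rasch (one-parameter IRT) model, which is a simplification chosen for illustration and is not an analysis reported in any included study; the function names, trait values and item difficulty are all hypothetical.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of endorsing an item under a dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def ctt_item_p_value(abilities: list[float], difficulty: float) -> float:
    """Expected CTT item statistic: the proportion of a sample endorsing the item."""
    probabilities = [rasch_probability(theta, difficulty) for theta in abilities]
    return sum(probabilities) / len(probabilities)


# Hypothetical item with a fixed IRT difficulty parameter (assumption).
ITEM_DIFFICULTY = 0.0

# Two hypothetical samples that differ only in their level of the latent trait.
low_trait_sample = [-2.0, -1.5, -1.0, -0.5, 0.0]
high_trait_sample = [0.0, 0.5, 1.0, 1.5, 2.0]

p_low = ctt_item_p_value(low_trait_sample, ITEM_DIFFICULTY)
p_high = ctt_item_p_value(high_trait_sample, ITEM_DIFFICULTY)

# The same item looks "hard" in one sample and "easy" in the other,
# although its Rasch difficulty parameter never changed.
print(f"CTT p-value, low-trait sample:  {p_low:.2f}")
print(f"CTT p-value, high-trait sample: {p_high:.2f}")
```

Under these assumptions the expected proportion endorsing the item is about 0.29 in the first sample and about 0.71 in the second, whereas the item's difficulty parameter is identical in both.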

Limitations
Although every effort was taken to ensure the scientific rigor of this systematic review, there were a number of limitations. Information published in languages other than English was not included; therefore, there may be some relevant findings regarding the psychometric properties of measures that were not included in this review. In addition, authors of included studies were not contacted, so some information may have been overlooked. Furthermore, evaluating the quality of criterion validity, cross-cultural validity and responsiveness was outside the scope of this review.

Conclusion
As school connectedness is both a precursor to and an outcome of academic success, it is important to be able to reliably and validly assess students' sense of school connectedness in order to accurately identify and support at-risk students [17,39]. The current systematic review reported on the psychometric properties of 15 self-report school connectedness measures for students aged between 6 and 14 years. The measures with the strongest psychometric properties were the SCM and the 35-item version of the SEI, exploring eight and 12 (of 15) school connectedness components respectively. This systematic review highlighted the need for further research to examine the psychometric properties of existing school connectedness measures that were identified as having moderate to strong positive evidence.

Table 3. Criteria of psychometric quality rating based on Terwee et al. [50] and Schellingerhout et al. (2012).
Psychometric property | Score a | Quality criteria b
Content validity | + | A clear description is provided of the measurement aim, the target population, the concepts that are being measured, and the item selection and target population and (investigators or experts) were involved in item selection
Content validity | ? | A clear description of above-mentioned aspects is lacking or only target population involved or doubtful design or method
Internal consistency | + | [...] * # items and ≥100) AND Cronbach's alpha(s) calculated per dimension and Cronbach's alpha(s) between 0.70 and 0.95
Internal consistency | ? | No factor analysis OR doubtful design or method
Internal consistency | - | Cronbach's alpha(s) <0.70 or >0.95, despite adequate design and method
Measurement error d | + | MIC < SDC OR MIC outside the LOA OR convincing arguments that agreement is acceptable
Measurement error d | ? | Doubtful design or method OR (MIC not defined AND no convincing arguments that agreement is acceptable)
Measurement error d | - | MIC ≥ SDC OR MIC equals or inside LOA, despite adequate design and method
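The Table 3 criterion that Cronbach's alpha(s) fall between 0.70 and 0.95 can be made concrete with a small sketch. It uses the standard formula for Cronbach's alpha; the response data and function names are hypothetical, and the rating helper checks only the alpha range, not the factor analysis or design requirements that the full criteria also include.

```python
from statistics import variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha; item_scores[i][j] is respondent j's score on item i."""
    n_items = len(item_scores)
    if n_items < 2:
        raise ValueError("Cronbach's alpha needs at least two items.")
    # Total score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*item_scores)]
    sum_item_variances = sum(variance(scores) for scores in item_scores)
    return (n_items / (n_items - 1)) * (1 - sum_item_variances / variance(totals))


def rate_alpha(alpha: float) -> str:
    """Apply only the 0.70 to 0.95 alpha range from Table 3 (design checks omitted)."""
    return "+" if 0.70 <= alpha <= 0.95 else "-"


# Hypothetical responses: three items (rows) answered by five students (columns).
responses = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 4],
    [1, 3, 3, 4, 4],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.4f}, rating = {rate_alpha(alpha)}")
```

Note that the upper bound matters as well as the lower one: under these criteria an alpha above 0.95 also receives a negative rating.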

Table 8. Overall quality score of assessments for each psychometric property based on levels of evidence by Schellingerhout et al. [52].
Measure | Internal consistency | Reliability | Measurement error | Content validity | Structural validity | Hypothesis testing
Notes. Levels of Evidence: Strong evidence positive/negative result = Consistent findings in multiple studies of good methodological quality OR in one study of excellent methodological quality; Moderate evidence positive/negative result = Consistent findings in multiple studies of fair methodological quality OR in one study of good methodological quality; Limited evidence positive/negative result = One study of fair methodological quality; Conflicting = Conflicting findings; Indeterminate = only indeterminate measurement property ratings (i.e., score = ? in Table 7); NR = Not reported; Not Evaluated = studies of poor methodological quality according to COSMIN excluded from further analyses.
https://doi.org/10.1371/journal.pone.0203373.t008
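The levels of evidence in these notes amount to a small decision rule, sketched below. This is an illustrative reading of the notes, not the procedure used by the review authors: the function name and input format are hypothetical, and the handling of studies with mixed methodological quality (the best available quality drives the level) is an assumption that the notes do not specify.

```python
QUALITY_RANK = {"fair": 1, "good": 2, "excellent": 3}


def level_of_evidence(studies: list[tuple[str, str]]) -> str:
    """Summarise one measurement property of one measure.

    Each study is (methodological quality, rating), where quality is
    "excellent", "good", "fair" or "poor" and rating is "+", "-" or "?".
    """
    if not studies:
        return "NR"  # property not reported
    # Studies of poor methodological quality are excluded from further analyses.
    usable = [(q, r) for q, r in studies if q != "poor"]
    if not usable:
        return "Not evaluated"
    definite = [(q, r) for q, r in usable if r in ("+", "-")]
    if not definite:
        return "Indeterminate"  # only "?" ratings
    directions = {r for _, r in definite}
    if len(directions) > 1:
        return "Conflicting"
    direction = "positive" if directions == {"+"} else "negative"
    ranks = [QUALITY_RANK[q] for q, _ in definite]
    # Assumption: with mixed-quality studies, the best available quality drives the level.
    n_good = sum(rank >= 2 for rank in ranks)
    if max(ranks) == 3 or n_good >= 2:
        strength = "Strong"
    elif n_good == 1 or len(ranks) >= 2:
        strength = "Moderate"
    else:
        strength = "Limited"
    return f"{strength} {direction}"


# Hypothetical example: two consistent positive findings from studies of good quality.
print(level_of_evidence([("good", "+"), ("good", "+")]))
```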