Measuring child and adolescent well-being in Denmark: Validation and norming of the Danish KIDSCREEN-10 child/adolescent version in a national representative sample of school pupils in grades five through eight

KIDSCREEN-10 is a generic instrument for measuring global health-related quality of life among children and adolescents aged 8 to 18. This study examines the criterion-related construct validity and psychometric properties of the Danish language version of the KIDSCREEN-10 using Rasch models. A further aim was to construct Danish norms based on the resulting person parameter estimates from the Rasch models. The data come from a nationally representative cross-sectional survey of 8171 children in the 5th to 8th grade of primary school in Denmark. No adequate fit to the Rasch model or a graphical loglinear Rasch model could be established for the KIDSCREEN-10 in the full sample of children (n = 8171). Analyses with increasing sample sizes showed that, even with the smallest sample, item 3 (Kid3) of the KIDSCREEN-10 did not fit the Rasch model. After elimination of Kid3, substantial local dependence and differential item functioning relative to gender and grade level were still present. Even with a sample size of 630, fit to the Rasch model, or to a graphical loglinear Rasch model adjusting for local dependence and differential item functioning, could not be established. Therefore, generating Danish norms was not feasible, as this requires valid sum scores and estimates of the person parameters for an adequate number of cases. Thus, the Danish language version of the child/adolescent self-report KIDSCREEN-10 questionnaire cannot be recommended for use in population-level studies. Nor can its use in small samples be recommended, as adjustment for differential item functioning and local dependence is ambiguous.


Introduction
The data for this article originate from the project Children and young people's reading 2021 [Børn og unges læsning 2021], which examines children's reading habits and experiences at four grade levels in Danish schools. It therefore has four target populations: students enrolled in the fifth, sixth, seventh, and eighth grade (Hansen et al., 2022). The overall purpose was to provide generalizable insights into the reading habits of each of the four target populations. Importantly, the study's questionnaire also included the KIDSCREEN-10 items, which makes the data relevant for this article.
Children and young people's reading 2021 used a two-stage random sample design, with a sample of schools drawn as a first stage and one intact class of students selected from each target grade in participating schools as a second stage. Intact classes of students were sampled rather than individuals to minimize disruption to the schools' day-to-day business. For practical reasons, the selection of schools was split into two separate sampling plans. In one sampling plan, schools were selected to participate with a class in the fifth and sixth grade from a list of schools with students enrolled in the two target grades. In the other sampling plan, schools were sampled to participate with a class in the seventh and the eighth grade from a list of schools that had students enrolled in both the seventh and eighth grade.
The data were collected in autumn 2021, from 15 September to 17 November, using online surveys that children answered in a classroom setting. The data collection resulted in completed responses from 2119 children in the fifth grade (from 115 schools), 2118 children in the sixth grade (114 schools), 1976 children in the seventh grade (105 schools), and 1958 children in the eighth grade (106 schools).

Target population
Children and young people's reading 2021 had four target populations, i.e. children enrolled in the fifth, sixth, seventh, and eighth grade. In the Danish school system, compulsory education consists of ten years of primary and lower secondary education, including one pre-school year (ISCED level 0) and years 1-9 (grades). Children normally start in pre-school the year they turn six, and they enroll in the first grade the following year (i.e. the first year of ISCED level 1). Consequently, when children are in the fifth grade, they are in their fifth year of schooling and most of them are in the age range of 10 to 11.
S1-1
All students enrolled in a target grade were part of their respective target population, regardless of their age. As students were sampled in two stages, first by random selection of schools and then by random selection of a class from participating schools, it was also necessary to identify and define eligible schools. Essentially, all schools were eligible for the study if they had students enrolled in the respective target grade, regardless of the type of school.

Sampling frame
The project used two sampling frames: one with all schools that had students enrolled in the fifth and the sixth grade (N = 1599), and another with schools that had students enrolled in the seventh and the eighth grade (N = 1319).
The two sampling frames were constructed based on data from the Danish Departmental Agency for IT and Learning [STIL; Styrelsen for IT og Læring] in spring 2020 (Pettersson et al., 2022). STIL is an agency supporting the Department of Education in Denmark, and it keeps an updated database of all schools in Denmark.
Each sampling frame was structured as a spreadsheet containing a single row per school. Every row contained important information such as a unique identification number, contact information, the values of the stratification variables for the school (i.e. school type), and the school measure of size. For a full overview, both sampling frames included the following information:
As Children and young people's reading 2021 was designed to describe and summarize reading habits in each of the four target grades, it was important that the sampling frames gave comprehensive coverage of each national target population. However, as is often the case with sampling in educational studies (e.g. LaRoche et al., 2017, p. 3.6), the project chose to exclude a subset of students from the target populations because of social and operational factors (Pettersson et al., 2022). Students were excluded based on the following criteria:
School Level Exclusions
- Very small schools (fewer than four students in a relevant target grade)
- Schools only providing instruction to students in the student-level exclusion categories listed below (i.e. special needs students)
Student Level Exclusions
- Students with functional disabilities: students who had physical disabilities that meant they could not answer the survey
- Students with intellectual disabilities: students who were evaluated by teachers to have intellectual disabilities beyond poor academic performance. This category also covered children who were either mentally or emotionally unable to follow the general instruction. However, schools were informed that they were not allowed to exclude children based solely on poor academic performance
- Non-native language speakers: students who typically had received less than one year of instruction in the native language at the time of the study
- Parents rejected participation: students whose parents refused their children's participation in the study
In addition, a number of schools were excluded because of the sampling design. Schools were sampled with an overlap in each sampling plan, i.e. schools were sampled to participate with students in two target grades (either 5th and 6th or 7th and 8th grade), thereby making it necessary for schools to have students enrolled in both grades to qualify. For instance, if a school for some reason had students in the fifth grade, but not in the sixth, the school was excluded from the sampling frame. In this example, the exclusion would lower the population coverage of students in the fifth grade.

Sampling design
The basic sample design is a stratified two-stage cluster sample design, with a sample of schools drawn as a first stage and one intact class of students selected from each target grade as a second stage. The selection of schools was split into two independent sampling plans. In one sampling plan, schools were selected to participate with a class in the fifth and sixth grade from a list of schools with students enrolled in the two target grades. In the other sampling plan, schools were sampled to participate with a class in the seventh and the eighth grade from a list of schools with students enrolled in both the seventh and eighth grade. Sampling was divided into two sampling plans instead of one because it would be resource-intensive for schools to participate with a class in four different grades. Splitting the four grade levels into two sampling plans had an additional benefit, because a substantial number of schools in Denmark cover the 5th and the 6th grade (also known as "middle school", mellemtrinnet in Danish) but not the 7th and 8th grade (equivalent to the first years of "lower secondary", udskolingen in Danish). Thus, having two sampling plans instead of one increased the coverage of each target population.

Stratified two-stage cluster design
The stratified two-stage cluster sample design mirrored the Danish sampling design in PIRLS 2016 (Mejding et al., 2017). Both sampling plans were executed as follows (Pettersson et al., 2022).
First stage: In the first sampling stage, schools were selected based on a sampling frame. Before selection, schools were divided into two groups based on school type (i.e. public or private) using explicit stratification. Within each explicit stratum, the schools were then selected using systematic random sampling with probabilities proportional to size (PPS). The PPS technique gives larger schools, i.e. those with more students, a higher chance of being selected than smaller schools. This difference in selection probabilities between larger and smaller schools is counterbalanced in the second sampling stage, where a fixed number of classes are selected with equal probability from each sampled school. In addition, two replacement schools were selected for each sampled school. If a sampled school refused to participate, replacement schools were used to limit the loss in sample size.
Second stage: Within each participating school, one intact class of students was selected from each target grade. Every sampled school provided a list of classes in the target grades that it was chosen to participate with. For instance, schools sampled to participate with a class in the fifth and the sixth grade provided two lists: one with all classes in the fifth grade and another with all classes in the sixth grade. Using these lists, a class from each grade was selected using simple random sampling.
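The two stages described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's actual procedure: the school frame, the function names and the random start are hypothetical, and in the real design the first stage would be run separately within each explicit stratum (public and private schools).

```python
import random

def pps_systematic_sample(sizes, n, u):
    """First stage: pick n units by systematic PPS sampling.

    sizes: measure of size per unit, in sampling-frame order.
    u: a uniform random number in [0, 1) that fixes the random start.
    """
    interval = sum(sizes) / n  # sampling interval on the cumulative size scale
    points = [(u + k) * interval for k in range(n)]
    selected, cumulative, i = [], 0.0, 0
    for p in points:
        # Move forward until unit i's cumulative size range contains the point.
        while i < len(sizes) - 1 and cumulative + sizes[i] <= p:
            cumulative += sizes[i]
            i += 1
        selected.append(i)  # a unit larger than the interval can be hit twice
    return selected

def sample_class(class_list, rng):
    """Second stage: one intact class by simple random sampling."""
    return rng.choice(class_list)

# Hypothetical frame: (school id, enrolment in the target grade, class list).
frame = [("A", 10, ["5a"]), ("B", 20, ["5a", "5b"]),
         ("C", 30, ["5a", "5b"]), ("D", 40, ["5a", "5b", "5c"])]
rng = random.Random(2021)
schools = pps_systematic_sample([size for _, size, _ in frame], n=2, u=0.5)
classes = [(frame[i][0], sample_class(frame[i][2], rng)) for i in schools]
```

Because the selection points are spread evenly over the cumulative enrolment, a school's chance of being hit is proportional to its size, which is the property the text attributes to PPS.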

Considerations about sample sizes
Children and young people's reading 2021 aimed to achieve reasonably small standard errors for survey estimates. The student sample for each grade level should ensure a level of precision defined by confidence intervals of ±3.5 percentage points for percentages (Pettersson et al., 2022). To meet this precision requirement, Children and young people's reading 2021 required an effective sample of 800 students in each target grade, i.e. the sample size needed if students were selected by simple random sampling.
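As a rough check of how these two figures relate, the sketch below applies the standard margin-of-error formula for a proportion under simple random sampling. The 95% confidence level (z = 1.96) and the worst-case proportion p = 0.5 are assumptions made here for illustration; the source does not state which values the project used.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(p, margin, z=1.96):
    """Sample size (not rounded) needed to reach a given margin of error."""
    return z ** 2 * p * (1 - p) / margin ** 2

# Assumed inputs: worst-case proportion and a 95% confidence level.
moe_800 = margin_of_error(0.5, 800)          # just under 0.035
n_for_35 = required_sample_size(0.5, 0.035)  # roughly 784
```

Under these assumptions, an effective sample of 800 gives a margin slightly below ±3.5 percentage points, which is consistent with the stated requirement.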
However, when cluster sampling is used, it is necessary to take account of the effect of intra-cluster correlation (ICC, i.e. the strength of correlation within clusters) on sample size calculations. This effect is known as the design effect (DEFF), a correction factor used to adjust the required sample size for cluster sampling. To obtain an estimate of the design effect, the project used Danish data from PIRLS 2016, which had a similar stratified two-stage cluster design, and calculated the design effect for one of this project's main outcomes, reading enjoyment (Pettersson et al., 2022). Subsequently, the design effect was used to estimate the number of schools and students needed, although the design effect naturally varies across outcomes depending on the ICC for each outcome.
Students completed the online survey in a classroom setting. The survey was made available through personalized links that were sent to students' emails through the survey tool SurveyXact. In each school, a school coordinator (e.g., a teacher or a secretary) was present during the survey response session. School coordinators were briefed about the response session by the project team. Their responsibilities included introducing the survey to the children, logging in, leading students through the survey, and making sure that students who fell ill were given a later chance to participate.
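The design-effect adjustment described in this section can be illustrated with the common textbook approximation DEFF = 1 + (m - 1) x ICC, where m is the number of students per sampled cluster. This is a sketch only: the project estimated its design effect empirically from PIRLS 2016 data, and the cluster size and ICC below are hypothetical values chosen for illustration, not figures from the study.

```python
import math

def design_effect(cluster_size, icc):
    """Textbook approximation of the design effect for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def cluster_sample_requirement(effective_n, cluster_size, icc):
    """Inflate an effective (simple random) sample size for cluster sampling."""
    n_required = effective_n * design_effect(cluster_size, icc)
    n_clusters = math.ceil(n_required / cluster_size)
    return n_required, n_clusters

# Hypothetical inputs: 17 responding students per class and an ICC of 0.125.
deff = design_effect(17, 0.125)
students, schools_needed = cluster_sample_requirement(800, 17, 0.125)
```

The sketch shows why an effective sample of 800 translates into a considerably larger student sample once students are sampled as whole classes, and why the number of schools follows from the assumed class size.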

S1-3
To ensure representativeness in a study, minimizing non-response bias is essential.
S1-4 Note. Numbers show participating entities, with the participation rate in parentheses.
S1-5 Table presents the demographic composition of the total sample and population. The sample consisted of an equal split between boys and girls, as did the population. The distribution across grades was roughly divided into four equally sized groups in both the sample and the population. Public school attendance accounted for 78.8% of the sampled children, slightly lower than the 80.2% in the total population. Sample proportions across small, medium, and large schools were similar to those found in the overall population (33.9% vs 33.4%, 35.0% vs 33.4%, and 31.0% vs 33.2%). In terms of home language, Danish was predominant, spoken by 75% of the sampled students; unfortunately, this information is not available for the population.
S1-5 Notes. Information about the population is based on data about all Danish students from the 5th to the 8th grade in the school year 2020/2021. It was collected from the Danish educational data warehouse [Uddannelsesstatistik]. Categories of school size are defined based on population terciles.
In the following, we present descriptive statistics on the student responses in terms of response time and dates conditional on grade.
S1-6
Four tables are supplied in this section, i.e. one for each target grade. Each table presents a comparison between participating students (sample) and the population on the previously mentioned characteristics. The first column (1) shows the distribution in the sample calculated with design-weighted data (see footnote 7). Next follows the distribution in the population (column 2). Column (3) shows the mean difference between these two groups, and column (4) the standard error of the difference. Standard errors are derived using the Jackknife repeated replication method for variance estimation (Foy, 2013; see footnote 8).
7 The project Children and young people's reading 2021 calculated sampling weights for each grade in order to make accurate estimates of population traits for each of the four target groups. Under ideal conditions with a stratified PPS two-stage cluster sample, that is, with no non-response and no disproportional sampling of subpopulations, this design would lead to "self-weighted" samples where all units sampled in the last stage of sampling have similar selection probabilities (LaRoche et al., 2017). In reality, however, and also in this data collection, there was some non-response. Consequently, sampling weights were calculated to account for varying selection probabilities.
Sampling weights were computed separately for each population, corresponding to the target grade that each sample was designed to represent. They were calculated using the same procedure as in PIRLS 2016 (LaRoche et al., 2017, pp. 3.18-3.22) because the sampling design mirrored the stratified two-stage cluster design used in PIRLS 2016.
The final student sampling weight accounted for selection at three levels: school, class, and student. Each level had a base weight, i.e. the inverse of the selection probability, and an adjustment for nonparticipation. Additional information on calculating sampling weights is available upon request from the corresponding authors.
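The structure of such a weight can be sketched as a product of three components, each a base weight times a nonparticipation adjustment. This is a simplified illustration with hypothetical numbers and function names; the actual PIRLS procedure computes the adjustments within strata, schools and classes, and the form of the adjustment used here (eligible units divided by participating units) is an assumption.

```python
def adjusted_weight(selection_probability, n_eligible, n_participating):
    """Base weight (inverse selection probability) times a nonparticipation adjustment."""
    base_weight = 1.0 / selection_probability
    adjustment = n_eligible / n_participating
    return base_weight * adjustment

def final_student_weight(school, school_class, student):
    """Each argument is a tuple: (selection probability, eligible, participating)."""
    return (adjusted_weight(*school)
            * adjusted_weight(*school_class)
            * adjusted_weight(*student))

# Hypothetical example values, not taken from the study:
# school selected with probability 0.1, 80 of 100 sampled schools participated;
# one of two classes selected, the class participated;
# intact class (probability 1), 20 of 25 students responded.
weight = final_student_weight((0.1, 100, 80), (0.5, 1, 1), (1.0, 25, 20))
```

In this example each responding student would stand for a little over 31 students in the target population.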
8 To estimate the sampling variance, we used the Jackknife Repeated Replication (JRR) method for variance estimation. It is computationally straightforward, provides unbiased estimates of the sampling variance, and appears to be the most-used approach in educational surveys with a two-stage cluster sample design (Foy & LaRoche, 2017; Schulz, 2020).
In essence, the JRR technique works by grouping the PSUs (schools) in pairs into Jackknife sampling zones in accordance with the stratification in the sample design. Repeated samples are then drawn from each sampling zone, compared in terms of variability against the original full sample, and used to obtain an estimate of the sampling variance. The JRR technique applied followed the procedure used in PIRLS 2011 (Foy, 2013).
In a first step, schools were paired into sampling zones. The sampling zones were constructed within each explicit stratum, and we paired schools adjacent to each other in the sorted sample frame into one sampling zone. If an explicit stratum had an odd number of schools, the students in the remaining school were randomly divided into two quasi-schools, thus forming an extra sampling zone. We constructed 58 sampling zones in the sample for students in the 5th grade (among 115 schools), 58 sampling zones in the sample for the 6th grade (114 schools), 53 sampling zones in the sample for the 7th grade (105 schools), and 54 sampling zones in the sample for the 8th grade (106 schools).
In the next step, we assigned a multiplication value to each school. Within each of the sampling zones, we randomly assigned one school a multiplication value of 2 and the other school a value of 0. For each of the sampling zones, we computed a replicate weight. In short, this means that one of the paired schools had a contribution of 0, the second a double contribution, and all other schools remained the same.
The replicate weight was derived by multiplying the final student weight by the jackknife multiplication value for the respective sampling zone while keeping student weights for all other zones unchanged. Thus, for instance, in the sample for students in the 5th grade, we ended up with 58 subsamples that generated 58 unique replicate sampling weights. The replicate sampling weights were then used to estimate the statistic of interest 58 times. The variation across these jackknife estimates provided an estimate of the sampling variance. For a more comprehensive description of the JRR technique applied, please see Foy (2013).
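The replicate-weight procedure described above can be sketched as follows. This is a minimal illustration, assuming the statistic of interest is a weighted mean and that the sampling variance is the sum over zones of the squared differences between each replicate estimate and the full-sample estimate, as in the PIRLS documentation the text cites; the data and function names are hypothetical.

```python
import math

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def jrr_variance(values, weights, zones, multipliers):
    """Jackknife repeated replication for a weighted mean.

    zones[i]: sampling zone of student i's school.
    multipliers[i]: 0 or 2, the value assigned to student i's school within its zone.
    """
    full_estimate = weighted_mean(values, weights)
    variance = 0.0
    for zone in sorted(set(zones)):
        # Replicate weight: rescale students in this zone, leave all others unchanged.
        replicate = [w * m if z == zone else w
                     for w, z, m in zip(weights, zones, multipliers)]
        variance += (weighted_mean(values, replicate) - full_estimate) ** 2
    return full_estimate, variance

# Hypothetical data: four students, one per school, two sampling zones.
values = [1.0, 3.0, 2.0, 4.0]
weights = [1.0, 1.0, 1.0, 1.0]
zones = [1, 1, 2, 2]
multipliers = [2, 0, 0, 2]
estimate, variance = jrr_variance(values, weights, zones, multipliers)
standard_error = math.sqrt(variance)
```

With 58 sampling zones, as in the 5th grade sample, the loop would run 58 times and sum 58 squared differences.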
Table provides details about the four target populations. Information about the population size is based on data about all Danish schools from the school year 2018/2019. It was received from the Danish Departmental Agency for IT and Learning [STIL; Styrelsen for IT og Læring] (Pettersson et al., 2022).
Table displays the calculation of the sample size requirement, and it includes an adjustment for expected non-response at both the school and student level. It shows that precision requirements would be met with a school sample of 120 schools and a student sample of 1920 students. However, when including expected school (and student) non-response, at least 150 schools had to be sampled.
1. stage: Invitation and recruitment of schools to participate. Schools were invited by email. If schools did not respond, the headmaster was contacted by telephone. Recruitment of schools to participate occurred in spring 2021.
2. stage: Collecting information on respondents from school administrations, i.e. name, gender, birth date, and email. Schools were obliged to send administrative information on participating children. The collection of information happened in August and September 2021.
3. stage: Collecting survey responses. Responses were collected between September 15th and November 17th, 2021.
As shown in S1-4 Table, the study had approximately 2000 students participating at each grade level and achieved noteworthy participation rates: 70-75% for schools and around 85% for students in each target grade. Altogether, the data collection resulted in a total sample of 8171 students from the 5th to 8th grade, participating from 199 unique schools.
Table. Characteristics of total sample of students in 5th to 8th grade
Table shows the number of completed responses, mean response time, and median response time by target grade. Fig S1-1 shows the number of responses conditional on response date, while Fig S1-2 presents the distribution of survey response time (in minutes), with the red dotted line highlighting the mean response time.
Comparing student samples and target population in each grade
This section uses the information available about the sample and the population to explore whether the sample of each target grade was demographically similar to the population. All variables used here (gender, school type and school size) are based on administrative information received either from schools or from the Danish educational data warehouse [Uddannelsesstatistik] (school year 2020/2021), available from the Danish Ministry of Children and Education at https://uddannelsesstatistik.dk/Pages/main.aspx.