Segregation within school classes: Detecting social clustering in choice data

We suggest a new method for detecting patterns of social clustering based on choice data. The method compares similar subjects within and between cohorts and thereby allows us to isolate the effect of peer influence from that of exogenous factors. Using this method on Norwegian register data, we address the question of whether students tend to cluster socially based on similar background. We find that common background correlates with making the same choices of curricular tracks, and that both exogenous preferences and peer influence matter. This applies to immigrant students from the same country, and, to some extent, to descendants of immigrants, but not to students from culturally similar countries. There are also small effects related to parents’ education and income.


Introduction
With an increasing availability of large-scale data documenting people's choices and behaviour, observations of people's actual choices have become more accessible in a variety of situations. What these data typically do not measure directly, however, is information on the mechanisms behind these choices, such as the degree to which people interact and influence each other. We will here present a method for inferring propensities for peer influence between people based on similarity of choices, drawing inferences from administrative registry data from upper secondary school on students' educational choices. We want to study so-called peer effects and school segregation at the micro-level; more specifically, we want to see if students of the same immigrant background within the same school and cohort influence each other's choices of curricular tracks. The purpose is two-fold: to illustrate that it is indeed possible to trace out meaningful patterns of interaction in static data such as administrative registers, by suggesting a specific method, and to use that method to actually detect such patterns and address a substantial sociological issue. We will start by describing this issue and how it can be addressed using the presented method, before moving on to the details of the method.

School segregation
Ethnic school composition is usually measured as the concentration of immigrant peers at school level [1,2], or within educational cohorts within schools [3][4][5][6][7][8], or recently at several levels simultaneously, by including schools nested within school districts [9]. For immigrant students, ethnic school segregation means less exposure to the receiving society and fewer opportunities of learning the language [10]. These, and other, studies of segregation effects provide indirect support for the existence of peer influence, yet they lack data on social networks and what actually happens within the classrooms and in the school yard. Thus, the literature on school segregation has often assumed that social interaction takes place, yet, until recently, there were few quantitative studies on how students actually interact in school (see for instance [11]). Because direct information on social interactions will often not be available, and because registry and similar databases provide us with rich information contained within representative and reliable data on choices for a large number of cohorts and geographical areas, we aim at developing a method that allows us to infer clustering patterns at an aggregate level based on other variables, such as social background characteristics, using such data. Here, we use this method to assess the importance of (mainly) ethnic peer influence on the choice of curricular tracks in upper secondary schools. The article employs Norwegian administrative register data with information on all upper secondary school students in five major cities over the years 2006-2011. Specifically, we will explore similarities in first year students' choice of curricular tracks for their second and third year in upper secondary school. As we will explain, this is a choice that does not only have an educational impact, but also determines which students will remain classmates; it is thus also, to some degree, a choice of friends.
By studying students' choices we aim at identifying peer effects between individuals who share the same socially significant attribute [12]. We focus mainly on ethnicity, but gender, age group and social background are other relevant examples. Most previous work has pointed to the fact that racial or ethnic homophily is one of the most important factors in friendship formation [12,13], and one study suggests that segregation primarily takes place among students at the same grade level [14]. There are, however, other studies documenting that cross-ethnic friendships are also frequent [15].
Our main research question is whether we may infer peer influence on students' choice of curriculum, based on the actual choices that we can observe. More specifically, we are interested in whether being an immigrant in general, immigrating from or having parents from a specific country, as well as originating in culturally similar countries is associated with peer influence on the choice of curricular specialisation in Norwegian upper secondary school. We suggest a novel method that can address questions of this kind and the associated methodological challenges. We build our argument on the following logic:
1. Students have educational preferences before they start in upper secondary school. These preferences vary to some degree by students' own characteristics, such as their immigrant status, gender and social background. Preferences are also likely to be affected by other factors, such as previous teachers, older students and other role models, and peer effects within lower secondary school and neighbourhoods. By studying curricular choices made by students in different educational cohorts at upper secondary school, we will document the strength of these exogenous preferences (as revealed in students' educational choices).
2. Our main goal, however, is to study choices induced by peer influence based on common social background characteristics, which can be considered a form of endogenous rather than exogenous preference formation. Students interact at school during their first year, and they may thereby influence each other's educational choices. These social dynamics, revealed through patterns of social interactions, are likely related to students' immigrant status, gender, social background, and other characteristics. Our main focus here is on immigrant status.
3. The challenge, then, is to differentiate between exogenous preferences and peer influence. We do so by comparing the two; that is, to assess the strength of peer influence during the first year at school we compare the educational choices within cohorts with the educational choices between cohorts. If the difference between the two is noticeable (as determined by a significance test), we interpret this difference as an outcome of endogenous preference formation due to social interaction at school.
Using this method, we will first explore potential peer influence effects related to immigrant background; second, we will see if these effects persist among descendants of immigrants, and, if so, if they apply to country of (parents') origin, or to some higher level of aggregation, such as cultural distance. Third, we additionally address other socio-economic factors, such as gender and parents' education and income.

Causes of segregation in school
There are mainly three reasons why we should expect to find evidence of social influence among students. First, individual opportunities and choices are often affected by social ties [16,17]. In addition, individual choices affect others' opportunities, as illustrated in Schelling's [18] model on neighbourhood segregation; see also examples by [19][20][21]. Our study is limited to exploring the first part of these social dynamics. Given the importance of social ties, the vital question would then be: with whom do we socialise?
Friendship formation is usually based on the well-documented preference for similarity in social relations [22][23][24][25]. At the group-level, it has been shown that status equality is essential for positive inter-group relations [26][27][28]. Associated with this argument is the notion of social distance, that is, the perceived affinity between people. When making choices, people prefer to conform to what other people with small social distance do [17]. In line with this, previous studies have shown that cross-ethnic ties are less stable [12,29], and emotional support is more common in intra-ethnic ties [30]. We would thus expect homophilic dyadic interactions and friendship networks among students with similar characteristics, such as ethnic origin, and we would expect these interactions to bear some influence on students' choices of educational specialisation [31] (see also [32]).
Second, homophily in friendship formation as well as perceived ingroup belonging may also be related to language. Language acquisition is closely related to social identification [33], and at school social ties are likely to be formed among immigrant students with similar language. Two German studies illustrate the crucial role of language [34,35] in friendship formation, arguing that immigrants using the new language were more likely to befriend also native students (see also [36]). Thus, ethnic ingroup peer interaction as well as language proficiency are often strong predictors of ethnic identity [36], and immigrants of similar ethnic origin might prefer to stay together in class because they share a common ethnic identity. Theoretical modelling (e.g. [37]) has also supported the argument of ingroup preferences as a result of common references. Social interaction theory would thus lead us to expect social clustering of immigrant students from the same country of origin, and, from the same logic, we would expect less social interaction between descendants of immigrants, who, fluent in Norwegian, would be more likely to associate also with natives.
Third, immigrant students, including descendants of immigrants, may experience harassment, racism, and discrimination at school, which might contribute to ethnic friendship homophily. A number of studies in the US have addressed peer interactions among blacks and whites in school, showing that students from underprivileged backgrounds may feel less at a disadvantage when they are together with similar peers [38][39][40][41][42][43][44][45][46][47][48].
However, one study of ethnic stereotypes has shown a hierarchy, with "Norwegians and Swedes at the top, followed by Poles, and then Pakistani and Iraqi (Muslim) immigrants, with Somali immigrants and Roma people at the bottom" [49]. This ethnic hierarchy corresponds with cultural and geographical distance from Norway, and if students apply this ethnic 'ranking' on each other we might expect cultural distance to matter for their definitions of ingroups and outgroups.

Social ties in school
During the last 15 years, and especially since the release of the CILS4EU dataset [50], a growing number of studies have addressed ethnically contingent social ties in European schools. When students are asked to list their friends, students of the same ethnicity are more likely to be on that list [30,51,52], and they also tend to be similar in cultural and socioeconomic characteristics [53]. Meanwhile, other data, collected in London, has suggested cross-ethnic friendship to be frequent and of high quality [15].
Here, we have a different approach, detecting consequences of intra-ethnic interaction and choice similarity that are revealed indirectly. We thereby utilise another type of measurement, complementing previous studies on self-declared ties. We thus aim to make both a substantial and methodological contribution to research on patterns of social exchange and their effects. Given that collecting data on stated preferences is costly, we often only have access to behavioural data, and the proposed method can make use of such already available data. Also, our results complement those derived from CILS4EU and other datasets, since we here detect preferences based on students' actual behaviour, and thus avoid discussions about potential misreporting, boundary definitions and wishful thinking. Finally, when studying actual behaviour, the focus is on the consequences of social interaction (in this case educational choice), which is often what the investigator is mainly interested in.

The present study
In the Norwegian school system, students in their last year of lower secondary school (at the age of 15) choose a study programme at upper secondary school for the next three years, either academic or vocational programmes. After the first year, they choose curricular specialisation tracks within their programme for years two and three. For example, a student who has chosen a general academic programme can specialise in natural sciences, social sciences or languages for years two and three. In general, the students on a specialisation track are a subset of the students on the whole programme. Typically, school classes, with a maximum of around 30 students, are formed within the specialisation tracks (with one or more school classes for each specialisation track dependent on the size of the student body). This means that students' choice of specialisation track not only determines what subjects they will study, but also which of their classmates in the first year will most likely remain their classmates in the second or third year.
In the first year, students within the same class typically also engage in common activities such as eating, breaks, sports etc., which makes it very hard not to interact. Thus, students may coordinate their choices with their friends in order to remain classmates. Choices of specialisation tracks for the second and third year at school should therefore correlate with social ties within the class in the first year. We will distinguish between two mechanisms: individual-based choices related to common exogenous preferences on the one hand, and, on the other hand, choices related to peer influence (endogenous educational preferences and/or a preference for being together in class).
Using our data, we cannot differentiate between educational preference formation due to peer exposure in class and students making educational choices to maintain their social ties. There are recent methods to separate these mechanisms by means of a stochastic agent-based modelling approach [54,55], but this approach requires rich and detailed longitudinal data on social ties. Whether choices result from maintaining social ties or from being influenced by the same peers, however, both are still effects from social clustering. Also, for our data, we do not have the problem of cause and effect: our observation is a choice that was made after a year of social exposure to peers.
In what follows, we will present the suggested method for inferring increased propensities for social clustering based on educational choices. We will then describe the data in further detail. The Results section first focuses on choices with respect to common country of origin (for both immigrants and descendants), robustness checks, and a discussion about how we should interpret effect sizes. We then proceed with broader measures than country of origin (e.g., through cultural distances), and also examine other demographic and socio-economic variables. Finally, we discuss the significance of the results and the potential use of the suggested method for further studies.

Methods
We develop a three-step analysis to distinguish between the two mechanisms of exogenous educational preferences and choices related to endogenous (social and educational) preference formation that results from social interaction in class. We first explore similarity in first-year students' choice of curricular tracks (within classes/educational cohorts). We expect these choices to be affected by their common exogenous preferences as well as endogenous effects. Second, we explore similarity in choices made by students that were not classmates (between classes/educational cohorts). This gives us a measure of common exogenous preferences between students making their choices in different years. Comparing choices within and between school classes, we aim at isolating choices related to within-class peer influence from choices related to exogenous educational preferences.
Our research question and data pose methodological challenges. For example, most linear and discrete choice models study choices in social isolation. There are relatively recent attempts at incorporating the impact of social networks and peer influence (e.g. [56]), but generally these assume a known network structure and include a measure of overall, instead of dyadic, peer effects (e.g., here it could be the classroom composition) on agents' preferences or educational outcomes (for an overview, see [57]). Here, however, we are studying choices that are made simultaneously and the extent to which pairs of students make the same choices. Our unit of analysis is thus the dyad, which is, in turn, nested in a network structure.
We know of no existing alternative method that could directly address simultaneous dyadic choices, controlling for structural dependencies (and exogenous effects). For example, in a McFadden discrete choice model, the peer effects terms would end up being infinitely recursive (see also [58,59]). Therefore, we suggest a combination of statistical methods, where, as will be described below, we study correlation coefficients between the existence of links (or ties/edges) in graphs constructed with respect to the input and output variable, respectively.
The method has been implemented in R, and the code is available on GitHub [60].

Data requirements
We will illustrate the method using the specific case of choices of curricular tracks in school, but let us first, and more abstractly, provide the general assumptions for the kind of the data that can be used. The requirements increase the more that needs to be controlled for.
We are assuming data for a number of individuals where two variables are to be compared for whether similarity between two individuals in one of the variables is associated with similarity in the other, meaning that dyads are the units of analysis. More formally, the data could be construed as two two-mode matrices $A$ and $B$, where the rows of the two matrices represent the same individuals and the columns the two variables, respectively. The aim is to compare the two one-mode matrices $AA^T$ and $BB^T$. We can control for structural dependencies using conventional methods (Quadratic Assignment Procedure, QAP, described below).
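As a minimal sketch of this data layout, the one-mode matrices can be obtained by multiplying each two-mode matrix with its transpose. The one-hot coding (one column per category) and the toy values below are assumptions made for illustration only; they are not taken from the register data.

```python
import numpy as np

# Hypothetical toy data: 4 individuals, one trait with 2 categories (one-hot).
A = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [1, 0]])

# The same 4 individuals, one choice variable with 3 categories (one-hot).
B = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 1, 0],
              [1, 0, 0]])

# One-mode projections: entry (i, j) is 1 when individuals i and j
# fall in the same category, and 0 otherwise.
same_trait = A @ A.T
same_choice = B @ B.T

print(same_trait)
print(same_choice)
```

With one-hot rows, each product is a symmetric 0/1 matrix with ones on the diagonal, so only the off-diagonal entries carry information about dyads.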
If data are grouped by a third variable, such that only individuals in the same group are to be compared with each other, then a summary measure can be computed using our combined meta-analytic QAP approach below.
If data are further grouped by a fourth variable, such that the groups of the third variable are subsets, then our method provides a way to measure the net effect of common membership (for pairs of individuals) in the subgroup defined by the third variable as compared to the group defined by the fourth. For a meaningful interpretation, all the subsets should be defined by common properties. The implications of demonstrating a significant such net effect is that properties that are exclusive to the subgroups as compared to the groups make individuals that are more similar along the first dimension more similar along the second.

Measuring the effect size
The units of analysis are dyads, and each student in a class is paired once with each other student. This means that we have one matrix of the independent and one of the dependent variable, representing similarity between the students in each dyad. The matrices have the properties of adjacency matrices, and can each equivalently be represented by a graph: one graph where nodes represent students and links are drawn between students with the same characteristics, and another graph where the links represent similar educational choices. (Note that these are not social ties, but links indicating similarity between the nodes in a variable. Even though these are not necessarily social networks, they still have the mathematical properties of networks, and conventional methods apply.) Constructing these two graphs for all students in all classes, our research question is then whether the two types of graphs are correlated on the aggregate level. See Fig 3 for an example of what the two types of graphs can look like.
More formally, we have individuals $i \in \{1, 2, \ldots, n\}$, each with an individual trait and a choice outcome. Let $v_i$ be the individual trait, say, country of origin for individual $i$, and $V = \{v_1, v_2, \ldots, v_n\}$ be the countries of origin for all $n$ immigrant students in a school class. From this given data, we construct an adjacency matrix $A$ such that $A_{ij} = 1$ if $v_i = v_j$ and $A_{ij} = 0$ otherwise, that is, $A$ indicates whether students $i$ and $j$ are from the same country. In this example, $A$ is a matrix of binary variables, but it can also be generalised to a distance matrix, containing scale variables, where, for example, $A_{ij}$ is a normed difference between $v_i$ and $v_j$. For the same individuals, we let $w_i$ be the choice variable, say, choice of specialisation within a study programme, with $W = \{w_1, w_2, \ldots, w_n\}$. From this given data, we construct the corresponding adjacency matrix $B$, that is, $B_{ij} = 1$ if $w_i = w_j$ (students $i$ and $j$ made the same choice), and $B_{ij} = 0$ otherwise. The effect measure is the (element-wise) correlation $\rho_{AB}$ between $A$ and $B$.
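The construction of the two adjacency matrices and their element-wise correlation can be sketched as follows. This is an illustrative sketch, not the authors' R implementation: the function names and toy data are hypothetical, and the choice to correlate only the off-diagonal entries (each unordered pair once) is an assumption about how self-pairs are treated.

```python
import numpy as np

def same_value_matrix(values):
    """Return M with M[i, j] = 1 if values[i] == values[j], else 0."""
    v = np.asarray(values)
    return (v[:, None] == v[None, :]).astype(int)

def dyad_correlation(A, B):
    """Pearson correlation between A and B over all unordered pairs i < j."""
    iu = np.triu_indices_from(A, k=1)  # upper triangle, diagonal excluded
    a, b = A[iu], B[iu]
    if a.std() == 0 or b.std() == 0:
        return float("nan")  # undefined if all dyads are alike in A or in B
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical class of five students (country codes and tracks are made up).
origin = ["PK", "PK", "SO", "SO", "IQ"]
track = ["sci", "sci", "lang", "lang", "sci"]

A = same_value_matrix(origin)   # same country of origin
B = same_value_matrix(track)    # same specialisation choice
rho = dyad_correlation(A, B)
print(round(rho, 3))
```

The guard for zero variance mirrors a practical constraint on the data: a class contributes a coefficient only if it contains at least one pair that shares the trait and one pair that does not.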
Graph correlation coefficients can tell us whether having the same background, say, is associated with making the same choice. However, we are interested in whether such an outcome is caused by social exposure, or if it can be explained by higher propensities for students of given demographic or socio-economic characteristics to make certain choices. Another possible covariate is that going to certain schools may increase the probability of certain combinations, for example because a school may have profiled itself in a specific specialisation. In larger cities, schools are segregated by ethnicity and socio-economic status. These biases in school choice could affect the result. Thus, we need to control for both common exogenous educational preferences, related to demographic or socio-economic characteristics, and school-specific conditions.
The impact of both of these factors can be tested by comparing students not to their classmates, but to other students in the same programme and school, but who started in a different year. Instead of constructing one graph for each class (defined by school, programme and year), we have constructed a graph including all years (defined only by school and programme), with possible links between students only if they are not in the same class (so dyads of students from the same year are not included). This design allows for exogenous preferences and school-specific conditions to give correlations between the demographic or socio-economic variable and choice graph, while excluding within-class peer influence and choices based on retaining friends as classmates.
In more formal notation, we build adjacency matrices $A$ and $B$ over all cohorts $\{C_1, \ldots, C_m\}$ from the same school and programme, where all individuals belong to exactly one cohort. Values $A_{ij}$ and $B_{ij}$ are assigned as above, but only for pairs $i \in C_k$, $j \in C_l$ with $k \neq l$, so that dyads of students from the same cohort are excluded.
We thus perform a three-step analysis. First, we compute a within-class correlation coefficient, where students are compared pairwise to their classmates, with links between those in the same class with the same characteristics in the first graph, and those making the same choice in the second. This provides a measure of both endogenous and exogenous effects. Next, we compute a between-class correlation coefficient, where students are compared pairwise to everyone on the same programme in the same school, but in different cohorts (e.g., students starting in 2006 are compared to students starting in 2007-2011), thus measuring exogenous effects. Finally, we compare these two coefficients and measure the excess effect of demographic variables. To the extent that the exogenous effects are indeed of the same size in the first and second step, the excess effect isolates the endogenous effects, which we argue are effects of social exposure, which should be mainly driven by social ties (preferences for remaining together) and social influence from selected peers.
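The first two steps can be illustrated with a small sketch that computes a within-class coefficient for one cohort and a between-class coefficient over dyads spanning different cohorts of the same school and programme. The function name, the toy data and the restriction to upper-triangle dyads are assumptions for illustration; the comparison of the two coefficients and the significance test are not shown here.

```python
import numpy as np

def dyad_correlation(trait, choice, cohort, same_cohort):
    """Correlate 'same trait' with 'same choice' over selected dyads.

    same_cohort=True keeps only dyads of classmates (same cohort);
    same_cohort=False keeps only dyads spanning two different cohorts.
    """
    trait, choice, cohort = (np.asarray(x) for x in (trait, choice, cohort))
    i, j = np.triu_indices(len(trait), k=1)       # each unordered pair once
    keep = (cohort[i] == cohort[j]) == same_cohort
    a = (trait[i] == trait[j])[keep].astype(float)
    b = (choice[i] == choice[j])[keep].astype(float)
    if a.size == 0 or a.std() == 0 or b.std() == 0:
        return float("nan")                        # correlation undefined
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical school and programme with two cohorts of four students each.
cohort = [2006] * 4 + [2007] * 4
trait = ["PK", "PK", "SO", "SO", "PK", "PK", "SO", "SO"]
choice = ["sci", "sci", "lang", "lang", "lang", "sci", "sci", "lang"]

# Step 1: within-class coefficient for the 2006 cohort only.
within = dyad_correlation(trait[:4], choice[:4], cohort[:4], same_cohort=True)
# Step 2: between-class coefficient over cross-cohort dyads only.
between = dyad_correlation(trait, choice, cohort, same_cohort=False)
print(within, between)
```

In this constructed example, origin and choice coincide perfectly among 2006 classmates but are unrelated across cohorts, which is the pattern the design would attribute to within-class processes rather than exogenous preferences.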

Dealing with statistical dependencies
A complicating factor when studying choices in dyads is that we have strong dependencies, perhaps most evidently in what is known as triad closure. Given students A, B and C, if both B and C have made the same choice as A, then we know that B and C must also have made the same choice. This is the extreme, deterministic, case of a triad closure: it is not even possible to have exactly two links in a triad where the links signify having a trait in common. We need to control for the underlying graph structure. This can be done through the Quadratic Assignment Procedure (QAP), which is a strategy enabling statistical significance testing taking the graph structure into account [61].
In the QAP, the input graph is relabelled randomly, such that the characteristic under study retains its distribution, but is assigned randomly to all individuals. If, say, A and B have the same country of origin, then they will have a connection in the input graph. Depending on whether they made the same educational choice, they may have a connection also in the output graph. We retain the connections between nodes, but relabel them in the input graph, such that A and B may now be labelled C and D, say, and will be compared to these nodes in the output graph. Representing graphs as adjacency matrices, this amounts to randomly permuting the rows of the matrix and then applying the same permutation to the columns. The null hypothesis of the QAP test is that the observed correlation was drawn from the distribution of correlation coefficients on the set of all relabellings of the graphs. In practice, by repeating the relabelling procedure and computing simulated correlation coefficients, we can approximate this distribution and compare it to the observed correlation. The null hypothesis is rejected at significance level α if less than a fraction α of the simulated values are greater than the observed value.
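The relabelling procedure can be sketched as a permutation test. This is a simplified illustration under stated assumptions, not the published implementation: the function names, the number of permutations, the seed and the toy data are hypothetical, and the p-value counts simulated values greater than or equal to the observed one, which is slightly more conservative than a strict inequality.

```python
import numpy as np

def dyad_correlation(A, B):
    """Pearson correlation over the off-diagonal upper-triangle entries."""
    iu = np.triu_indices_from(A, k=1)
    return float(np.corrcoef(A[iu], B[iu])[0, 1])

def qap_test(A, B, n_perm=1000, seed=0):
    """Relabel the nodes of A at random and recompute the correlation with B."""
    rng = np.random.default_rng(seed)
    observed = dyad_correlation(A, B)
    simulated = np.empty(n_perm)
    for t in range(n_perm):
        p = rng.permutation(A.shape[0])
        # Apply the same permutation to rows and columns of the input graph.
        simulated[t] = dyad_correlation(A[np.ix_(p, p)], B)
    p_value = float(np.mean(simulated >= observed))
    return observed, simulated, p_value

def same_value_matrix(values):
    v = np.asarray(values)
    return (v[:, None] == v[None, :]).astype(float)

# Hypothetical class in which origin and choice coincide perfectly.
origin = ["PK", "PK", "PK", "SO", "SO", "SO", "IQ", "IQ"]
track = ["sci", "sci", "sci", "lang", "lang", "lang", "soc", "soc"]
observed, simulated, p_value = qap_test(same_value_matrix(origin),
                                        same_value_matrix(track))
print(observed, p_value)
```

Because the permutation only relabels nodes, the distribution of the trait and the structure of both graphs are preserved in every simulated data set.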
Note that developing an alternative method based on an approach such as discrete choice modelling would also require similar simulations of different possible scenarios, with several design choices, in order to resolve the infinite recursion of the peer effects term.

Summarising over classes
Conducting studies over several classes, it is not straightforward how to combine these into one single measure. Students over all schools are facing the same kind of choice, so both the independent and the dependent variables have the same meaning (e.g. a risk ratio of r means that students of the same origin are r times as likely as those from different origins to make the same choice). However, the effect sizes from each class are not directly comparable, as they depend on the graph structure and the variance. In order to estimate a summary effect, and to assess the consistency across classes, it is clear that the individual effect sizes need to be weighted. A conventional method is to weight them by their precision (measured by variance, e.g., small classes typically provide less confidence in the estimate of the actual effect than large classes). There is a method that also allows for the actual effects to vary between classes.
If we consider each class a separate experiment, then we can perform a meta-analysis over all experiments. In a fixed-effect model, this amounts to computing a weighted average of the effect sizes. The weights are commonly set to be the inverse variance of the effect in each study. Our studies fulfil the assumptions of samples being drawn from the same population, using the same variables etc.
There are, however, also sources of heterogeneity in that the graph structure and the distribution of educational choices vary over classes. Even if students from the same country would be equally more likely to choose similarly irrespective of school and class, the true effect size is still subject to structural limitations, varying between classes. In order to account for this, we mainly use a random-effects model instead, which reduces the differences in weights by adding to each within-study variance a random effects variance component measuring the variability between studies. For our main analysis, we also present the fixed effect sizes. In the present studies, the random effects variance is small, and there is thus little difference between the two models. For an overview of meta-analyses, we refer to [62].
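The weighted mean effect $\bar{\rho}$ is computed as

$$\bar{\rho} = \frac{\sum_{i=1}^{k} w_i \rho_i}{\sum_{i=1}^{k} w_i},$$

where $k$ is the total number of studies, $\rho_i$ is the correlation coefficient for study $i$, and $w_i$ the corresponding weight, computed as

$$w_i = \frac{1}{s_i^2 + \tau^2},$$

where $s_i^2$ is the within-study variance for study $i$, and $\tau^2$ is the between-studies variance, which is in turn estimated according to the method of [63], computed as

$$\tau^2 = \frac{Q - (k - 1)}{C}, \qquad Q = \sum_{i=1}^{k} v_i \rho_i^2 - \frac{\left(\sum_{i=1}^{k} v_i \rho_i\right)^2}{\sum_{i=1}^{k} v_i}, \qquad C = \sum_{i=1}^{k} v_i - \frac{\sum_{i=1}^{k} v_i^2}{\sum_{i=1}^{k} v_i},$$

where $v_i = 1/s_i^2$ is the inverse within-study variance for study $i$ (see also [62, p. 72-74], where $Y_i$ is the effect size, $\rho_i$). The computations are the same for a fixed-effect model, except for excluding the between-studies variance, which amounts to setting $\tau^2 = 0$.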
Finally, we do not compute the variances and weighted mean directly over the correlation coefficients, but rather over their respective Fisher's z transformed values [64,65]. We then transform the weighted mean $\bar{z}$ back to the original scale. The $z$ value of a correlation $\rho$ is given by

$$z = \frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right).$$

In the extreme cases where $\rho = 1$ (which would happen mainly for simulated coefficients over small classes), we suggest replacing $\rho$ by $\rho - \varepsilon$, where, for example, $\varepsilon = 0.0001$. This had no observable effect in our study.
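The weighting scheme of this section can be sketched as a random-effects mean computed on the Fisher z scale and transformed back. The sketch assumes that the within-class variances are supplied on the z scale and that at least two classes are available; truncating a negative between-studies variance estimate at zero is a common convention that is assumed here, not stated in the text. Function names and numbers are hypothetical.

```python
import numpy as np

def fisher_z(rho, eps=1e-4):
    """Fisher's z transform; rho = 1 is replaced by rho - eps first."""
    rho = np.asarray(rho, dtype=float)
    rho = np.where(rho >= 1.0, rho - eps, rho)
    return np.arctanh(rho)  # equals 0.5 * ln((1 + rho) / (1 - rho))

def random_effects_mean(rho, var_z):
    """Weighted mean correlation and between-studies variance (z scale)."""
    z = fisher_z(rho)
    s2 = np.asarray(var_z, dtype=float)
    v = 1.0 / s2                                  # fixed-effect weights
    k = len(z)
    Q = np.sum(v * z**2) - np.sum(v * z) ** 2 / np.sum(v)
    C = np.sum(v) - np.sum(v**2) / np.sum(v)
    tau2 = max(0.0, (Q - (k - 1)) / C)            # assumption: truncate at 0
    w = 1.0 / (s2 + tau2)                         # random-effects weights
    z_bar = np.sum(w * z) / np.sum(w)
    return float(np.tanh(z_bar)), float(tau2)     # back-transform the mean

# Hypothetical coefficients and variances for three classes.
mean_rho, tau2 = random_effects_mean([0.3, 0.3, 0.3], [0.01, 0.02, 0.04])
print(mean_rho, tau2)
```

Setting `tau2` to zero inside the function would give the corresponding fixed-effect mean, in which only the within-class variances determine the weights.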

Combining approaches
We will combine the two approaches for dealing with statistical dependencies and summarising over classes into what we can label a meta-analytic QAP approach. Following the idea of QAP, for a class $i \in \{1, \ldots, n\}$, we simulate 2m graphs with the same graph structure, producing $m$ simulated correlation coefficients $r_{1,i}, \ldots, r_{m,i}$, and thus a probability distribution of the correlation coefficient under the null hypothesis of no effect in the given graph structure. From this we can estimate the variances $s_i^2$ and thus also the between-study variance $\tau^2$. The inverse of the sum of the within- and between-study variances ($w_i$ from the previous section) of this distribution then gives us the weight applied to the actual correlation coefficient, $\rho_i$, and the weighted sum $\bar{\rho}$.
The aggregated correlation coefficient $\bar{\rho}$ needs to be compared to an aggregated distribution of simulated coefficients. Using the first simulated correlation coefficient $r_{1,k}$ from each class $k \in \{1, \ldots, n\}$, we can also use the derived variances to compute a weighted average simulated correlation $\bar{r}_1$. Reiterating this process for all of the simulated graphs for each class produces $m$ weighted average correlations $\bar{r}_1, \ldots, \bar{r}_m$, and thus a probability distribution of possible meta-analytic correlations, given the graph structures of all the classes. Again, in line with the QAP approach, we use this distribution to test the weighted average of actual correlations against the null hypothesis that the correlation may have been caused by the graph structure only, with random allocation of choices preserving the distribution.
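The combined procedure can be sketched end to end as follows. This is one possible reading of the description, not the authors' R code: within-class variances are taken from the simulated coefficients on the Fisher z scale, the same weights are applied to observed and simulated values, a negative between-study variance estimate is truncated at zero, and coefficients are clipped at both ends before the transform. All names, the toy classes, `m` and the seed are hypothetical.

```python
import numpy as np

def same(values):
    v = np.asarray(values)
    return (v[:, None] == v[None, :]).astype(float)

def corr(A, B):
    iu = np.triu_indices_from(A, k=1)
    return np.corrcoef(A[iu], B[iu])[0, 1]

def to_z(r, eps=1e-4):
    # Assumption: clip at both ends so that r = 1 and r = -1 stay finite.
    return np.arctanh(np.clip(np.asarray(r, dtype=float), -1 + eps, 1 - eps))

def meta_qap(classes, m=500, seed=0):
    rng = np.random.default_rng(seed)
    z_obs, z_sim = [], []
    for trait, choice in classes:
        A, B = same(trait), same(choice)
        z_obs.append(to_z(corr(A, B)))
        sims = []
        for _ in range(m):
            p = rng.permutation(len(trait))       # relabel the input graph
            sims.append(corr(A[np.ix_(p, p)], B))
        z_sim.append(to_z(sims))
    z_obs = np.array(z_obs)                       # shape (n_classes,)
    z_sim = np.array(z_sim)                       # shape (n_classes, m)
    s2 = z_sim.var(axis=1, ddof=1)                # within-class variances
    v, k = 1.0 / s2, len(z_obs)
    Q = np.sum(v * z_obs**2) - np.sum(v * z_obs) ** 2 / np.sum(v)
    C = np.sum(v) - np.sum(v**2) / np.sum(v)
    tau2 = max(0.0, (Q - (k - 1)) / C)            # between-class variance
    w = 1.0 / (s2 + tau2)
    rho_bar = np.tanh(np.sum(w * z_obs) / np.sum(w))
    r_bar = np.tanh(w @ z_sim / np.sum(w))        # one mean per simulation
    p_value = np.mean(r_bar >= rho_bar)
    return float(rho_bar), r_bar, float(p_value)

# Three hypothetical classes: (country of origin, chosen track) per student.
classes = [
    (["PK", "PK", "PK", "SO", "SO", "IQ"],
     ["sci", "sci", "lang", "lang", "lang", "sci"]),
    (["PK", "PK", "SO", "SO", "SO", "IQ", "IQ"],
     ["sci", "sci", "lang", "lang", "sci", "soc", "soc"]),
    (["SO", "SO", "PK", "PK", "IQ"],
     ["lang", "lang", "sci", "lang", "sci"]),
]
rho_bar, r_bar, p_value = meta_qap(classes)
print(rho_bar, p_value)
```

Each element of `r_bar` aggregates one simulation round across all classes, so the resulting distribution reflects the graph structures of all classes jointly rather than of any single class.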

Ethics statement
No new data has been collected. Access to the register data has been approved by The Norwegian National Committee for Research Ethics in the Social Sciences and the Humanities (NESH), the NSD Data Protection Services (Personvernombudet) and Statistics Norway.

Data
We used Norwegian register data for students starting their upper secondary education in the years 2006-2010, and starting their curricular specialisation the following year, in five Norwegian cities and 120 schools. We limit the number of cohorts to only these five, to maintain similarity for the between-cohort comparisons. Henceforth we refer to the students in a programme at the same school within the same year as a class (even though, in reality, this may correspond to several school classes if there are many students on the same programme within the same school).
The total number of individuals in the data is 51,315. We used only students starting their curricular specialisation track one year after they started at the general programme, and at the same school, leaving 42,577 individuals in our data, out of which 2,940 (6.9%) are immigrants and 6,847 (16%) are native-born descendants of immigrants. Among the descendants, 4,675 (11%) have an immigrated mother. We included only students whose country of origin is known.
The number of individuals, N, included in the respective analyses varies, for several reasons. In general, N is larger in the between- than in the within-class analyses of ethnicity. In order to compute a correlation coefficient for choices with respect to shared origin, there needs to be at least one pair of students from the same country (and one pair from different countries) in the same class. When comparing over several years, it is more likely that this condition will be fulfilled. Note also that, at the same time, the number of classes, n, is smaller when classes are defined as including several years. We performed a robustness check, described in the Appendix section Alternative designs, to investigate whether the slightly different subsetting of data affected the correlation coefficient.
The variable under study also impacts N. For our scale variables (cultural distances, parental education and parental income), there need not be a pair of students of shared origin in order to compute a measure, which enables a larger N. At the other end, N is reduced by the fact that the educational level of parents of immigrants is not always known, and we can only compute cultural distances between pairs of students from countries included in the World Values Survey.
As described in the Introduction, students first choose a general programme for their first year at upper secondary school, and then a curricular specialisation track within the programme for their second and third year. The most popular choice is a general academic programme, followed by 23,636 (56%) students in our data. Among these students, almost everyone chose a specialisation in either natural sciences (10,816, or 46%) or language, social sciences and economy (12,264, or 52%). The remaining students not on a general study programme followed a variety of mainly vocational tracks. The number of study programmes in the data is 22, of which 14 had at least 100 students. The number of specialisations is 53, of which 32 had at least 100 students. The most common choices of specialisation tracks and origins of immigrants and descendants are presented in the Appendix section Origins and choices of the students.

Results
Our main study investigates the extent to which pairs of immigrant students within the same class (students at the same school, programme and year) make the same choice of curricular track, depending on whether they also share country of origin. We compare this result to the choices made by students at the same school and on the same programme, but who were in other cohorts (i.e., between cohorts). We also compare within- and between-cohort choices among descendants of immigrants, and then go on to investigate different levels of origin. Is peer influence more prevalent between students from similar countries, and do we find this pattern also when considering immigrants as one group versus natives? Finally, we explore effects related to other demographic and socio-economic variables, such as gender, and parents' education and income. We present robustness checks in conjunction with the results. To further calibrate the robustness of our method, we have also investigated alternative designs, which are presented in the Appendix.
All the weighted correlations and p-values from our studies are summarised in Table 1. In the following subsections, we describe the tests and their results in more detail.

Same country versus different countries of origin
Our main hypothesis is that students with the same country of origin will influence each other's choices, so that students with the same country of origin make more similar choices than students with different countries of origin. We start by looking only at immigrant students within the same class. For each cohort, we constructed 1,000 simulated graphs for significance testing, following the concept of QAP.
The weighted average correlation coefficient within classes is ρ_w ≈ 0.057 (or ρ_w ≈ 0.053 in the fixed-effect model). The probability under the null hypothesis of obtaining this value or higher is p < 0.001. Thus, the null hypothesis can be rejected with high confidence (see the within condition in Fig 1), and we conclude that there is a significant correlation among immigrants between shared country of origin and making the same educational choice. These analyses are based on n = 125 classes, at 52 schools, with N = 1,175 students. Thus, within each class, the average number of immigrants is 9.4, and each class includes, on average, immigrants coming from 7 (6.95) countries. In total, the numbers of shared origin dyads, triads etc. are 162, 32, 15, 3, 3, 0, 0, 1.
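For a single class, the QAP-style test can be sketched as follows. The code is a simplified illustration: the function names, the country labels and the choices are hypothetical, and the simulation step is implemented as a random reallocation of the observed choices among the students, which preserves the distribution of choices in the class.

```python
import numpy as np


def dyad_correlation(group, choice):
    """Correlation over all student pairs between 'same group' and 'same choice'."""
    group, choice = np.asarray(group), np.asarray(choice)
    i, j = np.triu_indices(len(group), k=1)   # each unordered pair once
    x = (group[i] == group[j]).astype(float)
    y = (choice[i] == choice[j]).astype(float)
    if x.std() == 0 or y.std() == 0:          # undefined without variation
        return float("nan")
    return float(np.corrcoef(x, y)[0, 1])


def qap_null(group, choice, m=1000, seed=0):
    """Simulated coefficients under random allocation of the observed choices."""
    rng = np.random.default_rng(seed)
    choice = np.asarray(choice)
    return np.array([dyad_correlation(group, rng.permutation(choice))
                     for _ in range(m)])


# Hypothetical class of six immigrant students from three countries.
group = ["PL", "PL", "SO", "SO", "IQ", "IQ"]
choice = ["N", "N", "S", "S", "N", "S"]
observed = dyad_correlation(group, choice)
null = qap_null(group, choice, m=1000)
p_value = float(np.mean(null >= observed))
```

A class contributes only if it contains at least one pair of shared origin and one pair of different origin; otherwise the coefficient is undefined and the function returns NaN.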
To what extent can this result be attributed to exogenous effects? We performed the same analysis, but between classes, that is, with graphs consisting of cohorts of all classes on the same programme in the same school over all years, removing links and non-links between students enrolled in the same year from the analysis.
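The between-class condition differs from the within-class condition only in which pairs of students enter the computation. A minimal sketch, with hypothetical function names and made-up data, keeps only pairs of students who were enrolled in different years:

```python
import numpy as np


def between_cohort_dyads(year):
    """Index pairs (i, j), i < j, of students enrolled in different years."""
    year = np.asarray(year)
    i, j = np.triu_indices(len(year), k=1)
    keep = year[i] != year[j]                 # drop same-year pairs
    return i[keep], j[keep]


def between_correlation(group, choice, year):
    """Dyadic correlation restricted to between-cohort pairs."""
    group, choice = np.asarray(group), np.asarray(choice)
    i, j = between_cohort_dyads(year)
    x = (group[i] == group[j]).astype(float)
    y = (choice[i] == choice[j]).astype(float)
    if x.std() == 0 or y.std() == 0:
        return float("nan")
    return float(np.corrcoef(x, y)[0, 1])


# Made-up example: same school and programme, two cohorts.
year = [2006, 2006, 2007, 2007]
group = ["A", "B", "A", "B"]
choice = ["N", "S", "N", "S"]
i, j = between_cohort_dyads(year)
rho_between = between_correlation(group, choice, year)
```

Since these students never shared a classroom, any correlation found here cannot stem from interaction within the class.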
The correlation in this analysis is ρ_b ≈ 0.019 (or ρ_b ≈ 0.023 in a fixed-effect model). This coefficient is smaller than within classes, but significantly different from random allocation of choices retaining the graph structure, with p < 0.01 (see the between condition in Fig 1). These analyses are based on n = 85 classes with N = 1,964 students; the average number of immigrants is 23.1 from 13.8 countries. In total, the numbers of shared origin dyads, triads etc. are 242, 83, 33, 23, 6, 6, 6, and 7 groups are larger than 8. How do these values compare, then, net of their respective graph structures? Pairing up the values from the simulated within- and between-class graphs randomly gives us a distribution of differences to which we can compare the actual difference. We find that, in more than 99% of the cases, the difference between the within- and between-class correlation coefficients is larger than the differences between the simulated values, which we accept as statistically significant (ρ_w − ρ_b ≈ 0.038, p < 0.01).
In this design, we do not control for other demographic variables. In particular, girls and boys tend to segregate in classroom situations. Assuming there is no gender bias associated with certain ethnicities, there is no reason to expect gender to be a driving factor behind our results. However, we can test our hypothesis net of gender effects by performing the analyses on boys and girls separately. The effect is more significant for girls than for boys, and while there is a retained difference between the within- and between-class measures for girls, we could not safely conclude that there is a real difference for boys (see Table 1). These results show that the measures increase in absolute terms, while they also become less significant. However, samples are relatively small, which may account for higher p-values, but also, due to the structural dependencies in the data, the measures are not directly comparable.
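The comparison of the two conditions can be sketched as below. The observed coefficients are those reported above, whereas the simulated distributions are drawn from normal distributions with made-up spreads purely for illustration; in the actual analysis they are the m aggregated simulated coefficients of each condition. The function name is hypothetical.

```python
import numpy as np


def difference_test(rho_w, rho_b, sim_w, sim_b, seed=0):
    """Compare rho_w - rho_b to differences of randomly paired simulated values."""
    rng = np.random.default_rng(seed)
    sim_w = np.asarray(sim_w, dtype=float)
    sim_b = np.asarray(sim_b, dtype=float)
    m = min(len(sim_w), len(sim_b))
    # Shuffle both sets independently, then pair them element by element.
    diffs = rng.permutation(sim_w)[:m] - rng.permutation(sim_b)[:m]
    observed = rho_w - rho_b
    p = float(np.mean(diffs >= observed))     # share of simulated differences at least as large
    return observed, diffs, p


rng = np.random.default_rng(1)
sim_w = rng.normal(0.0, 0.012, size=1000)     # made-up within-class null
sim_b = rng.normal(0.0, 0.007, size=1000)     # made-up between-class null
observed, diffs, p = difference_test(0.057, 0.019, sim_w, sim_b)
```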
As a robustness check, we also tested alternative designs, presented in the Appendix, where we make use of data from the natives in the class, or include only a subset of the students in the within-class condition for the between-class condition. These alternative designs and samples produced qualitatively similar results, and the remaining presented findings are based on our first meta-analytic QAP approach, which requires less computational power than the first alternative design, and includes more data than the second.
Descendants of immigrants. We performed the same analysis on native students with an immigrant mother and compared pairs of students whose mother came from the same country to pairs where their mothers came from different countries. The within-class measure is ρ_w ≈ 0.031 (p < 0.001, N = 2,878, n = 205), while the between-class measure is ρ_b ≈ 0.015 (p ≈ … 2) and the difference ρ_w − ρ_b ≈ 0.016 (p ≈ 0.033). Similar to the immigrant analyses above, the effect seems to be mainly driven by women. Restricting the analysis to students where both parents come from the same country produces similar results.
Comparing the effect size for descendants to that of immigrants in the same way as computing the difference between the within- and between-class measures gives a significant net measure of 0.026 (p ≈ 0.041) within classes (but not between classes). Thus, we found evidence that endogenous preference formation also remained in the second generation, though with a reduced effect.
Interpreting the effect size. How should we interpret the sizes of the correlation coefficients found here? First, it needs to be noted that the theoretical maximum is considerably below 1. We hypothesise that students of the same origin are more likely to choose similarly. However, a large correlation coefficient would require not only that all students from the same country make the same choice, but also that students from different countries always choose differently, which is not possible (and not predicted by the hypothesis) given the small number of available choices related to the number of students.
To get an estimate of a maximal effect size, we changed the specialisation choices of the students in such a way that all students from the same country always made the same choice. More specifically, within each class, everyone within a group was registered with the majority choice of their group. (In case of a tie, that choice was randomly selected among the most common ones.) This pattern provided a correlation within classes of ρ_w ≈ 0.29 and between classes of ρ_b ≈ 0.23. Thus, when everyone with the same background characteristics makes the same choice, the maximum within-class effect size is 0.29.
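The reassignment of choices can be illustrated with the following Python sketch. The function name and the example class are hypothetical; ties are broken at random among the most common choices, as described above.

```python
import random
from collections import Counter


def majority_choices(group, choice, seed=0):
    """Give every student the most common choice within their own group."""
    rng = random.Random(seed)
    by_group = {}
    for g, c in zip(group, choice):
        by_group.setdefault(g, []).append(c)
    winner = {}
    for g, choices in by_group.items():
        counts = Counter(choices)
        top = max(counts.values())
        tied = sorted(c for c, k in counts.items() if k == top)
        winner[g] = rng.choice(tied)          # random pick only matters for ties
    return [winner[g] for g in group]


# Hypothetical class: three students from country A, two from country B.
group = ["A", "A", "A", "B", "B"]
choice = ["N", "N", "S", "S", "S"]
maximal = majority_choices(group, choice)
```

Recomputing the correlation coefficients on the reassigned choices then gives the upper bound against which the observed effect sizes can be judged.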
To build up an intuition for the magnitude of the effects found here, Fig 3 presents the origin and choice graphs for a typical class in the data. This class has a correlation (ρ ≈ 0.059) that is close to our weighted mean (ρ ≈ 0.057). This is a class on the academic programme, with natural science and social science as the two available choices of specialisation. By making only one change, so that all the Chinese students would choose natural science instead of one of them choosing social science, the choices would be completely in line with our hypothesis, and provide a theoretical maximum similar to the hypothetical discussion above. Such a change would produce a maximal effect size of ρ_max ≈ 0.24, which is also close to the maximum of the meta-analysis.
The method is agnostic to choice of measure of effect size. Correlation coefficients have the benefit of being applicable also to continuous data, which is relevant for our analyses below. In the present case, however, the variables are dichotomous, and we could use other measures, such as ratios. Since several classes lack students from the same country making different choices, risk ratios are more viable (and easier to interpret) than odds ratios. The within-class 'relative risk' for immigrants from the same country to make the same choice compared to immigrants from different countries is RR_w ≈ 1.27 (p < 0.001, fixed effect RR_w ≈ 1.25, p < 0.001), that is, students sharing country of origin are about 27% more likely to choose similarly (25% in the fixed-effect model; see Fig 4; cf. Fig 1). Between classes, the ratio is RR_b ≈ 1.11 (p ≈ 0.004, same fixed effect, p < 0.001). The additive difference between these risks is RR_w − RR_b ≈ 0.16 (p ≈ 0.016). However, it is easier to interpret the ratio of these risks, that is, the ratio of choosing more similarly among students of the same origin within versus between classes. Calculating this ratio, we get RR_w/RR_b = exp(log RR_w − log RR_b) ≈ 1.15 (p ≈ 0.032, fixed effect RR_w/RR_b ≈ 1.13, p ≈ 0.045). Similar to the correlation coefficients, all cases are significant, though at a lower level when comparing the within- to the between-classes measures.
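For a single class, the risk ratio over pairs of students can be sketched as follows. The function name and the example class are hypothetical, and the sketch assumes that the class contains pairs of both kinds and that at least one pair of different origin makes the same choice. The last lines illustrate the ratio of the two reported risk ratios computed on the log scale.

```python
import numpy as np


def dyad_risk_ratio(group, choice):
    """Risk of 'same choice' among same-group pairs relative to other pairs."""
    group, choice = np.asarray(group), np.asarray(choice)
    i, j = np.triu_indices(len(group), k=1)
    same_group = group[i] == group[j]
    same_choice = choice[i] == choice[j]
    risk_same = same_choice[same_group].mean()    # share among shared-origin pairs
    risk_diff = same_choice[~same_group].mean()   # share among remaining pairs
    return float(risk_same / risk_diff)


# Hypothetical class of six immigrant students from three countries.
group = ["PL", "PL", "SO", "SO", "IQ", "IQ"]
choice = ["N", "N", "S", "S", "N", "S"]
rr = dyad_risk_ratio(group, choice)

# Ratio of the rounded values reported in the text, via the log scale.
ratio = float(np.exp(np.log(1.27) - np.log(1.11)))
```

The ratio of the rounded figures comes out at about 1.14; the small gap to the reported 1.15 is presumably due to rounding of the inputs.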

Correlations with cultural distances
In the previous analyses, students were binary categorised as belonging to the same or different groups. To further explore choice patterns related to immigrant students' country of origin we have performed analyses where we allow for a continuous categorisation, using a "cultural distance" measure from the World Values Survey. Cultural distance, generated from the results of [66], is a measure of how far apart countries are, as documented by survey data on the populations' attitudes to survival versus self-expression values and traditional versus secular-rational values. This measure allows us to explore, for example, if immigrant students from Sweden and Denmark are more similar in their educational choices than immigrant students from Sweden and Pakistan, and whether social choices are more likely between the first pair of students. We use the same methods as above, with the only difference being that the graph representing the independent variable is now weighted (i.e., the adjacency matrix has the values of the cultural distances instead of 0 and 1).
For each pair of students, we measured the cultural distance between them, based on their countries of origin, as the independent variable, and their educational choice, as the dependent variable. We excluded pairs of students with zero cultural distance, that is, those of common origin. The result was a correlation of ρ_w ≈ 0.012 (N = 1,099, n = 175, p ≈ 0.30). Cultural distance thus does not seem to be a strong predictor of similarity in curricular choices. We also performed the analysis including pairs of students of the same origin. The result was a smaller effect than that of dichotomously defined within- and between-group pairs based on country of origin, suggesting that students do not on average choose more similarly to students from countries that are culturally close than to other students, at least not by this measure of cultural distance.
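The weighted variant can be sketched as below. The lookup table of distances, the country labels and the function name are hypothetical, and pairs involving a country without a distance value are skipped. The sketch correlates raw distance with making the same choice, so that more similar choices among culturally close students would show up as a negative coefficient; the sign convention is an assumption of this illustration.

```python
import numpy as np


def distance_choice_correlation(country, choice, distance, drop_zero=True):
    """Correlate pairwise cultural distance with making the same choice."""
    xs, ys = [], []
    n = len(country)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = country[i], country[j]
            d = 0.0 if a == b else distance.get(frozenset((a, b)))
            if d is None:                     # country not covered by the survey
                continue
            if drop_zero and d == 0.0:        # exclude pairs of common origin
                continue
            xs.append(d)
            ys.append(float(choice[i] == choice[j]))
    if len(xs) < 2 or np.std(xs) == 0 or np.std(ys) == 0:
        return float("nan")
    return float(np.corrcoef(xs, ys)[0, 1])


# Made-up distances between three hypothetical countries.
distance = {frozenset(("A", "B")): 1.0,
            frozenset(("A", "C")): 3.0,
            frozenset(("B", "C")): 2.0}
r = distance_choice_correlation(["A", "B", "C"], ["N", "N", "S"], distance)
```

Setting `drop_zero=False` corresponds to the additional analysis that includes pairs of students of the same origin.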

Immigrants versus natives
We performed an analysis where all immigrant students were grouped together and where we also included the natives. A pair of students are considered to belong to the same group if they are both natives, or both immigrants. The within-class correlation is then ρ_w ≈ 0.014 (N = 30,300, n = 580, p < 0.001). The between-class correlation is ρ_b ≈ 0.011 (N = 33,202, n = 174, p < 0.001) (see Fig 5). While the sample is large enough for these small effects to be statistically significant, the difference between them is not. This is consistent with our previous finding on cultural distance: similarity in curricular choices pertains to students from the same country of origin. We thus do not find any evidence of endogenous preference formation related to higher-order levels of shared origin, such as culturally similar countries, or being immigrants (regardless of country of origin), as compared to being natives.

Other demographic and socio-economic variables
We also investigated the impact of gender, parental education and parental income. Restricting the analysis to immigrants gives us a good opportunity to check for potential confounding factors in our main results. After this, by including all students (i.e., also natives), we extended the analysis beyond ethnicity to see whether similar clustering takes place also for other individual characteristics.
We chose number of years of father's education as the measure of parental education, and the average income of both parents as the income measure. If we lack information on one of the parents' income, then the average is simply the other parent's income.
While gender is a straightforward dichotomous variable, years of education and income are scale variables. In these latter cases, then, we calculated the distance between each pair of students by taking the absolute difference between the respective measures. The results are presented in Table 1.
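These two steps can be sketched in Python as follows. The function names and the income figures are made up for illustration, and missing information is represented by NaN, which is an assumption about the data encoding.

```python
import numpy as np


def parental_income(mother, father):
    """Average of both parents' incomes; if one is missing, use the other."""
    m = np.asarray(mother, dtype=float)
    f = np.asarray(father, dtype=float)
    return np.where(np.isnan(m), f, np.where(np.isnan(f), m, (m + f) / 2))


def pairwise_abs_distance(values):
    """Matrix of absolute differences between every pair of students."""
    v = np.asarray(values, dtype=float)
    return np.abs(v[:, None] - v[None, :])


# Made-up incomes for three students; NaN marks missing information.
income = parental_income([400.0, np.nan, 300.0], [600.0, 500.0, np.nan])
distances = pairwise_abs_distance(income)
```

The resulting matrix takes the place of the 0/1 adjacency matrix used for dichotomous variables such as gender or shared country of origin.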
Restricting the analysis first to the group of immigrants, the dataset is comparable to those in our previous studies. We find that there are no significant effects in the between-class analyses, nor in the differences between the within- and between-class correlations. Within classes, the effects are significant at the 0.05 level for gender and income, but not for education. Thus, our results are mainly inconclusive as to whether there are discernible effects among immigrants with respect to these demographic variables, which contrasts with the demonstrated effects of country of origin. We conclude that the ethnicity effect cannot be explained by gender, parental education or parental income, and that the ethnicity effect appears to be more important.
Performing the same test, but now including all students (i.e., also natives), we find that all effects are significant. Again, the samples are obviously considerably larger (29,667-34,374 students, compared to 1,012-2,037 students with immigrant background). With such a large sample, the simulated probability distributions for the three different variables are highly similar, making the correlation coefficients roughly comparable. The largest coefficient is that for gender; we have two thirds of that effect for father's education, and one third for parental income. At the same time, most of the effect for gender seems to be explained by common exogenous preferences, while there is some evidence of endogenous preference formation based on common parental education, and, possibly, also parental income.
For robustness, we also tested the influence of varying class sizes and number of available educational choices. We conclude from the results in the Appendix on robustness that the results are largely robust and that the major effects we have found are not dependent on the choice of including or excluding small classes with fewer choices.

Summary
In this paper, we suggest a new method for identifying endogenous preference formation based on specific characteristics. The method can be applied when we have large amounts of data that allow for a 'control' and 'effect' design. The studies are correlational, and we do not reconstruct actual social ties, but we identify added propensities for social clustering based on common characteristics in other variables.
We used this method to explore segregation at the micro-level, that is, we have analysed students within and between school cohorts, to detect patterns of peer influence. In particular, we have explored if students make more similar choices of curricular specialisation to peers of the same country of origin as compared to other students, and whether this is a result of endogenous preference formation in classes. To do so, we differentiate between (a) exogenous preferences, that is, choices related to common inherent preferences (such as country-specific preferences, and going to the same school), and (b) peer influence, that is, endogenous preferences related to social processes within the class, as well as preferences for being together in class next year. We are, however, not able to differentiate between the two last types of endogenous mechanisms, which we refer to as peer influence. Table 1 summarises our main findings.
Comparing within and between educational cohorts, our results show that immigrant students' educational choices correlate with common country of origin, and, to some extent, this is also the case for descendants of immigrants, categorised by their mother's country of origin. The differences between these measures, within and between cohorts, are also significant, giving us a measure of endogenous preference formation, which is the net effect that measures peer influence in class. The effect is possibly larger among female students. Further, the effect is significantly larger within classes for immigrants than descendants, but not between classes, which suggests that there is stronger ethnic clustering in the choices made by immigrants than descendants, while preferences from home may be similar.
We also investigated whether the domain can be extended to include pairs of students from culturally similar countries, but categorising students by cultural distances was not predictive for educational choices. This result is largely consistent with what [67] found.
Finally, we compared all immigrants with all natives. While educational choices do correlate with being an immigrant, this correlation can be attributed to immigrants more often making similar choices in general rather than to peer influence. Again, social boundaries have often been found to be larger between some immigrant groups than between immigrant groups and natives [67].
From these observations, we draw the conclusion that there is clustering of peer influence among immigrant students, and to some extent descendants of immigrants, based on shared country of origin, and the results confirm that the level of analysis should be the country-level, as also suggested by [68].
While there is more evidence for these effects among female than male students, there appear to be no significant endogenous preference effects based on only gender, nor on having similar education or parents with similar income, in the group of immigrants. It should be noted that for example the group of girls in a class is large and that we are measuring average effects. It might be less likely that all girls influence each other than students of a more confined group, and the conclusion from this is not that gender is unimportant, but that it is not the right level of aggregation for peer influence.
While we found no significant effects on curriculum choice over and above exogenous effects, same gender has previously been shown to be important in social tie formation [52], and given the larger ethnic effect among girls, there may be an interaction effect. In addition, the observed patterns may potentially be confounded by other factors, such as school performance [55]. A previous study, however, using data on students in English, German, Dutch and Swedish schools, found that cultural and socioeconomic differences did not explain intraethnic homophily in friendship patterns [53]. As it currently stands, though, the presented method does not enable us to measure the effects from one variable directly controlled for another. For this purpose, we call for further methods development.
Looking at the whole population, including natives, vastly multiplies the number of observations. The results showed that there is a gender-based curricular choice similarity, but that this can be ascribed to common exogenous preferences. Looking at students' social origin, however, we found some evidence of endogenous preferences based on similar background with respect to parents' education and income.
In sum we interpret the differences between choice of curriculum within cohorts and choices between cohorts, as an indication of ethnic peer influence. More specifically, a graph over peer influence will more likely have cliques based on country of origin than the other categories we have investigated, such as gender. Comparing curricular choices within and between cohorts, we find on the aggregate scale that students behave more similarly based on their background. Our design shows that this is not likely to be exclusively an effect of having a common background as such. Rather, we would argue, students who are exposed to each other have the potential of affecting each other, and students with common social characteristics make similar choices to a greater extent than students of different backgrounds. We believe the simplest explanation to be an increased social exchange between the students.
Still, there are alternative explanations. One possibility is that other processes taking place in the classroom are driving the similarity in choices. One important example of this type of mechanism would be if individual teachers are having an effect on later choices within the class they are teaching. If this is the case, then the teacher effect would have to be minority-specific to explain our findings; that is, we would expect that teachers would influence students with the same ethnic background in ways that make their choices more similar. On the other hand, if teachers have a general effect on all their students, which we consider plausible, then the teacher effect would not increase the correlation coefficient or risk ratio for students of shared origin. Still, if teachers are in fact influencing their students in this way, then this will arguably lead to greater variation in effect sizes across different classes. We tested this, and Fig 6 in the appendix shows that while there is substantial variation, the effect sizes are less spread in classes with higher weight in the overall analysis. Another prediction from teacher effects, and other similar effects within cohorts, is that between-class correlations should be higher when comparing years close in time. (However, this could also be driven by substantial changes in school conditions over time, so it would not alone be evidence of teacher and similar effects.) We have measured the effects at a more disaggregated level in the Appendix section Disaggregated data, finding no clear pattern that between-class correlation coefficients drop off significantly when considering years further apart. This suggests that our results are not driven by a teacher effect, or other cohort-specific effects.
Another theoretical possibility is the existence of streaming, where students are administratively sorted into different classes based on their background characteristics. This is not likely to be important, as assigning students in secondary school based on ethnicity has been declared unlawful in the Norwegian school system, based on national and international discrimination laws (see e.g. an announcement by The equality and anti-discrimination ombudsman in Norway, ref. 12/186-10, www.ldo.no/globalassets/arkiv/uttalelser_pdf/2012/12_186.pdf).
Finally, these educational choices also determine who will be the students' future classmates. Surveys and interview studies find friends to be an important [69] or even the most important factor [70,71] in choosing upper secondary schools in Norway and Sweden. We would expect that choosing who will remain classmates in the second and third year at school should have an even stronger social component.

Discussion
The mechanisms related to homophily and endogenous preference formation are important to document, yet often difficult to explore empirically. We have in this paper developed a method to identify such effects in individuals' choices.
Our main results suggest some social clustering of immigrant students at school, based on country of origin (but not cultural similarity). This applies in particular to female immigrant students.
In the Introduction we suggested three relevant social mechanisms: homophily, language difficulties, and discrimination or harassment. Our findings are in line with the homophily mechanisms: small social distances and perceived affinity or nearness between people contribute to group-based similarities in educational choices. Here we found evidence of endogenous preference formation when groups were defined by country of (own or parents') origin.
For immigrant students, language problems are also a likely explanation. Immigrant students who are weak in Norwegian but fluent in another common language may prefer to stay together in class. Descendants of immigrants born in Norway are likely to master both Norwegian and their mother's language, and may therefore interact more equally with natives and each other, which is in line with a smaller effect for descendants of immigrants.
Finally, the premise of a resistance strategy is that majority students categorise minorities from particular countries of origin into outgroups based on stereotypes. If this were a dominant mechanism, however, then we would have expected to find more evidence of similar choices among immigrant students from countries of origin with small cultural distances (such as Muslims).
We would expect our approach to be relevant for other topics of investigation as well. One obvious example would be when students are choosing educational tracks later in their educational careers. Often students have to move geographically to attend higher educational institutions. If students want to stay together, then we would expect social clustering to be of relevance for their choice of institution, but not necessarily choice of study programme or discipline. If students have influenced each other's preferences, then we would expect social ties to also influence their choice of study programme or discipline. In this case, it may thus be possible to also disentangle endogenous educational preference formation from choices based on friendship.
Generally, many decisions are affected by social clustering, also outside the educational domain (such as mobility within jobs or between firms, marriage decisions and families' decisions on where to move). What is necessary for the proposed method to work is that the data allows for comparisons within and between cohorts, and where individuals within a certain setting or institution face an overlapping set of choices. One example could be co-workers who started in a firm at the same time and who face decisions of whether to stay or leave, as well as what type of firm to leave for, if applicable. Such mobility decisions have been found to be affected by peer influence based on shared characteristics in a similar way to what we find in the current study [72,73]. The theoretically challenging task would be to carefully delineate under what social conditions we might expect social influence to be of direct relevance for an individual's decision-making.
In this study, we have conducted separate analyses where choice is potentially dependent on a number of different independent variables. Further development of the method includes finding reliable measures for comparing effect sizes between different aggregated analyses and generalising it to allow for multivariate and multivariable models, for example to study gender as contrasts or isolating effects from socioeconomic variables.
Table 2. Distribution of students over specialisation tracks. The twenty most popular specialisation tracks and their programme (abbreviated), with proportional share of students registered on each for each of the five years in our study. All figures are in percent, and include both natives and immigrants.
Table 3. Origins of immigrants and descendants. The twenty most common countries of origin for immigrants and mother's origin for native descendants, with proportional share in the population from the country, and share of students on the specialisation tracks natural science (N); language, social science and economy (S); and other (O), respectively. The total is the summed share from the twenty most common countries, and shares in the whole set of immigrants and descendants, respectively, on the respective specialisation tracks. Note that school classes with too few immigrants to perform our analyses have been excluded. Including the whole population, not only those in our analysis, would increase the figures for "other" specialisations. All figures are in percent.

Disaggregated data
The effect sizes should be compared to the simulated distributions for drawing conclusions.
To give an overview, however, of what the disaggregated data looks like, the distribution of correlation coefficients in each class (i.e., not controlled for structural dependencies) is given in Fig 6. The within-class distribution has (nonweighted) mean μ_w = 0.072 and variance σ²_w = 0.056, and the between-class distribution has mean μ_b = 0.033 and variance σ²_b = 0.027. The figure also shows the effect sizes plotted against their weights in the meta-analysis. Even if the variability is high over all classes, the effect sizes are less spread for classes of high weight.
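To illustrate how class-level effect sizes can be combined with weights, the following is a minimal sketch of random-effects pooling. It assumes the DerSimonian-Laird estimator of the between-class variance, which is a common choice but is not confirmed here as the estimator used in the study. The function name `random_effects_pool` and the input values are hypothetical placeholders, not results from our data.

```python
def random_effects_pool(effects, variances):
    """Pool per-class effect sizes with random-effects weights.

    ASSUMPTION: DerSimonian-Laird estimator for the between-class
    variance tau^2; the study's exact estimator may differ.
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each class variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2, w_star


# Hypothetical per-class correlations and their (e.g. QAP-based) variances.
pooled, se, tau2, weights = random_effects_pool(
    effects=[0.10, 0.02, 0.07], variances=[0.01, 0.02, 0.04]
)
print(pooled, se, tau2)
```

Classes with small variance receive high weight, which is consistent with effect sizes being less spread for classes of high weight.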
We have included five years of data to get a sizable dataset that can provide a stable measure not too dependent on a particular year, while at the same time excluding comparisons so many years apart that school conditions may have changed considerably (see e.g. [74]). Table 4 presents within- and between-class correlations for analyses of choices based on country of origin, and based on individual years and pairs of years, respectively. Note that the samples are fairly small (approximately 250 immigrant students per year).
If school conditions influencing the between-class effect change rapidly, then we would expect a clear pattern where the between-class coefficient drops off quickly the further apart the years in the between-class comparison are. Such effects could include a restructuring of specialisations in large schools or special targeting towards students of a certain background, for example through teachers involved during a short time period. The year 2006 potentially hints at a decreasing effect (though only the between-class comparison to 2008 is significant, while that to 2007 is not). For the other years there is no such clear pattern, especially when taking level of significance into account.
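One simple way to look for such a drop-off is to average the between-class coefficients by the number of years separating the two cohorts. The sketch below assumes the coefficients are available as a mapping from year pairs to values; the function name `mean_by_year_gap` and the numbers are hypothetical placeholders and are not the values in Table 4.

```python
from collections import defaultdict


def mean_by_year_gap(pair_coefficients):
    """Average between-class coefficients by the gap in years.

    pair_coefficients maps (year_a, year_b) to a coefficient.
    """
    groups = defaultdict(list)
    for (year_a, year_b), rho in pair_coefficients.items():
        groups[abs(year_a - year_b)].append(rho)
    # A rapid decline across increasing gaps would suggest fast-changing
    # school conditions; a flat profile would not.
    return {gap: sum(v) / len(v) for gap, v in sorted(groups.items())}


# Placeholder values for illustration only.
example = {(2006, 2007): 0.04, (2007, 2008): 0.06, (2006, 2008): 0.02}
print(mean_by_year_gap(example))
```

Such averages ignore the significance of the individual coefficients, so they can only complement, not replace, the comparison against the simulated distributions.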
There is no clear trend for the within-class correlation coefficients. We do not know of any explanations beyond random variation for the drop in 2009.

Alternative designs
In the present analyses, the graphs studied include only immigrant students, and thus, the graph structures we control for do account for statistical dependencies, but since these are only subsets of the entire classes, we do not use all the data available. The reason for this is that natives constitute a large majority of the data, so including them in the effect measure would lead to an estimate mainly measuring links of natives versus links of a native and an immigrant. It is possible, however, to include natives when computing the variances used for weighting classes, but at the same time including only immigrants when computing the correlation coefficient. In practice, this means that we first include the entire class. Then we randomly permute nodes (as in QAP), and remove the natives from the graphs only before computing the correlation coefficient.
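The procedure can be sketched as follows for a single class. This is a minimal illustration under our reading of the description above: choices are permuted over all students in the class, natives are then dropped, and the correlation between "same origin" and "same choice" is computed over immigrant pairs only. The function names, the data layout and the example values are assumptions made for illustration.

```python
import random
from itertools import combinations
from statistics import pvariance


def pair_correlation(origins, choices):
    """Pearson correlation between same-origin and same-choice
    indicators over all pairs of students; None if undefined."""
    xs, ys = [], []
    for i, j in combinations(range(len(origins)), 2):
        xs.append(1.0 if origins[i] == origins[j] else 0.0)
        ys.append(1.0 if choices[i] == choices[j] else 0.0)
    n = len(xs)
    if n == 0:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # no variation in one of the indicators
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def qap_null_with_natives(origins, choices, is_immigrant, n_perm=500, seed=0):
    """Observed immigrant-only correlation and a permutation null in
    which choices are shuffled over the whole class, natives included."""
    rng = random.Random(seed)
    imm = [k for k, flag in enumerate(is_immigrant) if flag]
    imm_origins = [origins[k] for k in imm]
    observed = pair_correlation(imm_origins, [choices[k] for k in imm])
    null = []
    for _ in range(n_perm):
        shuffled = list(choices)
        rng.shuffle(shuffled)  # permute nodes over the entire class
        # Natives are removed only now, before computing the correlation.
        r = pair_correlation(imm_origins, [shuffled[k] for k in imm])
        if r is not None:
            null.append(r)
    return observed, null


# Hypothetical class: four immigrants from two countries, six natives.
origins = ["A", "A", "B", "B"] + ["NO"] * 6
choices = ["N", "N", "S", "S", "N", "S", "O", "N", "S", "O"]
is_immigrant = [True] * 4 + [False] * 6
observed, null = qap_null_with_natives(origins, choices, is_immigrant)
print(observed, pvariance(null))
```

The variance of the permutation distribution could then serve as the class variance when weighting classes, while the effect measure itself is based on immigrant pairs only.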
Using this design for our main study on country of origin produced similar results. The within-class measure is ρ_w = 0.063 (p < 0.001), the between-class measure is ρ_b = 0.024 (p < 0.001), and the difference is ρ_w − ρ_b = 0.039 (p < 0.01). Making within-class comparisons of immigrants puts higher restrictions on the ethnic compositions in classes than making between-class comparisons, since, in order to compute correlations, there needs to be at least one pair of students from the same country. As a result, there are more students included in the between-class analyses. To check whether data selection might influence the measures, we performed an analysis on the subset of students included in the within-class design. This produced almost exactly the same between-class measure as before, ρ_b = 0.019 (p = 0.019, N = 916). Note that the smaller N gives a higher p-value with the same correlation coefficient. Also, not every student is included in the analysis, since the restricted dataset does not always make it possible to do cross-year comparisons.
In conclusion, the alternative designs produce qualitatively similar results.

Robustness to diversity between classes
Including all school classes in the analysis without restrictions, we have large diversity between classes with respect to available educational choices and overall popularity of each choice. Also, small classes are more homogeneous, in that a vast majority of the students make the same choice, probably due to a limited availability of different specialisation tracks. Our analytical design, utilising QAP and random-effects meta-analysis, does allow for diversity between classes, and weights are adjusted accordingly. Nevertheless, combining these approaches is a novel design not well documented in the literature. As a robustness check, we therefore repeated the analyses restricted to large schools, specifically to classes/educational cohorts within the same schools that had at least 50 students (75% of all classes in our analysis), including only students going to specialisation tracks with at least 10 students enrolled. The rationale for this threshold is that smaller classes are often more homogeneous within the group, with a large majority of students making the same choice, and vary more in the diversity of choices between the classes, thus making the classes less comparable. As can be seen in Fig 7, which shows similarity in choices by class size, students in smaller classes often have only one choice (classes with 100% similarity). The results are similar to those obtained when including all classes. The major difference is that a few small effects from the previous studies are no longer significant. Such a change need not be ascribed to structural differences in classes when including or excluding small classes, but is also consistent with the fact that the sample is smaller in the latter case. Overall, the results change only slightly.
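The restriction can be sketched as a simple filter. The record layout (`class_id`, `track`) and the function name are hypothetical, and the sketch assumes that class size is counted before small tracks are dropped and that track enrolment is counted within each class; the text does not settle either point.

```python
from collections import Counter


def restrict_to_large_classes(students, min_class_size=50, min_track_size=10):
    """Keep students in classes with at least min_class_size students
    who attend a track with at least min_track_size students.

    ASSUMPTION: track enrolment is counted within each class.
    """
    class_sizes = Counter(s["class_id"] for s in students)
    track_sizes = Counter((s["class_id"], s["track"]) for s in students)
    return [
        s
        for s in students
        if class_sizes[s["class_id"]] >= min_class_size
        and track_sizes[(s["class_id"], s["track"])] >= min_track_size
    ]


# Hypothetical data: one large class with three tracks, one small class.
students = (
    [{"class_id": "a", "track": "N"}] * 45
    + [{"class_id": "a", "track": "S"}] * 10
    + [{"class_id": "a", "track": "O"}] * 5
    + [{"class_id": "b", "track": "N"}] * 20
)
kept = restrict_to_large_classes(students)
print(len(kept))
```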
The changes are that the ethnicity effects are no longer significant for men (ρ_w = 0.054, p = 0.091 and ρ_b = 0.026, p = 0.12 for immigrants, and ρ_w = 0.015, p = 0.12 and ρ_b = 0.0098, p = 0.10 for descendants), while the difference becomes significant for female descendants (ρ_w − ρ_b = 0.025, p = 0.035). The between-class coefficient with respect to income is no longer significant (ρ_b = 0.00098, p = 0.13), nor within classes for immigrants (ρ_w = 0.016, p = 0.064). Finally, the small differences with respect to sex, education and income for all students have changed their respective significance levels (with coefficients 0.0058, 0.0034 and 0.0063, and p-values 0, 0.041 and 0.0070, respectively).
It can be noted that most classes in our analysis are fairly large, since those are the ones with sufficient variation in the data. Of the 125 classes, 25% have fewer than 52 students, 50% have fewer than 98 and 75% fewer than 138. The mean is 97 students. It is possible that in the largest cohorts not all students have had opportunities to interact with everyone, which in those cases should make the test more conservative and bring the effect closer to the between-class coefficient.