Developing SENSES: Student experience of non-shared environment scales

Twin and adoption studies find that non-shared environmental (NSE) factors account for variance in most behavioural traits and offer an explanation for why genetically identical individuals differ. Using data from a qualitative hypothesis-generating study we designed a quantitative measure of pupils’ non-shared experiences at the end of formal compulsory education (SENSES: Student Experiences of Non-Shared Environment Scales). In Study 1 SENSES was administered to n = 117 16–19 year old twin pairs. Exploratory Factor Analysis yielded a 49-item 10 factor solution which explained 63% of the variance in responses. SENSES showed good internal consistency and convergent and divergent validity. In Study 2 this factor structure was confirmed with data from n = 926 twin pairs and external validity was demonstrated via significant correlations between 9 SENSES factors and both public examination performance and life satisfaction. These studies lend preliminary support to SENSES but further research is required to confirm its psychometric properties; to assess whether individual differences in SENSES are explained by NSE effects; and to explore whether SENSES explains variance in achievement and wellbeing.


Introduction
It has long been established by twin and adoption studies that non-shared environmental (NSE) effects explain variance in most behavioural and psychological traits, particularly after the preschool years e.g. [1,2,3]. NSE effects are those that make siblings brought up together differ from each other. They are uncorrelated with genetic effects and, for this reason, represent potentially interesting targets for intervention. However, it has proved almost as difficult to identify the specific experiences that can explain NSE variance as it has been to identify the specific genes that explain genetic variance [4]. We have both a 'missing heritability' and a 'missing environments' problem [5,6]. The current study was motivated by a need to identify measured environments that can explain variance attributable to NSE in educationally relevant behaviour, such as achievement and wellbeing, with a view to possible intervention.
It has been difficult to identify NSE experiences because NSE effects, like genetic effects, tend to be many, small and involved in dynamic relationships with genes and other experiences. A further difficulty is that NSE effects include measurement error and one possibility is that they represent nothing but measurement error. However, this seems unlikely as some specific NSE effects have been found to explain small proportions of variance and, indeed, to show a degree of stability over time [7]. The largest body of research in this area has focused on parenting and has found effects that explain small proportions of variance [8] and which, in some studies, explain more variance at the extremes [9]. However, it remains possible that the measures of non-shared environment used in prior research do not explain much NSE variance because they do not accurately measure individuals' experiences. It is important, therefore, to take a closer look at students' experiences of the world in which they learn, and their perceptions of those experiences.
To that end, the current study was preceded by a qualitative hypothesis-generating MZ twin differences study designed to explore in detail the non-shared experiences of young people preparing for the public examinations (General Certificates of Secondary Education: GCSEs) taken by most UK pupils at age 16 [10,11]. The focus was on MZ twins because behavioural or psychological differences between MZ twins cannot be explained by shared genes (because they can be assumed to have identical genotypes, albeit with a small chance of mutation) or shared environmental effects, and must therefore be explained by NSE effects, including measurement error. By asking MZ twins about differences between them in educationally relevant behaviour we were able to develop testable hypotheses about potential NSE influences at a transitional time, the point at which UK pupils make choices about further education, training and employment. The educationally-relevant traits we focused on were achievement and wellbeing (life satisfaction) as these are particularly salient variables at a time when young people are making important, potentially life-changing, decisions about their next steps in education or employment. Emerging hypotheses related to factors including perceived differences in teacher quality, teacher-pupil relationships and individual effort. Although psychology has already identified such factors as important correlates of achievement, this study was novel in suggesting that they may explain NSE variance specifically. The current study was designed to develop a quantitative measure that would make it possible to test these genetically-informed hypotheses.
We know that NSE factors, including measurement error, explain one-fifth of the variance in GCSE performance [12,13]. We therefore expected that families' explanations of why one twin performed better than the other in their examinations could explain a maximum of 20% of variation in exam performance, and probably considerably less given that measurement error is likely to play a significant role [14]. The study also focused on participants' well-being and we know that there is more NSE variance to be explained here. In a recent meta-analysis, for instance, Bartels [15] found that genetic effects explained 32% of the variance in self-reported life satisfaction and 36% of the variance in feelings of well-being. The remaining variance in both types of measure was explained by non-shared environmental factors. Identifying NSE influences on well-being therefore represents an important challenge.
The current research had two main aims: (1) to design a measure that reflected qualitative accounts of NSE experience at this transitional time; and (2) to assess the factor structure, reliability and validity of this newly developed measure. We aimed to make a useful contribution to research in this area by developing a measure that can explain a proportion of environmental variance in educationally relevant traits in late adolescence. Because NSE effects are uncorrelated with genetic effects such a measure may feasibly form a useful basis for environmental intervention in the future.
Data availability. There are data governance issues stipulated by the Ethics Review Board and Executive Committee of the TEDS study, which require a sensitive handling of the data for the primary study analysis purposes to which the study participants originally consented. In order to ensure that the participants continue their longitudinal participation, we need to be sensitive to these governance issues. For these reasons we cannot publicly share our data. However, researchers can contact the TEDS PI, Professor Robert Plomin, to request data access at robert.plomin@kcl.ac.uk. TEDS has openly provided data for re-analyses and meta-analyses in the past, but has done so in accordance with the confidentiality stipulations set by the Ethics Review Board and the Executive Committee, which require the governance of this process to rest with the TEDS principal investigator.

Study 1
Aims
The aims of Study 1 were threefold:
■ To develop an item pool based on data collected in an earlier qualitative phase of the project.
■ To reduce the number of items needed to measure experiences that may explain NSE variance in educationally-relevant outcomes.
■ To extract underlying factors and assess reliability.

Method
Participants. Participants were drawn from the UK Twins' Early Development Study (TEDS). TEDS is an on-going longitudinal study of three cohorts of twins born in 1994, 1995 and 1996 [16]. The TEDS sample has been shown to be reasonably representative of the UK population of same-age adolescents and their parents [16,17]. We invited 300 twin pairs to take part and gathered data from n = 115 pairs and 2 unpaired twins (n = 117) who provided informed consent: 58 dizygotic pairs (62% female) and 57 monozygotic pairs (58% female) plus one dizygotic male twin and one monozygotic female twin. Twins were provided with a detailed information sheet and questionnaire completion indicated their consent to participate. Data were subsequently received from a further 6 pairs, but too late to be incorporated into analyses. Participants' ages ranged from 16 to 19 (M = 18.28).
Measure development and procedure. In an earlier phase of this project we gathered qualitative questionnaire data from n = 497 pairs of MZ twins (61% female) and interview data from n = 95 of these pairs [10,11]. These twin pairs were all participants in TEDS and were all aged between 16 and 19 (M = 17.3). Both questionnaires and interviews asked MZ pairs, and one parent from each family, to describe and explain differences between them in a range of traits including GCSE achievement and wellbeing. This represented an attempt to generate new hypotheses about NSE influences on young people approaching the end of their formal compulsory education.
We drew on this rich dataset to build up an item bank for the current study. We prepared 175 draft items and revised them after conducting a small feasibility study with n = 6 young people aged 16–19. We then administered the items to our Study 1 sample of n = 117 twin pairs (n = 234 individuals). This initial questionnaire aimed to comprehensively represent the breadth and depth of our qualitative data and was therefore very long. We engaged in a process of extensive data reduction once the data were collected.
We organised our data into two related samples, Sample 1 and Sample 2, with one twin from each pair (randomly selected) represented in each. We identified items that could be excluded on the basis of data from both samples. More specifically, items were excluded for the following reasons: This initial data reduction process allowed us to discard 93 items, leaving 82. We then conducted Principal Component Analysis (PCA) with Sample 1 for the sole purpose of further data reduction, that is, not for factor extraction [18]. PCA suggested the exclusion of 33 further items on grounds of cross-loadings, that is, items having a loading of 0.4 or higher on more than one component [19] and also having a lower than 0.2 loading difference between the primary and alternative factors [20]; or the clustering of fewer than three items i.e. too few to constitute a viable factor [21]. This data reduction process left us with 49 items with which to measure NSE influences on young people preparing to leave school (the PCA process suggested using 48 items and the reasons for retaining 49 are discussed later). All items used a 5-point Likert response scale ranging from 1 = 'not at all true' to 5 = 'very true'. These 49 items make up the SENSES measure.
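The cross-loading rule described above can be illustrated with a short sketch. This is not the authors' analysis code: the function name, the toy loadings and the use of absolute loadings are assumptions made for illustration, and only the 0.4 and 0.2 thresholds come from the text.

```python
def flag_cross_loading_items(loadings, salient=0.40, min_gap=0.20):
    """Return the indices of items that meet the cross-loading rule.

    An item is flagged when it has an absolute loading of at least
    `salient` on more than one component AND the difference between its
    two largest absolute loadings is smaller than `min_gap`.
    """
    flagged = []
    for index, row in enumerate(loadings):
        magnitudes = sorted((abs(value) for value in row), reverse=True)
        n_salient = sum(1 for m in magnitudes if m >= salient)
        gap = magnitudes[0] - magnitudes[1]
        if n_salient > 1 and gap < min_gap:
            flagged.append(index)
    return flagged


# Hypothetical loadings for 4 items on 3 components (not study data).
example_loadings = [
    [0.72, 0.10, 0.05],  # one clear primary loading: kept
    [0.55, 0.45, 0.02],  # two salient loadings, gap of 0.10: flagged
    [0.80, 0.41, 0.03],  # two salient loadings, gap of 0.39: kept
    [0.30, 0.25, 0.10],  # no salient loading: not a cross-loading case
]
flagged_items = flag_cross_loading_items(example_loadings)
```

In this toy example only the second item is flagged. Items with no salient loading at all would need to be handled by a separate exclusion rule.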
Analysis. Sample 2 data were used for the purposes of Exploratory Factor Analysis (EFA). Principal Axis Factoring extraction and promax rotation (kappa set at 4) were used to extract factors. Principal Axis Factoring is the most widely used method of factor analysis in the social sciences [22] and oblique rotation methods were suggested because some factors were expected to be correlated [23].

Results
We began by assessing the suitability of our data for factor analysis. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was .72 and therefore higher than the suggested cut-off point of .60 [24]. Furthermore, Bartlett's test of sphericity, χ²(1176) = 4268.80, p < .05, showed that the correlation matrix was not an identity matrix and was appropriate for factor analysis. We therefore proceeded to EFA.
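As a rough illustration of the sphericity test reported above, the sketch below computes Bartlett's statistic from a correlation matrix using the standard formula, χ² = -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom. The helper names and the 3 × 3 toy matrix are assumptions for illustration, not study data; the final line checks that 49 items give the 1176 degrees of freedom reported in the text.

```python
import math


def log_det_spd(matrix):
    """Log-determinant of a symmetric positive-definite matrix (Cholesky)."""
    size = len(matrix)
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return 2.0 * sum(math.log(lower[i][i]) for i in range(size))


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    statistic = -((n_obs - 1) - (2 * p + 5) / 6.0) * log_det_spd(corr)
    degrees_of_freedom = p * (p - 1) // 2
    return statistic, degrees_of_freedom


# Toy correlation matrix: three variables, all inter-correlated at .50.
toy_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
statistic, degrees_of_freedom = bartlett_sphericity(toy_corr, n_obs=117)

# With 49 items the test has 49 * 48 / 2 = 1176 degrees of freedom.
senses_df = 49 * (49 - 1) // 2
```

The statistic grows as the determinant of the correlation matrix moves away from 1, that is, as the variables become more inter-correlated and the matrix departs further from an identity matrix.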
In order to identify the underlying factor structure we considered both Kaiser's eigenvalues >1 and scree plots of the data [25]. In both cases they yielded 9 factors. However, an eigenvalue of 0.935 for a tenth factor led to the decision to explore both 9 and 10 factor solutions. The research team ultimately decided that the 10 factor solution was conceptually more reasonable and therefore took the 10 factor solution forward to the main study. The 9 factor solution joined perceptions of self in Maths and Science in a single factor whereas the 10 factor solution distinguished between perceptions relating to the three core GCSE subjects. The 10 factor solution explained 63% of the variance in participants' scores and can be seen in Table 1.
It is interesting to note that while some of these factors relate to experiences that can relatively easily be classified as 'environmental', such as teachers, social media and the influence of family and work experience, others describe aspects of behaviour such as effort and confidence that would not typically be considered as environmental influences. However, it is important to note that behavioural discordance in MZ twins must have non-shared environmental origins, that is, if one MZ twin is more confident or hard-working than the other this has to be for environmental reasons, and that such discordant behaviour could have NSE effects. For this reason, we retained both types of factor and both types of item. In some cases it is difficult to draw a clear line between experience and behaviour. For instance, does an individual's perception or interpretation of an event, such as an interaction with a teacher, represent environment or behaviour? Our focus here was on individual experiences (including perceptions) that can explain NSE variance.
Table 1 also shows a sample item for each factor, the number of items representing each factor, the proportion of variance explained and Cronbach's alpha. It can be seen that all factors showed good internal consistency with Cronbach's alphas ranging from .76 to .92. The English (perceptions of self and teacher) factor explained the largest amount of variance (19.15%). The combined variance explained by the split Science and Maths factors was a little lower (11.85% for Science and 11% for Maths). Items relating to Maths and Science did not combine in a single factor as they did for English. Effort during GCSE courses accounted for a similar amount of variance (9.56%). Factors relating to social media and to future plans explained smaller proportions of variance (<4%). The factor pattern coefficients in Table 2 show that all of the items had factor loadings above .40 [25].
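For readers less familiar with the internal consistency index reported above, the following minimal sketch computes Cronbach's alpha from item-level responses using the usual formula, α = k/(k - 1) × (1 - Σ item variances / variance of the total score). The function name and the five hypothetical respondents are assumptions for illustration; they are not SENSES data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([float(row[j]) for row in item_scores]) for j in range(k)
    ]
    # Sample variance of each respondent's total (summed) score.
    total_variance = variance([float(sum(row)) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from 5 people to a 4-item scale (1-5 Likert format).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(responses)
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is read as evidence that a set of items measures a common construct.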
The 49th item (Item 5.5: 'I was interested in what we were studying in maths') was found to have close loadings (differences between primary and alternative factor loadings were less than .20) on 3 different factors (.44 on maths self-perception but also .34 on perceptions of maths teacher and .31 on science self-perception). However, it was noted that the equivalent items for English and Science loaded only on the appropriate factors i.e. perceptions of self in English and Science. In order to maintain consistency of items across the three subject domains it was decided to take the 49th item forward to Study 2 for further testing. It has been suggested that the consequences of over-factoring are usually less marked than the consequences of under-factoring [18] and this approach allowed us to test the items and the factor structure with a larger sample, yielding more trustworthy results. Furthermore, it is generally agreed that substantive considerations should be taken into account alongside statistical considerations in EFA [26].

Discussion
The 10 factors yielded by our multi-stage process were conceptually reasonable, representative of the qualitative data gathered in an earlier phase of this project and they showed good internal reliability. The scales showed both convergent item validity (high factor loadings on a relevant scale) and, with one exception, divergent item validity (low factor loadings on other scales).
The major limitation of this pilot study was that the ratio of items to participants was too great (82 items for PCA and 117 participants i.e. one twin per pair in each sample). However, subsequent analysis found that our data were suitable for factor analysis. A further issue to consider in Study 2, where this problem was eliminated, is that the decision to proceed with a 10 factor rather than a 9 factor model was made on conceptual rather than statistical grounds. We need to ask whether the 49-item, 10 factor structure is confirmed. These issues are revisited in Study 2 and in the General Discussion.

Study 2
Aims
■ Conduct Confirmatory Factor Analysis (CFA) to evaluate the construct validity of SENSES.
■ Explore external validity via correlations with GCSE performance and self-reported life satisfaction.

Method
Participants. Participants for Study 2 were also drawn from the Twins' Early Development Study (TEDS) [16,17]. We invited twins in 2165 families to participate and received SENSES and life-satisfaction data from n = 926 families (53% MZ). Twins were provided with a detailed information sheet and questionnaire completion implied consent. This approach was approved by our institutional ethics committee. In 908 cases we received data from both twins in the pair and in 18 cases, only from one. Data were gathered, therefore, from n = 1834 individuals (Mean age = 18.4; Range = 17 to 19; 61.6% female). Of these, n = 1672 participants had previously provided us with academic achievement data and, in all but 3 cases, this included General Certificate of Secondary Education (GCSE) data. In the remaining three cases alternative examinations to GCSE were taken. The sample was not fully representative of the UK population, or of the original TEDS sample. The relatively increased proportion of girls (from close to 50% at first contact) is broadly representative of TEDS data at age 16, but not of the UK population. This discrepancy may be the result of a greater willingness to engage with data collection among girls than boys at this age. Furthermore, standardized SES was higher in this sample than in the full TEDS sample (M = 0.31), and, more surprisingly, standardized g scores (measured at age 12) were slightly lower (M = -0.12). These discrepancies may be due to sample selection effects.
Measures. The 49 item SENSES measure developed in Study 1 was administered to Study 2 participants (n = 926 twin pairs). Data had previously been gathered on GCSE results and were also gathered on self-perceived life satisfaction using a well-validated five item measure of global life satisfaction [27]. Items included 'In most ways my life is close to my ideal' and 'If I could live my life over, I would change almost nothing' and used a 5 point response scale from 1 = Strongly disagree through to 5 = Strongly agree. In the current study α = 0.86.

Procedure. Questionnaires were posted to twin pairs along with an information sheet and separate envelopes for individual questionnaires so that twins could retain their privacy. Because the twins had already completed their GCSEs they were specifically asked to think back to Year 10 and 11 (when they were taking GCSE courses) when responding to items. Twins who returned completed questionnaires received a £5 gift voucher each and were entered into a prize draw with a chance of winning a pair of iPad minis.
Analysis. Confirmatory Factor Analysis (CFA) was conducted using the maximum likelihood estimation method in LISREL 8.80 [28] with SIMPLIS command language. One twin per pair was randomly selected for CFA and invariance analysis was conducted with this sample and the co-twin sample. Correlations between SENSES and our measures of achievement and life satisfaction were computed. All analyses were replicated with the co-twin sample.
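The one-twin-per-pair sampling described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure actually used: the function name, the fixed seed and the placeholder twin identifiers are all hypothetical.

```python
import random


def split_twin_pairs(pairs, seed=0):
    """Randomly assign one twin from each pair to each of two samples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    selected_sample, co_twin_sample = [], []
    for twin_a, twin_b in pairs:
        # Flip a fair coin to decide which twin goes into which sample.
        if rng.random() < 0.5:
            twin_a, twin_b = twin_b, twin_a
        selected_sample.append(twin_a)
        co_twin_sample.append(twin_b)
    return selected_sample, co_twin_sample


# Placeholder identifiers for five hypothetical families.
pairs = [(f"fam{i}_twin1", f"fam{i}_twin2") for i in range(1, 6)]
selected_sample, co_twin_sample = split_twin_pairs(pairs)
```

Splitting in this way keeps observations independent within each sample, because no family contributes more than one individual to either group, while the co-twin sample offers a closely matched replication sample.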

Results
CFA found an acceptable model fit to the data (χ²(1082) = 4992.29, p < .05; CFI = .93; NFI = .92; SRMR = .053; RMSEA = .071; 90% CI = .070, .073) [29,30]. All items loaded significantly on their intended factors and standardized parameter estimates (Lambda X) ranged from .49 to .92 (See Table 3). Here we want to emphasize that the 49th item (i.e. I was interested in what we were studying in Maths), which we retained in the interests of retaining consistency across domains, had a sufficiently high factor loading of .77. Therefore, this analysis with a larger sample supported our decision to retain this item.
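The degrees of freedom reported above are consistent with a simple-structure model, which the short sketch below makes explicit. It assumes (the text does not state this directly) that each of the 49 items loads on exactly one of the 10 factors, that error terms are uncorrelated, that all factor covariances are freely estimated and that each factor's scale is fixed; the function name is hypothetical.

```python
def cfa_degrees_of_freedom(n_items, n_factors):
    """Degrees of freedom for a simple-structure CFA with correlated factors."""
    # Unique variances and covariances among the observed items.
    observed_moments = n_items * (n_items + 1) // 2
    free_loadings = n_items            # one loading per item
    free_error_variances = n_items     # one error variance per item
    free_factor_covariances = n_factors * (n_factors - 1) // 2
    free_parameters = (
        free_loadings + free_error_variances + free_factor_covariances
    )
    return observed_moments - free_parameters


model_df = cfa_degrees_of_freedom(n_items=49, n_factors=10)
```

Under these assumptions the model has 1225 observed moments and 143 free parameters, giving the 1082 degrees of freedom reported for the chi-square test.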
We looked at correlations between our 10 factors and found an average correlation of r = 0.16 (range = .01 to .69). This suggested that most factors showed low or no levels of correlation and can, therefore, be reasonably considered to be measuring different things (See Table 4).
Three correlations were exceptions to this pattern. The correlation between SCIENCE 1 (Perceptions of Self) and SCIENCE 2 (Perceptions of Teacher) was r = .69, p < .05. A similar pattern was also observed for Maths in that MATHS 1 and MATHS 2 correlated r = .64, p < .01. Finally, self-perceptions in science (SCIENCE 1) also correlated r = 0.46, p < .05 with MATHS 2 (perceptions of Maths teacher). These correlations and their implications for the SENSES measure are discussed later. Table 5 shows means, standard deviations and reliability coefficients for the 10 factors.
We also looked at factorial invariance across the two samples (one twin per pair in each sample group) in order to cross-validate the 10 factor model. This was achieved by fitting five multi-group CFA models. Firstly, a baseline model was tested in order to examine whether both samples conceptualised the constructs in a similar manner. In the remaining models factor loadings, factor variance, factor covariance and variance of error terms were constrained to be equal across the two samples and invariance between the models was compared (See Table 6). Comparisons of each model found that chi-squared differences were non-significant. Furthermore, ΔCFI values between constrained and unconstrained models were less than .01, as recommended [31]. In summary, invariance testing supported the factorial invariance of SENSES across our two related samples.
External validity. In order to assess the external validity of the SENSES measure we looked at correlations between the 10 factors and both GCSE achievement in English, Maths and Science and self-reported life satisfaction (See Table 7).
GCSE achievement. In general, the domain specific factors (Perceptions of Self and Teacher in English, Maths and Science) were significant correlates of GCSE achievement in English, Maths and Science respectively. The ENGLISH factor correlated r = .39, p < .001 with GCSE English but only r = .11, p < .01 with Maths and r = .10, p < .01 with Science. Likewise, SCIENCE factors correlated more strongly with Science achievement than with English or Maths achievement, and MATHS factors correlated more strongly with Maths achievement. However, the remaining four factors yielded few significant correlations with GCSE achievement, and those that did achieve statistical significance ranged from r = -0.07, p < .05 for the correlation between self-confidence about the future (PLANS 2) and Maths achievement, to r = -.11, p < .01 for the correlation between Social Media and Science achievement.
Life satisfaction. Correlations between the SENSES factors and our measure of life satisfaction were, with one exception, statistically significant but mainly weak, ranging from r = .05 (NS) for PLANS 1 (family influence) and r = .08, p < .05 for PLANS 3 (work experience) through to a moderate correlation of r = .48, p < .001 for PLANS 2 (self-confidence about the future). The average correlation was r = .13.
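To see why correlations as small as r = .08 reach significance here while r = .05 does not, the sketch below converts a correlation into a two-sided p-value. It rests on two assumptions that are not taken from the text: that each correlation is based on roughly n = 926 individuals (one twin per pair with no missing data), and that a normal approximation to the t statistic is adequate at this sample size. The function name is hypothetical.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p-value for testing that a correlation is zero.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) and treats t as standard
    normal, which is a reasonable approximation when n is large.
    """
    t_statistic = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return math.erfc(abs(t_statistic) / math.sqrt(2))


ASSUMED_N = 926  # assumption: one twin per pair, complete data

p_plans_1 = correlation_p_value(0.05, ASSUMED_N)  # family influence
p_plans_3 = correlation_p_value(0.08, ASSUMED_N)  # work experience
```

Under these assumptions the first p-value falls above .05 and the second below it, in line with the pattern described above. With samples of this size, statistical significance says little about the practical importance of an association, so effect sizes deserve more attention than p-values.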

Discussion
Study 2 did not have the principal limitations of Study 1 in that we administered a questionnaire with fewer items (49 compared with 82) to a larger sample (n = 926 twin pairs compared with n = 117 twin pairs). It was therefore pleasing to note that CFA confirmed the factor structure that emerged from Study 1, and justified our decisions to use a 10 factor structure (rather than a 9 factor structure) and to retain 49 rather than 48 items. It was also seen that all 10 of the SENSES factors retained good levels of internal reliability. Furthermore, invariance testing cross-validated the model across our two related samples. One area of concern was that although most factors correlated at or below approximately r = .3 there were three exceptions. The two Maths factors correlated r = .64; the two Science factors correlated r = .69 and self-perceptions in Science correlated r = .46 with perceptions of teacher in Maths. We also know that the same variables in relation to English did not split into two separate factors and the pattern is therefore inconsistent across the three GCSE subjects. This raises the possibility that either English should be split into two factors or that the currently separate but correlated Maths factors should be joined (also true for Science). It should remain a consideration that in the 9 factor structure suggested by EFA self-perceptions in Maths and Science were found to cluster on a single factor. This issue can only be satisfactorily resolved through further testing in different samples. We will not be able to reasonably claim the SENSES instrument is robust until we have tested it in more populations. However, in the meantime, it can be considered positive that ENGLISH factors correlated most strongly with English achievement, MATHS factors with Maths achievement and SCIENCE factors with Science achievement, suggesting external validity for the existing sub-scales.

General discussion
The psychometric properties of the SENSES measure appear promising and indicate that it, or sub-scales from it, can make a useful contribution to research. However, some issues remain to be resolved in future research with different samples. Only by conducting further validation research will we gain confidence that we have identified the optimal factor structure. As the measure stands, domain specific factors show moderate correlations with domain specific GCSE achievement; and our measure of self-confidence about the future (PLANS 2) shows a moderate association with self-reported life satisfaction in late adolescence.
Four of the SENSES factors did not correlate with either GCSE achievement or life satisfaction. However, items were developed on the basis of discordance in a wider range of educationally relevant traits than this, and it is possible that these four factors could correlate with, for example, measures of occupational success, vocational interests or peer relationships. When undertaking further validation work with the SENSES measure it will be important to explore relationships with other variables.
It is important to note that some of the SENSES factors target traits such as effort or beliefs such as self-confidence about the future rather than environments per se. The suggestion from the qualitative data is that these factors differ between monozygotic twins and lead to non-shared outcomes. However, this discordance remains to be explained by differences in experience.

Limitations
We have mentioned the limitation that while we gathered data on a wide range of traits in order to develop the SENSES measure we were only able to test it in relation to GCSE achievement and self-reported global life satisfaction. It would be interesting to explore relationships between SENSES factors and other variables, not measured in the current study, such as personality, peer relationships and mental health status. This can be achieved as we continue to test the validity of the SENSES measure as a whole, and individual sub-scales from it. Our study is also limited by a cross-sectional design that cannot speak to direction of effects or identify reverse causation. Furthermore, we have not yet established whether SENSES can do what it aims to do, that is, to explain NSE variance in outcomes including GCSE achievement and life satisfaction.
A particularly major limitation of this research is that it relied on retrospective data. Specifically, participants already knew their GCSE results when they provided data about their learning experiences during the GCSE course. This may have coloured their view of the GCSE experience in either positive or negative ways. Testing the SENSES measure with a sample of 14-16 year old UK pupils could address this concern.

Future research
The top priority for future research has to be reliability and validity testing of the SENSES measure in different populations. This will help us to address remaining concerns about whether we have identified the optimal factor structure. Beyond that, longitudinal work is needed if we are to begin to be able to understand the direction of any effects and to test for reverse causation i.e. the possibility that discordant GCSE results or life satisfaction are the precursor to discordant experiences, rather than the other way around. Finally, it will be important to assess whether SENSES factors can actually explain NSE variance in educationally relevant variables, and whether associations are mediated by genetic, shared or non-shared environmental effects (using multivariate twin analyses) and this research is already underway using the current sample.