Sleep problems are both symptoms of and modifiable risk factors for many psychiatric disorders. Wrist-worn accelerometers enable objective measurement of sleep at scale. Here, we aimed to examine the association of accelerometer-derived sleep measures with psychiatric diagnoses and polygenic risk scores in a large community-based cohort.
Methods and findings
In this post hoc cross-sectional analysis of the UK Biobank cohort, 10 interpretable sleep measures—bedtime, wake-up time, sleep duration, wake after sleep onset, sleep efficiency, number of awakenings, duration of longest sleep bout, number of naps, and variability in bedtime and sleep duration—were derived from 7-day accelerometry recordings across 89,205 participants (aged 43 to 79, 56% female, 97% self-reported white) taken between 2013 and 2015. These measures were examined for association with lifetime inpatient diagnoses of major depressive disorder, anxiety disorders, bipolar disorder/mania, and schizophrenia spectrum disorders from any time before the date of accelerometry, as well as polygenic risk scores for major depression, bipolar disorder, and schizophrenia. Covariates consisted of age and season at the time of the accelerometry recording, sex, Townsend deprivation index (an indicator of socioeconomic status), and the top 10 genotype principal components. We found that sleep pattern differences were ubiquitous across diagnoses: each diagnosis was associated with a median of 8.5 of the 10 accelerometer-derived sleep measures, with measures of sleep quality (for instance, sleep efficiency) generally more affected than mere sleep duration. Effect sizes were generally small: for instance, the largest magnitude effect size across the 4 diagnoses was β = −0.11 (95% confidence interval −0.13 to −0.10, p = 3 × 10^−56, FDR = 6 × 10^−55) for the association between lifetime inpatient major depressive disorder diagnosis and sleep efficiency. Associations largely replicated across ancestries and sexes, and accelerometry-derived measures were concordant with self-reported sleep properties. Limitations include the use of accelerometer-based sleep measurement and the time lag between psychiatric diagnoses and accelerometry.
In this study, we observed that sleep pattern differences are a transdiagnostic feature of individuals with lifetime mental illness, suggesting that they should be considered regardless of diagnosis. Accelerometry provides a scalable way to objectively measure sleep properties in psychiatric clinical research and practice, even across tens of thousands of individuals.
Why was this study done?
- Sleep problems are both symptoms of and risk factors for many mental health conditions.
- This study aimed to determine how objectively measured sleep differs among individuals with lifetime psychiatric diagnoses.
What did the researchers do and find?
- This cohort study of 89,205 individuals from the UK Biobank analyzed 10 accelerometer-derived sleep measures.
- The study found a rich suite of associations with lifetime diagnoses of psychopathology and psychiatric polygenic risk scores, though effect sizes were generally small.
What do these findings mean?
- Sleep pattern differences are the norm among patients with lifetime psychiatric illness.
- Accelerometry provides a scalable way to objectively measure such differences in psychiatric research and practice.
- Limitations include the use of accelerometer-based sleep measurement and the time lag between psychiatric diagnoses and accelerometry.
Citation: Wainberg M, Jones SE, Beaupre LM, Hill SL, Felsky D, Rivas MA, et al. (2021) Association of accelerometer-derived sleep measures with lifetime psychiatric diagnoses: A cross-sectional study of 89,205 participants from the UK Biobank. PLoS Med 18(10): e1003782. https://doi.org/10.1371/journal.pmed.1003782
Academic Editor: Vikram Patel, Harvard Medical School, UNITED STATES
Received: May 4, 2021; Accepted: August 25, 2021; Published: October 12, 2021
Copyright: © 2021 Wainberg et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: De-identified data for the 10 accelerometer-derived sleep measures used in the study are available through the UK Biobank. The data are available to researchers through a procedure described at http://www.ukbiobank.ac.uk/using-the-resource/.
Funding: The authors acknowledge Milos Milic for data curation assistance. MW and SJT acknowledge support from the Kavli Foundation, Krembil Foundation, CAMH Discovery Fund, the McLaughlin Foundation, NSERC (RGPIN-2020-05834 and DGECR-2020-00048) and CIHR (NGN-171423). DF is supported by the Michael and Sonja Koerner Foundation New Scientist Program, Krembil Foundation, CAMH Discovery Fund, and the McLaughlin Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. This research was conducted under the auspices of UK Biobank application 61530, “Multimodal subtyping of mental illness across the adult lifespan through integration of multi-scale whole-person phenotypes”.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: M.A.R. is on the SAB of 54Gene, Related Sciences and scientific founder of Broadwing Bio and has advised BioMarin, Third Rock Ventures and MazeTx; the remaining authors declare no competing interests.
Abbreviations: FDR, false discovery rate; HDCZA, Heuristic algorithm looking at Distribution of Change in Z-Angle; ICD, International Classification of Diseases; LD, linkage disequilibrium; REM, rapid eye movement; WASO, wake after sleep onset
Sleep is fundamental to mental health. Poor sleep is not just a hallmark of psychiatric disorders, but can be a causal risk factor as well. Sleep interventions can lessen depression and posttraumatic stress disorder symptoms, prevent psychotic experiences [4,5], and improve psychological well-being and quality of life.
In psychiatry, sleep properties are often ascertained via self-report: for instance, self-reported sleep quality is a component of nearly every depression rating scale, including the HAM-D and Montgomery–Asberg. However, self-reported measures of sleep do not always correlate well with direct physiological measurements: prior work has found that a typical person may overestimate [9,10] or underestimate [11,12] their sleep duration by up to 75 minutes, relative to direct measurement. This divergence may be especially large among psychiatric patients: individuals with depression are less accurate at reporting sleep quality and duration than healthy controls. Thus, when studying sleep in a psychiatric context, objective measurement may be a useful complement to self-report. While lab-based polysomnography remains the gold standard for sleep measurement, it is ill-suited to long-term or home use, and spending a night in a sleep clinic with multiple electrodes attached to one’s body may not be conducive to a good night’s sleep. Wrist-based accelerometry (also called actigraphy) is a reasonably accurate and much more versatile and scalable alternative [14–19].
Historically, most accelerometry studies of sleep and mental illness have relied on highly selected samples of tens to hundreds of individuals. Recently, the UK Biobank collected 7-day accelerometry recordings from over 100,000 participants, providing an unprecedented opportunity to study the interplay between sleep and mental health across a broad cross-section of the community. Researchers have used this dataset to determine that circadian dysrhythmia is correlated with mood disorders and subjective well-being and genetically correlated with mood instability, and that insomnia, chronotype, sleep duration, and daytime sleepiness are genetically correlated with lifetime prevalence of several psychiatric disorders.
Yet despite recognition that insomnia and disturbed sleep are transdiagnostic processes [26,27] that cut across conventional diagnostic boundaries, the relationship between objectively measured sleep and mental health has rarely been studied from a transdiagnostic perspective—and even then, often only for a single sleep property at a time and in a small sample. To illustrate this research gap, we searched PubMed for studies of objectively measured sleep in a psychiatric context, using the search terms “sleep AND (polysomnography OR accelerometry OR actigraphy) AND (depression OR anxiety OR bipolar OR schizophrenia),” and identified 2,923 articles meeting these criteria. However, after narrowing our search criteria to studies considering all 4 disorders—“sleep AND (polysomnography OR accelerometry OR actigraphy) AND (depression AND anxiety AND bipolar AND schizophrenia)”—we identified only 4 articles: 2 reviews [28,29], a case series of 58 patients, and a cohort study of 110 patients also focused on sleep apnea.
Here, we address this research gap by performing an “all-by-all” analysis of sleep and mental health across 89,205 UK Biobank participants. Specifically, we investigate the associations of 10 sleep measures—including bedtime and wake-up time, sleep duration, number of awakenings, and variability in bedtime and sleep duration—with 4 lifetime psychiatric diagnoses—major depressive disorder, anxiety disorders, bipolar disorder/mania, and schizophrenia spectrum disorders—as well as polygenic risk scores for major depression, bipolar disorder, and schizophrenia.
This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist). The study did not have a prospective protocol or analysis plan.
Accelerometry recordings were gathered from 103,688 participants in the UK Biobank, a community-based prospective cohort study, between 2013 and 2015. Briefly, participants were provided with an Axivity AX3 triaxial accelerometer by mail and asked to wear it on their dominant wrist for 7 days, starting immediately after receiving it in the mail. These data have been made available as Data-Field 90001 of the UK Biobank (“Acceleration data—cwa format”).
Of these 103,688, participants were excluded if they did not wear the accelerometer for every one of the 24 hours in a day on at least one of the days (Data-Field 90084, “Unique hours of wear in a 24 hour cycle (scattered over multiple days)”; N = 4,345); if their accelerometer was not well calibrated (Data-Field 90016, “Data quality, good calibration”; N = 11); if their wear period included a DST change (Data-Field 90018, “Daylight savings crossover”; N = 4,543); if they woke up in the afternoon on an average day (for instance, shift workers; N = 137); or if fewer than 2 days during the 7-day wear period were valid (see below; N = 6,020). Due to the inclusion of analyses involving polygenic risk scores, participants were also excluded if they had greater than 2% genotype missingness (Data-Field 22005, “Missingness”), a mismatch between genetic sex and self-reported sex, sex chromosome aneuploidy, or were flagged as “Outliers for heterozygosity or missing rate” (Data-Field 22027). Self-reported white participants (according to Data-Field 21000, “Ethnic background”; N = 86,513) were used for the primary analysis, with replication in a much smaller number of self-reported non-white participants (N = 2,692), for a total of 89,205 participants. Replication was also performed stratified by sex, among self-reported white females (N = 48,562) and males (N = 37,951).
Accelerometry data processing
Accelerometry recordings were temporally segmented into sleep and activity bouts using an accelerometry software toolkit (github.com/activityMonitoring/biobankAccelerometerAnalysis) specifically designed for the UK Biobank [32,33]. As described previously, this segmentation was performed by a machine learning classifier consisting of a random forest, the predictions of which are temporally smoothed by a hidden Markov model. This classifier was trained on an external, labeled dataset of accelerometer recordings. For our analyses, we ignored distinctions between activities and classified each bout as either “sleep” or “wake.” Bouts for times when the accelerometer was not worn were probabilistically imputed; we labeled these bouts as “sleep” if the imputed probability of sleep was greater than 0.5, and “wake” otherwise.
While this segmentation is sufficient to determine the start and end time of each sleep and wake bout, it does not annotate which bouts make up the primary sleep period (usually at night) and which are just naps. To do this, we used steps 7 to 10 of the Heuristic algorithm looking at Distribution of Change in Z-Angle (HDCZA) implemented in the widely used GGIR accelerometry toolkit: following GGIR, we defined each day’s primary sleep period as the longest time period containing sleep bouts of at least 30 minutes separated by gaps of no more than 60 minutes. (While this definition is commonly used in the field, there is no single correct definition of what should constitute sleep inside versus outside the primary sleep period, particularly for individuals with highly fragmented sleep.) A “day” was defined as the period from 3 PM to the following 3 PM. Days were deemed invalid and discarded if their primary sleep period crossed one of the 3 PM day boundaries, if all the day’s sleep periods were less than 30 minutes, or if more than 10% of the day’s data was imputed.
Having defined each day’s primary sleep period, we defined 10 sleep measures based on the timings and lengths of the sleep and wake bouts inside and outside of this period (Table 1). These measures are similar to those used in previous accelerometry and polysomnography studies [35,36]. All measures were quantified as medians (or median absolute deviations, for the variability measures) across days, to be robust to outliers. To keep the focus on sleep, we do not include activity features, nor the L5 and M10 measures of circadian rhythmicity used in a previous study of the UK Biobank, which are based on both sleep and activity.
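The primary sleep period rule described above can be illustrated with a short sketch. This is a simplified reading of the stated definition, not GGIR's actual implementation; the function name `primary_sleep_period`, the bout representation, and the example day are hypothetical.

```python
from typing import List, Optional, Tuple

Bout = Tuple[float, float]  # (start, end) in minutes since the 3 PM day boundary

def primary_sleep_period(sleep_bouts: List[Bout],
                         min_bout: float = 30.0,
                         max_gap: float = 60.0) -> Optional[Bout]:
    """Return the longest span built from sleep bouts of at least `min_bout`
    minutes whose consecutive gaps are at most `max_gap` minutes."""
    long_bouts = sorted(b for b in sleep_bouts if b[1] - b[0] >= min_bout)
    if not long_bouts:
        return None  # no qualifying bout: such a day would be discarded
    spans = [list(long_bouts[0])]
    for start, end in long_bouts[1:]:
        if start - spans[-1][1] <= max_gap:
            spans[-1][1] = max(spans[-1][1], end)  # extend the current span
        else:
            spans.append([start, end])  # gap too long: start a new span
    best = max(spans, key=lambda s: s[1] - s[0])
    return (best[0], best[1])

# Hypothetical day: a 40-minute nap, then night sleep with one 45-minute gap
# and a trailing 10-minute bout that is too short to count.
bouts = [(60, 100), (480, 700), (745, 900), (905, 915)]
period = primary_sleep_period(bouts)
```

In this example the nap is too far from the night-time bouts to be merged with them, so the primary sleep period runs from minute 480 to minute 900.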
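To illustrate why medians and median absolute deviations are robust to outliers, the sketch below summarizes a hypothetical week of nightly sleep durations. It assumes an unscaled median absolute deviation; whether the study applied a scaling constant is not stated here.

```python
from statistics import mean, median
from typing import Sequence

def median_absolute_deviation(values: Sequence[float]) -> float:
    """Unscaled median absolute deviation around the median."""
    center = median(values)
    return median(abs(v - center) for v in values)

# Hypothetical nightly sleep durations (hours) with one very short night.
durations = [7.0, 7.5, 6.5, 7.0, 3.0, 7.5, 7.0]

typical_duration = median(durations)                          # per-person sleep duration
duration_variability = median_absolute_deviation(durations)  # per-person variability
average_duration = mean(durations)                            # pulled down by the short night
```

Here the single 3-hour night lowers the mean to 6.5 hours, while the median stays at 7.0 hours and the median absolute deviation at 0.5 hours.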
Inpatient psychiatric diagnoses
These 10 sleep measures were tested for association with 4 lifetime inpatient psychiatric diagnoses from any time before the date of accelerometry: schizophrenia spectrum disorders (International Classification of Diseases [ICD] codes F20-F29), bipolar disorder/mania (F30, F31), major depressive disorder (F32, F33), and anxiety disorders (F40, F41). Inpatient diagnoses and their dates were derived from the “hesin_diag” table of the inpatient records provided by the UK Biobank (Data-Field #41234, “Records in HES inpatient diagnoses dataset”).
To mitigate contamination of the control group, we excluded participants with preexisting primary care diagnoses (available for approximately 45% of the cohort), death record-based diagnoses, and/or self-reported clinician diagnoses of the same disorder, according to the “Source of report of [ICD code]” fields provided with the UK Biobank, for instance, Data-Field 130895, “Source of report of F32 (depressive episode).” We also excluded participants whose first inpatient diagnosis of the disorder was after the date of accelerometry. For instance, when computing associations with inpatient major depressive disorder, we excluded participants with primary care, death record-based, or self-reported major depressive disorder diagnoses, or whose first inpatient diagnosis of major depressive disorder was after the date of accelerometry.
Polygenic risk scores
The 10 sleep measures were also associated with polygenic risk scores derived from public genome-wide association study results for major depression, bipolar disorder [38,39], and schizophrenia across self-reported white participants. The UK Biobank’s imputed genotypes were filtered using version 2.0 of the plink software. Nonautosomal variants, duplicates, indels, and variants with imputation INFO score less than 0.8 were removed, as were variants with Hardy–Weinberg equilibrium p-value less than 10^−10, over 5% missingness, or minor allele frequency below 0.1% across self-reported white participants.
The polygenic risk scores were then calculated. Summary statistics were first harmonized with the UK Biobank imputed genotypes with respect to reference/alternate allele and strand, using the allele harmonization framework from munge_sumstats.py in the ldsc software package. Ambiguous variants (A/T, C/G, G/C, T/A) and variants missing from UK Biobank were excluded. Summary statistics were then subset to p < 0.05, a threshold found to be most predictive across self-reported white participants in the UK Biobank (S1 Fig). Frequency-informed linkage disequilibrium (LD) pruning to r^2 < 0.2 across the self-reported white participants was then performed using a 500-kb sliding window. The remaining variants constituted the trait’s polygenic risk score, with the variants’ effect sizes (β coefficients for educational attainment, log odds ratios for the other 3 case–control studies) constituting the weights of the score. Finally, polygenic risk scores were scored on each individual in the study cohort by summing, across the variants in the polygenic risk score, the variant’s weight times the individual’s number of effect alleles of that variant; missing genotypes were mean imputed.
Association tests were performed by linearly regressing the outcome variable (sleep measures) on the exposure variable (psychiatric diagnoses or polygenic risk scores). Covariates consisted of age and season at the time of the accelerometry recording, sex, Townsend deprivation index (an indicator of socioeconomic status), and the top 10 genotype principal components. Benjamini–Hochberg correction was performed at a false discovery rate (FDR) threshold of 5%.
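The final scoring step (a weighted sum of effect-allele counts with mean imputation of missing genotypes) can be sketched as follows. This is an illustrative toy, not the pipeline used in the study; the function name `score_cohort`, the list-of-lists genotype layout, and the example weights are assumptions.

```python
from typing import List, Optional, Sequence

def score_cohort(weights: Sequence[float],
                 genotypes: Sequence[Sequence[Optional[float]]]) -> List[float]:
    """Score each individual as sum(weight * effect-allele count) over variants.

    `genotypes[i][j]` is individual i's effect-allele count (0, 1, or 2) at
    variant j, or None if missing; missing values are replaced by the
    variant's mean count across individuals with observed genotypes.
    """
    n_variants = len(weights)
    means = []
    for j in range(n_variants):
        observed = [row[j] for row in genotypes if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    scores = []
    for row in genotypes:
        score = 0.0
        for j in range(n_variants):
            count = row[j] if row[j] is not None else means[j]
            score += weights[j] * count
        scores.append(score)
    return scores

# Hypothetical example: 2 variants, 3 individuals, one missing genotype.
weights = [0.5, -0.25]                   # e.g. log odds ratios
genotypes = [[0, 2], [2, None], [1, 0]]
scores = score_cohort(weights, genotypes)
```

The second individual's missing genotype is imputed as 1.0, the mean of the two observed counts at that variant, before the weighted sum is taken.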
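As an illustration of the multiple-testing step, the sketch below computes Benjamini–Hochberg adjusted p-values in plain Python and flags those at or below the 5% threshold. It is a generic implementation of the standard procedure; the study's actual software is not specified here, and the example p-values are made up.

```python
from typing import List, Sequence

def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from four association tests.
p_values = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in fdr]
```

With these inputs only the first test survives correction: its adjusted value is 0.04, while the second and third tests share an adjusted value of about 0.053.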
Analyses of self-reported sleep properties
As a secondary analysis, we considered 6 self-reported sleep properties (S1 Table) ascertained at baseline assessment between 2006 and 2010, approximately a half decade earlier than the accelerometry. We first assessed the concordance between self-reported sleep properties and accelerometry-derived sleep measures, by linearly regressing each accelerometry-derived measure (as the dependent variable) on each self-reported sleep property (as the independent variable) across all 77,232 self-reported white participants with both types of sleep properties, using the same covariates as above.
Next, we performed the same battery of associations with psychiatric diagnoses and polygenic risk scores, with the following differences from the primary analysis. First, we analyzed all 400,771 self-reported white participants with self-reported sleep properties and genotype data, not just the 89,205 with accelerometry. Second, we excluded participants with inpatient diagnoses after the baseline assessment, rather than after the date of accelerometry. Third, instead of including the age and season of accelerometry as covariates, we included the age at baseline assessment. Aside from these changes, this secondary analysis was conducted identically to the primary analysis.
This study is a reanalysis of the UK Biobank cohort, which obtained ethical approval and informed consent from study participants as described in the flagship UK Biobank publication. This study was conducted under the auspices of UK Biobank application 61530, “Multimodal subtyping of mental illness across the adult lifespan through integration of multiscale whole-person phenotypes.”
Accelerometer-derived sleep measures across 89,205 individuals
We analyzed accelerometry data from 89,205 participants. Our primary analysis used the largest ancestry group, self-reported white (N = 86,513); replication in the much smaller number of self-reported non-white participants (N = 2,692) and stratified by sex is discussed in the final subsection of the Results. Characteristics of participants with and without each of the 4 psychiatric diagnoses, for the self-reported white cohort used in the primary analysis, are shown in Table 2. We derived 10 sleep measures from these accelerometry data (Table 1, Fig 1, S2 Fig): bedtime, wake-up time, sleep duration, wake after sleep onset (WASO; the total time spent awake between bedtime and wake-up time), sleep efficiency (the fraction of time spent asleep between bedtime and wake-up time), number of awakenings, duration of longest sleep bout, number of naps, variability in bedtime, and variability in sleep duration.
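The definitions of WASO and sleep efficiency given above translate directly into a small calculation. The sketch below is illustrative only: the function name `waso_and_efficiency` and the example bouts are hypothetical, and it assumes non-overlapping sleep bouts expressed in minutes.

```python
from typing import List, Tuple

Bout = Tuple[float, float]  # (start, end) in minutes

def waso_and_efficiency(period: Bout, sleep_bouts: List[Bout]) -> Tuple[float, float]:
    """Return (WASO in minutes, sleep efficiency as a fraction) for one day.

    `period` is (bedtime, wake-up time); only the parts of sleep bouts that
    fall inside the period count as time asleep.
    """
    start, end = period
    asleep = 0.0
    for bout_start, bout_end in sleep_bouts:
        overlap = min(bout_end, end) - max(bout_start, start)
        if overlap > 0:
            asleep += overlap
    in_period = end - start
    return in_period - asleep, asleep / in_period

# Hypothetical day: a nap outside the period, and one 45-minute awakening inside it.
period = (480, 900)
bouts = [(60, 100), (480, 700), (745, 900)]
waso, efficiency = waso_and_efficiency(period, bouts)
```

For this example the period lasts 420 minutes, of which 375 are spent asleep, giving a WASO of 45 minutes and a sleep efficiency of roughly 89%.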
Each row’s middle panel shows a 100-bin histogram and Gaussian kernel density estimate of a particular sleep measure across the self-reported white participants. For each measure, 2 exemplar individuals were chosen: one at the 5th percentile (plotted to the left of the histogram), and one at the 95th percentile (plotted to the right of the histogram). The blue (left) and red (right) lines on the histograms denote the 5th and 95th percentiles, i.e., where these 2 exemplar individuals are located on the distribution. In the exemplar plots, blue/red blocks indicate sleep bouts, and black lines with bars indicate each day’s primary sleep period. Days of the week are ordered differently for different exemplars because some people started the accelerometry on different days.
To gain insight into the distributions of these sleep measures, we tabulated percentiles of each measure (Table 3) across participants with and without a history of any of 4 common inpatient psychiatric diagnoses from before the date of accelerometry: major depressive disorder, anxiety disorders, bipolar disorder/mania, and schizophrenia spectrum disorders. (Depression, anxiety, schizophrenia, and bipolar disorder are the 4 mental health conditions with the greatest global disease burden according to the Global Burden of Disease Study 2019.) The medians (50th percentiles) of these measures were similar between those with and without psychiatric diagnoses: a marginally later bedtime of 11:29 PM instead of 11:19 PM and wake-up time of 7:41 AM instead of 7:24 AM, an identical 99% sleep efficiency, a single awakening, and so on. Differences were larger at one or both extremes, at least for some measures: for instance, the 99th percentile of bedtime was 2:33 AM for those without psychiatric diagnoses, but 4:47 AM for those with diagnoses, while the 99th percentile of wake time was 9:57 AM for those without diagnoses but 10:52 AM for those with.
Association of accelerometer-derived sleep measures with psychiatric diagnoses
We associated these 10 accelerometry-defined sleep measures with 4 ICD-code-based inpatient psychiatric diagnoses (Table 4). Three trends were especially striking.
First, sleep pattern differences were ubiquitous across diagnoses. Having any psychiatric diagnosis was significantly associated with differences in every sleep measure, and each individual psychiatric diagnosis was associated with a median of 8.5 of the 10 sleep measures, though effect sizes were generally small. For instance, the largest magnitude effect size across the 4 disorders was β = −0.11 (95% confidence interval −0.13 to −0.10, p = 3 × 10^−56, FDR = 6 × 10^−55) for the association between lifetime inpatient major depressive disorder diagnosis and sleep efficiency.
Second, almost all significant associations with accelerometer-derived sleep measures and 18 significant associations with self-reported sleep properties had the same effect size directions: toward later bedtime and wake-up time; shorter duration of longest sleep bout; lower sleep efficiency; higher WASO and number of awakenings; more naps; and more variable bedtime and sleep duration. The one exception was sleep duration, which was significantly shorter among participants with lifetime major depressive disorder diagnoses (β = −0.02, 95% confidence interval −0.04 to −0.01, p = 0.003, FDR = 0.003) but significantly longer among participants with lifetime schizophrenia spectrum disorder diagnoses (β = 0.02, 95% confidence interval 0.01 to 0.04, p = 0.0008, FDR = 0.001).
Third, despite this relative homogeneity, certain sleep properties were associated to a greater extent than others with lifetime psychopathology. In particular, across diagnoses, measures of sleep quality were more strongly associated than mere sleep duration. WASO, sleep efficiency, and number of awakenings were each associated with every tested disorder. In contrast, sleep duration was only significantly associated with major depressive disorder and schizophrenia spectrum disorders (see previous paragraph), and its effect size for major depressive disorder was several times smaller than for the other 9 sleep measures.
Association of accelerometer-derived sleep measures with polygenic risk scores
To ascertain genetic influences on sleep patterns, we next associated each of the 10 accelerometer-derived sleep measures with polygenic risk scores for major depression, bipolar disorder, and schizophrenia (Table 4). Given the imperfect nature of polygenic risk scores, the effect sizes for these associations were generally smaller than for the psychiatric diagnoses; but since every individual in the cohort has a polygenic risk score (even though most lack psychiatric diagnoses), many were nonetheless significant, particularly for major depression and schizophrenia. As with the psychiatric diagnoses, all significant associations were in the direction of later wake-up time; shorter duration of longest sleep bout; lower sleep efficiency; higher WASO and number of awakenings; more naps; and more variable bedtime and sleep duration. Bedtime and sleep duration were not associated with any of the 3 polygenic risk scores.
Replication across ancestries and sexes
Finally, we confirmed replicability of the associations between sleep measures and psychiatric diagnoses across ancestries and sexes (S4 Table). Due to the relatively low numbers of self-reported non-white participants in the sample, we restricted ourselves to replicating the “Any psychiatric diagnosis” row from Table 4. We found that, of the 10 accelerometer-derived sleep measures with significant associations among self-reported white participants, 3 measures replicated in self-reported non-white participants; 6 of the 7 associations that failed to replicate (all except for sleep duration) nonetheless had the same effect directions as in self-reported white participants. Replication among self-reported white males and females was better powered: all 10 significant associations with accelerometer-derived sleep measures replicated in both males and females, with comparable effect sizes to the non-sex stratified analysis.
Comparison with self-reported sleep properties
As a secondary analysis, we considered 6 self-reported sleep properties—sleep duration, ease of morning awakening, chronotype, daytime napping, insomnia, and daytime dozing (S1 Table)—ascertained at baseline assessment between 2006 and 2010, approximately a half decade earlier than the accelerometry. We found that self-reported sleep properties were broadly concordant with their closest accelerometry-derived equivalents (S2 Table)—though not completely so, as expected given the known discordance between subjective and objective sleep measures, differences in the definitions of the 2 types of measures, and the half-decade time lag between the two. Among notable associations, self-reported sleep duration was most strongly associated with accelerometry-derived sleep duration (β = 0.40, 95% confidence interval 0.39 to 0.42, p = 0, FDR = 0); self-reported ease of morning awakening with accelerometry-derived wake-up time (β = −0.53, 95% confidence interval −0.54 to −0.51, p = 0, FDR = 0); self-reported chronotype (higher values indicate one is more of an “evening person” than a “morning person”) with accelerometry-derived bedtime (β = 0.69, 95% confidence interval 0.67 to 0.70, p = 0, FDR = 0) and wake-up time (β = 0.73, 95% confidence interval 0.72 to 0.75, p = 0, FDR = 0); and self-reported daytime napping with accelerometry-derived number of naps (β = 0.38, 95% confidence interval 0.37 to 0.40, p = 0, FDR = 0).
Next, we performed the same battery of associations with psychiatric diagnoses and polygenic risk scores on these self-reported sleep properties (S3 Table) as for the accelerometry-derived sleep measures. Despite much stronger statistical significance due to the increased sample size, effect sizes were not substantially larger than for the primary analysis (Table 4). For instance, the largest magnitude effect size across the 4 disorders was β = −0.12 (95% confidence interval −0.12 to −0.11, p = 6 × 10^−258, FDR = 6 × 10^−257) for the association between lifetime inpatient major depressive disorder diagnosis and ease of morning awakening, essentially identical to the largest effect size for the accelerometry-derived measures (β = −0.11 for the association between major depressive disorder and sleep efficiency, as mentioned above).
In this work, we analyzed the structure of sleep and its association with lifetime psychopathology across nearly 90,000 individuals. In a departure from previous studies analyzing only a single sleep property or a single disorder, we take an “all-by-all” approach, associating 10 accelerometer-derived sleep measures with 4 inpatient psychiatric diagnoses and 3 psychiatric polygenic risk scores. On the whole, accelerometer-derived sleep measures were concordant with self-reported sleep properties, and both were richly associated with psychiatric diagnoses and polygenic risk scores, and these associations replicated across ancestries and sexes. To our knowledge, this is the first large-scale transdiagnostic study of objectively measured sleep and mental health.
The same sleep pattern differences tended to recur across disorders: each diagnosis was associated with a median of 8.5 of the 10 sleep measures, almost always in the direction of worse sleep quality. However, effect sizes were generally quite small. Note that these numbers are with respect to lifetime diagnoses; the extent of sleep disruption would presumably be greater during an active episode of depression, mania, or psychosis.
Across diagnoses, metrics pertaining to sleep quality were more strongly associated than mere sleep duration. Strikingly, the accelerometry-defined duration of an individual’s longest sleep bout was much more strongly associated with most psychiatric diagnoses and polygenic risk scores than total sleep duration. Given the intimate relationship between sleep bout length and sleep quality [46,47], this suggests that sleep quality may be more disturbed than sleep length across psychopathologies. These findings undergird the importance of assessment of sleep quality in addition to sleep duration. However, we note that effects on sleep may vary greatly across disease subtypes (for instance, atypical versus nonatypical depression) or states (for instance, manic episode versus depressive episode versus euthymia), and these effects may be obscured when lumping together subtypes and states, as we do here.
Most prior studies of sleep and mental illness have focused on white individuals, and a key differentiating factor of our work is its replication across diverse ancestries, including those historically underrepresented in medical research. In addition to this trans-ethnic replication, we also confirm that males and females display similar sleep alterations across lifetime psychopathologies. Even so, our results should be interpreted in the context of the UK Biobank’s well-characterized “healthy volunteer” selection bias and its consequent underascertainment of individuals with psychiatric diagnoses.
This study has several key limitations. First, it relies on linked inpatient medical records, which may not capture all participants with clinically significant psychopathology, thus compounding the “healthy volunteer” bias mentioned in the previous paragraph. Second, the (often years-long) time lag between psychiatric diagnoses and accelerometry (Table 2) obscures whether participants were in an active manic, depressive, or psychotic episode at the time of their accelerometry. Third, the study’s cross-sectional design limits the ability to make inferences about causality. Fourth, accelerometer-based sleep measurement is not as precise as polysomnography, the gold standard in sleep research. The algorithm used for sleep/wake segmentation [32,33] was trained on accelerometry data annotated from head-mounted video and sleep diaries, rather than direct measures of sleep/wake, which could result in the misclassification of certain awake-in-bed periods (for instance, short awakenings or periods prior to sleep onset where the individual is motionless) as sleep. This may also account for the relatively high median sleep efficiency, low wake time after sleep onset, and long sleep bout durations seen in this study relative to polysomnography-based studies. Also, accelerometry alone cannot accurately distinguish between rapid eye movement (REM) sleep and the various stages of non-REM sleep [52,53]. However, these limitations should be weighed against the population-scale, pan-diagnostic scope that accelerometry-based sleep measurement enables. Moreover, certain of our sleep metrics may indirectly capture aspects of sleep stage: for instance, high numbers of awakenings or low duration of longest sleep bout may indicate insufficient REM sleep [46,47,54].
A key clinical implication of this work is that sleep pattern differences are a transdiagnostic feature of psychopathology. Alterations in sleep parameters—particularly those impacting sleep quality and not merely duration—should be considered regardless of which psychiatric conditions a patient presents with. Future transdiagnostic studies of sleep and psychopathology should employ a longitudinal design to more precisely examine how sleep parameters vary across phases of mental illness.
In sum, we find that alterations in objectively measured sleep parameters are the norm among patients with lifetime psychiatric illness. Our findings provide a rich clinical portrait of the ways in which sleep can be disrupted across individuals with lifetime mental illness. This work showcases the capacity of accelerometry to provide detailed, objective sleep measurements at scale, even across cohorts of tens of thousands of individuals.
S1 Fig. AUC curves summarizing the predictive accuracy of our polygenic risk scores at various GWAS p-value thresholds.
The AUC, also known as the AUROC or C statistic, is the fraction of the time that the polygenic risk score would rank a randomly chosen case higher than a randomly chosen control. Polygenic risk scores were benchmarked against ICD codes from inpatient, primary care, or death records for the corresponding disorder (using the UK Biobank’s “Source of report of [ICD code]” fields; Methods): for major depressive disorder, F32 or F33; for bipolar disorder, F31; and for schizophrenia, F20. AUCs were computed across 451,993 self-reported white participants in the UK Biobank with genotype data. AUC, area under the curve; AUROC, area under the receiver operating characteristic curve; C statistic, concordance statistic; GWAS, genome-wide association study; ICD, International Classification of Diseases.
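As an illustration of the pairwise definition of the AUC given above, the following minimal Python sketch counts the fraction of case-control pairs in which the case receives the higher score. It is not the code used to produce S1 Fig. The function name `pairwise_auc`, the toy scores, and the convention of counting tied pairs as one half are assumptions made for the example.

```python
from itertools import product


def pairwise_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case scores higher.

    Tied pairs count as half a win (an assumed, commonly used convention).
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical polygenic risk scores (illustrative values, not UK Biobank data).
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.2]

auc = pairwise_auc(cases, controls)
print(f"AUC = {auc:.3f}")
```

This exhaustive pairwise count scales with the product of the number of cases and controls, so at biobank scale a rank-based computation would be preferable; the sketch is meant only to make the definition concrete.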
S2 Fig. Correlations among the 10 accelerometer-derived sleep measures.
Semipartial Kendall correlation coefficients (τ) among all pairs of accelerometer-derived sleep measures, hierarchically clustered with Euclidean distance and average linkage.
S1 Table. The 6 self-reported sleep properties.
S2 Table. Concordance of accelerometer-derived sleep measures (columns) with self-reported sleep properties (rows).
Covariate-corrected linear regression effect sizes (β coefficients) and p-values for association between each accelerometer-derived sleep measure and each self-reported sleep property, across the 77,232 self-reported white participants with both types of sleep properties. Bold denotes significant associations at 5% FDR; square brackets denote 95% confidence intervals; rounded brackets denote p-values. FDR, false discovery rate; WASO, wake after sleep onset.
S3 Table. Association of self-reported sleep properties with psychiatric diagnoses and polygenic risk scores.
Covariate-corrected linear regression effect sizes (standardized β coefficients) and p-values for association between each self-reported sleep property and each psychiatric diagnosis, across the 400,771 self-reported white participants with self-reported sleep properties. Bold denotes significant associations at 5% FDR; square brackets denote 95% confidence intervals; rounded brackets denote p-values. FDR, false discovery rate.
S4 Table. Replication of associations between accelerometer-derived sleep measures and lifetime psychopathology.
Replication of the “Any psychiatric diagnosis” row of Table 4, both in non-white participants (67 participants of 2,692 have at least one of the 4 inpatient psychiatric diagnoses) and stratified by sex (1,445 females out of 48,562 and 757 males out of 37,951 have at least one of the 4). Covariate-corrected linear regression effect sizes (β coefficients) and p-values are shown for each sleep measure. Bold denotes significant associations at 5% FDR; square brackets denote 95% confidence intervals; rounded brackets denote p-values. FDR, false discovery rate; WASO, wake after sleep onset.
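To make the "significant at 5% FDR" criterion used in the table legends above concrete, the following minimal Python sketch applies the Benjamini-Hochberg step-up procedure [43] to a list of p-values. It is an illustration only, assuming that FDR values are Benjamini-Hochberg adjusted p-values; the function name `benjamini_hochberg` and the toy p-values are hypothetical and do not come from this study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running
    # minimum so that adjusted values are monotone in the p-value rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four associations (not results from this study).
p_values = [0.001, 0.04, 0.03, 0.2]
q_values = benjamini_hochberg(p_values)

# An association would be shown in bold if its adjusted value is at most 0.05.
significant = [q <= 0.05 for q in q_values]
print(list(zip(p_values, q_values, significant)))
```

In this toy example only the first association passes the 5% threshold, although three of the four raw p-values are below 0.05.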
The authors acknowledge Milos Milic for data curation assistance. This research was conducted under the auspices of UK Biobank application 61530, “Multimodal subtyping of mental illness across the adult lifespan through integration of multi-scale whole-person phenotypes.”
- 1. Freeman D, Sheaves B, Waite F, Harvey AG, Harrison PJ. Sleep disturbance and psychiatric disorders. Lancet Psychiatry. 2020;7:628–37. pmid:32563308
- 2. Gee B, Orchard F, Clarke E, Joy A, Clarke T, Reynolds S. The effect of non-pharmacological sleep interventions on depression symptoms: A meta-analysis of randomised controlled trials. Sleep Med Rev. 2019;43:118–28. pmid:30579141
- 3. Ho FY-Y, Chan CS, Tang KN-S. Cognitive-behavioral therapy for sleep disturbances in treating posttraumatic stress disorder symptoms: A meta-analysis of randomized controlled trials. Clin Psychol Rev. 2016;43:90–102. pmid:26439674
- 4. Freeman D, Sheaves B, Goodwin GM, Yu L-M, Nickless A, Harrison PJ, et al. The effects of improving sleep on mental health (OASIS): a randomised controlled trial with mediation analysis. Lancet Psychiatry. 2017;4:749–58. pmid:28888927
- 5. Reeve S, Emsley R, Sheaves B, Freeman D. Disrupting Sleep: The Effects of Sleep Loss on Psychotic Experiences Tested in an Experimental Study With Mediation Analysis. Schizophr Bull. 2018;44. pmid:28981834
- 6. Espie CA, Emsley R, Kyle SD, Gordon C, Drake CL, Siriwardena AN, et al. Effect of Digital Cognitive Behavioral Therapy for Insomnia on Health, Psychological Well-being, and Sleep-Related Quality of Life: A Randomized Clinical Trial. JAMA Psychiat. 2019;76:21–30. pmid:30264137
- 7. Hamilton M. A rating scale for depression. J Neurol Neurosurg Psychiatry. 1960;23:56–62. pmid:14399272
- 8. Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. Br J Psychiatry. 1979;134:382–9. pmid:444788
- 9. Lauderdale DS, Knutson KL, Yan LL, Liu K, Rathouz PJ. Self-reported and measured sleep duration: how similar are they? Epidemiology. 2008;19:838–45. pmid:18854708
- 10. Jackson CL, Patel SR, Jackson WB, Lutsey PL, Redline S. Agreement between self-reported and objectively measured sleep duration among white, black, Hispanic, and Chinese adults in the United States: Multi-Ethnic Study of Atherosclerosis. Sleep. 2018;41. pmid:29701831
- 11. Harvey AG, Schmidt DA, Scarnà A, Semler CN, Goodwin GM. Sleep-related functioning in euthymic patients with bipolar disorder, patients with insomnia, and subjects without sleep problems. Am J Psychiatry. 2005;162:50–7. pmid:15625201
- 12. Jackson CL, Ward JB, Johnson DA, Sims M, Wilson J, Redline S. Concordance between self-reported and actigraphy-assessed sleep duration among African-American adults: findings from the Jackson Heart Sleep Study. Sleep. 2020;43. pmid:31616945
- 13. Rotenberg VS, Indursky P, Kayumov L, Sirota P, Melamed Y. The relationship between subjective sleep estimation and objective sleep variables in depressed patients. Int J Psychophysiol. 2000;37:291–7. pmid:10858574
- 14. Kushida CA, Chang A, Gadkary C, Guilleminault C, Carrillo O, Dement WC. Comparison of actigraphic, polysomnographic, and subjective assessment of sleep parameters in sleep-disordered patients. Sleep Med. 2001;2:389–96. pmid:14592388
- 15. de Souza L, Benedito-Silva AA, Pires MLN, Poyares D, Tufik S, Calil HM. Further validation of actigraphy for sleep studies. Sleep. 2003;26:81–5. pmid:12627737
- 16. McCall C, McCall WV. Comparison of actigraphy with polysomnography and sleep logs in depressed insomniacs. J Sleep Res. 2012;21:122–7. pmid:21447050
- 17. Marino M, Li Y, Rueschman MN, Winkelman JW, Ellenbogen JM, Solet JM, et al. Measuring sleep: accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep. 2013;36:1747–55. pmid:24179309
- 18. Smith MT, McCrae CS, Cheung J, Martin JL, Harrod CG, Heald JL, et al. Use of Actigraphy for the Evaluation of Sleep Disorders and Circadian Rhythm Sleep-Wake Disorders: An American Academy of Sleep Medicine Systematic Review, Meta-Analysis, and GRADE Assessment. J Clin Sleep Med. 2018;14:1209–30. pmid:29991438
- 19. Tazawa Y, Wada M, Mitsukura Y, Takamiya A, Kitazawa M, Yoshimura M, et al. Actigraphy for evaluation of mood disorders: A systematic review and meta-analysis. J Affect Disord. 2019;253:257–69. pmid:31060012
- 20. Lyall LM, Wyse CA, Graham N, Ferguson A, Lyall DM, Cullen B, et al. Association of disrupted circadian rhythmicity with mood disorders, subjective wellbeing, and cognitive function: a cross-sectional study of 91 105 participants from the UK Biobank. Lancet Psychiatry. 2018;5:507–14. pmid:29776774
- 21. Doherty A, Jackson D, Hammerla N, Plötz T, Olivier P, Granat MH, et al. Large Scale Population Assessment of Physical Activity Using Wrist Worn Accelerometers: The UK Biobank Study. PLoS ONE. 2017;12:e0169649. pmid:28146576
- 22. Ferguson A, Lyall LM, Ward J, Strawbridge RJ, Cullen B, Graham N, et al. Genome-Wide Association Study of Circadian Rhythmicity in 71,500 UK Biobank Participants and Polygenic Association with Mood Instability. EBioMedicine. 2018;35:279–87. pmid:30120083
- 23. Jones SE, Lane JM, Wood AR, van Hees VT, Tyrrell J, Beaumont RN, et al. Genome-wide association analyses of chronotype in 697,828 individuals provides insights into circadian rhythms. Nat Commun. 2019;10:343. pmid:30696823
- 24. Dashti HS, Jones SE, Wood AR, Lane JM, van Hees VT, Wang H, et al. Genome-wide association study identifies genetic loci for self-reported habitual sleep duration supported by accelerometer-derived estimates. Nat Commun. 2019;10:1100. pmid:30846698
- 25. Wang H, Lane JM, Jones SE, Dashti HS, Ollila HM, Wood AR, et al. Genome-wide association analysis of self-reported daytime sleepiness identifies 42 loci that suggest biological subtypes. Nat Commun. 2019;10:3503. pmid:31409809
- 26. Harvey AG, Murray G, Chandler RA, Soehner A. Sleep disturbance as transdiagnostic: consideration of neurobiological mechanisms. Clin Psychol Rev. 2011;31:225–35. pmid:20471738
- 27. Dolsen MR, Asarnow LD, Harvey AG. Insomnia as a transdiagnostic process in psychiatric disorders. Curr Psychiatry Rep. 2014;16:471. pmid:25030972
- 28. Eiber R, Escande M. Sleep electroencephalography in depression and mental disorders with depressive comorbidity. L’Encéphale. 1999;25. pmid:10598300
- 29. Ramtekkar U, Ivanenko A. Sleep in Children With Psychiatric Disorders. Semin Pediatr Neurol. 2015;22. pmid:26072345
- 30. Martínez NT, Cock DR. [Obstructive sleep apnea syndrome in patients attending a psychiatry outpatient service: a case series]. Rev Colomb Psiquiatr. 2017;46. Spanish. pmid:29122232
- 31. Knechtle B, Economou NT, Nikolaidis PT, Velentza L, Kallianos A, Steiropoulos P, et al. Clinical Characteristics of Obstructive Sleep Apnea in Psychiatric Disease. J Clin Med Res. 2019;8. pmid:31003451
- 32. Willetts M, Hollowell S, Aslett L, Holmes C, Doherty A. Statistical machine learning of sleep and physical activity phenotypes from sensor data in 96,220 UK Biobank participants. Sci Rep. 2018;8:7961. pmid:29784928
- 33. Doherty A, Smith-Byrne K, Ferreira T, Holmes MV, Holmes C, Pulit SL, et al. GWAS identifies 14 loci for device-measured physical activity and sleep duration. Nat Commun. 2018;9:5257. pmid:30531941
- 34. van Hees VT, Sabia S, Jones SE, Wood AR, Anderson KN, Kivimäki M, et al. Estimating sleep parameters using an accelerometer without sleep diary. Sci Rep. 2018;8:12975. pmid:30154500
- 35. Natale V, Plazzi G, Martoni M. Actigraphy in the assessment of insomnia: a quantitative approach. Sleep. 2009;32:767–71. pmid:19544753
- 36. Shrivastava D, Jung S, Saadat M, Sirohi R, Crewson K. How to interpret the results of a sleep study. J Community Hosp Intern Med Perspect. 2014:24983. pmid:25432643
- 37. Wray NR, Ripke S, Mattheisen M, Trzaskowski M, Byrne EM, Abdellaoui A, et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018;50:668–81. pmid:29700475
- 38. Stahl EA, Breen G, Forstner AJ, McQuillin A, Ripke S, Trubetskoy V, et al. Genome-wide association study identifies 30 loci associated with bipolar disorder. Nat Genet. 2019;51:793–803. pmid:31043756
- 39. Sullivan P. bip2019. figshare. 2021.
- 40. Sullivan P. scz2021. figshare. 2021.
- 41. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7. pmid:25722852
- 42. Bulik-Sullivan BK, Schizophrenia Working Group of the Psychiatric Genomics Consortium, Loh P-R, Finucane HK, Ripke S, Yang J, et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet. 2015;47:291–5. pmid:25642630
- 43. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J R Stat Soc B Methodol. 1995;57:289–300.
- 44. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018;562:203–9. pmid:30305743
- 45. GBD 2019 Diseases and Injuries Collaborators. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet. 2020;396:1204–22. pmid:33069326
- 46. Zepelin H. REM sleep and the timing of self-awakenings. Bull Psychon Soc. 1986:254–6.
- 47. Akerstedt T, Billiard M, Bonnet M, Ficca G, Garma L, Mariotti M, et al. Awakening from sleep. Sleep Med Rev. 2002;6:267–86. pmid:12531132
- 48. Smart A, Harrison E. The under-representation of minority ethnic groups in UK medical research. Ethn Health. 2017;22:65–82. pmid:27174778
- 49. Fry A, Littlejohns TJ, Sudlow C, Doherty N, Adamska L, Sprosen T, et al. Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am J Epidemiol. 2017;186:1026–34. pmid:28641372
- 50. Davis KAS, Cullen B, Adams M, Brailean A, Breen G, Coleman JRI, et al. Indicators of mental disorders in UK Biobank-A comparison of approaches. Int J Methods Psychiatr Res. 2019;28:e1796. pmid:31397039
- 51. Boulos MI, Jairam T, Kendzerska T, Im J, Mekhael A, Murray BJ. Normal polysomnography parameters in healthy adults: a systematic review and meta-analysis. Lancet Respir Med. 2019;7:533–43. pmid:31006560
- 52. Martin JL, Hakim AD. Wrist actigraphy. Chest. 2011;139:1514–27. pmid:21652563
- 53. Walch O, Huang Y, Forger D, Goldstein C. Sleep stage prediction with raw acceleration and photoplethysmography heart rate data derived from a consumer wearable device. Sleep. 2019;42. pmid:31579900
- 54. Palagini L, Baglioni C, Ciapparelli A, Gemignani A, Riemann D. REM sleep dysregulation in depression: state of the art. Sleep Med Rev. 2013;17:377–90. pmid:23391633