A PRISMA systematic review of adolescent gender dysphoria literature: 1) Epidemiology

  • Lucy Thompson,

    Roles Conceptualization, Data curation, Methodology, Project administration, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Gillberg Neuropsychiatry Centre (GNC), Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Göteborg, Sweden, Institute of Health and Wellbeing, University of Glasgow, Glasgow, United Kingdom, Institute of Applied Health Science, Centre for Health Science, University of Aberdeen, Inverness, United Kingdom

  • Darko Sarovic,

    Roles Data curation, Formal analysis, Methodology, Validation, Writing – review & editing

    Affiliation Gillberg Neuropsychiatry Centre (GNC), Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Göteborg, Sweden

  • Philip Wilson,

    Roles Data curation, Methodology, Validation, Writing – review & editing

    Affiliation Institute of Applied Health Science, Centre for Health Science, University of Aberdeen, Inverness, United Kingdom

  • Angela Sämfjord,

    Roles Data curation, Methodology, Validation, Writing – review & editing

    Affiliation The Child and Adolescent Psychiatric Clinic, The Queen Silvia Children’s Hospital, Gothenburg, Sweden

  • Christopher Gillberg

    Roles Conceptualization, Methodology, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliations Gillberg Neuropsychiatry Centre (GNC), Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Göteborg, Sweden, Institute of Health and Wellbeing, University of Glasgow, Glasgow, United Kingdom


It is unclear whether the research literature on adolescent gender dysphoria (GD) provides sufficient evidence to adequately inform clinical decision making. In the first of a series of three papers, this study sought to systematically review published evidence regarding: the prevalence of GD in adolescence; the proportions of natal males/females with GD in adolescence and whether this changed over time; and the pattern of age at (a) onset (b) referral and (c) assessment. Having searched PROSPERO and the Cochrane library for existing systematic reviews (and finding none), we searched Ovid Medline 1946–October week 4 2020, Embase 1947–present (updated daily), CINAHL 1983–2020, and PsycInfo 1914–2020. The final search was carried out on the 2nd November 2020 using a core strategy including search terms for ‘adolescence’ and ‘gender dysphoria’ which was adapted according to the structure of each database. Papers were excluded if they did not clearly report on clinically-verified gender dysphoria, if they were focused on adult populations, if they did not include original data (epidemiological, clinical, or survey) on adolescents (aged at least 12 and under 18 years), or if they were not peer-reviewed journal publications. From 6202 potentially relevant articles (post de-duplication), 38 papers from 11 countries representing between 3000 and 4000 participants were included in our final sample. Most studies were observational cohort studies, usually using retrospective record review (26). A few compared their samples to normative or population datasets; most (31) were published in the past 5 years. There was significant overlap of study samples (accounted for in our quantitative synthesis). No population studies are available, so prevalence cannot be ascertained. There is evidence of an increase in frequency of presentation to services, and of a shift in the natal sex of referred cases: those assigned female at birth are now in the majority. No data were available on age of onset. Within the included samples the average age was 13 years at referral and 15 years at assessment. All papers were rated by two reviewers using the Crowe Critical Appraisal Tool v1·4 (CCAT). The CCAT quality ratings ranged from 45% to 96%, with a mean of 78%. Almost half the included studies emerged from two treatment centres: there was considerable sample overlap and it is unclear how representative these are of the adolescent GD community more broadly. The increase in clinical presentations of GD, particularly among natal female adolescents, warrants further investigation. Whole population studies using administrative datasets reporting on GD / gender non-conformity may be necessary, along with inter-disciplinary research evaluating the lived experience of adolescents with GD.


Gender Dysphoria (GD) is a categorical diagnosis in the Fifth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) [1]. It is also used as a general descriptive term referring to a person’s discontent with assigned gender. In recent years, GD diagnoses have been increasingly made in child and adolescent services [2–4]. There has been a parallel increase in demand for gender transition interventions, particularly among natal females [2–4], and including pre-adolescents [5]. Such transitions have increasingly involved the use of puberty suppression, cross-sex hormones and surgical procedures, usually in accordance with the so-called ‘Dutch model’, where intervention is staged in accordance with a young person’s age and stage of pubertal development [6]. Calls to improve availability of medical interventions have sometimes been made on the basis of reports of much increased levels of mental health problems, including suicide attempts, among youth with GD [7, 8], and claims that the medical procedures referred to would improve mental health outcomes [6, 9].

Gender and sex are two terms which are often used interchangeably but that are not synonymous. The sex of a person refers to either male or female status broadly based on the sex chromosomes and genitalia. The word is used in social, medical and legal contexts in most countries to ‘categorise’ people under the two sexes, as boys/men or girls/women.

The term gender is harder to define, as it reflects how the individual identifies or feels, and how a person ‘fits’ with social norms, activities and attributes that are commonly associated with male or female sex. For many people sex and gender are consistent, but for some, there is discordance between the biological anatomy of the body and gender self-perception/identity. The term ‘transgender’ is often used to describe this identity, whilst others will relate to terms such as GD or gender incongruence. The literature and common understanding of this area are evolving extremely rapidly, and there is increasing acknowledgement that gender identity may not relate to a binary gender definition at all. Gender does not necessarily reflect sexual orientation, an enduring pattern of emotional, romantic and/or sexual attraction [10]. Gender identity, on the other hand, is a component of one’s personal multi-dimensional sense of self, encompassing moral, ethical, spiritual / religious beliefs [11]. For most people who do not have abnormalities of the external genitalia, sex is documented at birth and, in the majority population, sex and gender are consistent throughout life.

A strict definition, such as in the DSM-5 [1], requires that individuals diagnosed with GD suffer clinically significant distress or impairment in social, school, or other important areas of functioning. A range of terminology is used to describe young people experiencing GD. It is apparent from the literature that the terminology is changing rapidly, and that the terms ‘assigned female / male at birth’ (AFAB / AMAB) or ‘natal fe/male’ are commonly used in recent literature. We have chosen to use the term ‘natal fe/male’ (abbreviated to NF / NM) because it is less cumbersome and is inclusive of those experiencing GD who do not identify with either male or female genders (although we acknowledge it excludes many intersex people).

There is currently an intense international debate regarding a number of issues relating to GD [12]. A recent high-profile example involved the Gender Identity Development Service (GIDS) in London, UK, altering its procedures due to a High Court ruling that it would be ‘highly unlikely’ that children under 13, and ‘very doubtful’ that 14- and 15-year olds, could be Gillick competent [13], and therefore they could not consent to puberty suppression treatment [14]. This decision was met with equivocal support and criticism, not least as it has implications for the broader application of the Gillick framework for consent to medical procedures. (It has since been successfully appealed [15].) A lack of good quality evidence has been acknowledged [16]. A recent review by the Swedish Agency for Health Technology Assessment and Assessment of Social Services [17] indicated that there is very little empirical evidence in the field regarding overall GD epidemiology, the association of GD with mental health problems, the rate and types of medical interventions provided, and longer-term outcomes (including outcomes for those not treated medically or surgically).

Scope of the review

This review addresses the first of three sets of questions addressing the current state of evidence on gender dysphoria experienced in adolescence. Our over-arching aim was to establish ‘what does the literature tell us about gender dysphoria in adolescence?’ We broke this down into seven specific questions:

  1. What is the prevalence of GD in adolescence?
  2. What are the proportions of natal males / females with GD in adolescence (a) and has this changed over time (b)?
  3. What is the pattern of age at (a) onset (b) referral (c) assessment (d) treatment?
  4. What is the pattern of mental health problems in this population?
  5. What treatments have been used to address GD in adolescence?
  6. What outcomes are associated with treatment/s for GD in adolescence?
  7. What are the long-term outcomes for all (treated or otherwise) in this population?

The present paper focuses on questions 1, 2, 3a, 3b, and 3c. We shall address question 4 in a second paper, and questions 3d and 5–7 in a final paper. The methodology below includes the searches conducted for the whole review.

We set out to include any paper offering primary data in response to any of these questions, regardless of the focus of that paper.


Protocol and registration

The systematic review protocol was submitted to PROSPERO on the 28th November 2019, and registered on 17 March 2020 (registration number CRD42020162047). An update was uploaded on 2nd February 2021 to include specific detail on age criteria and clinical verification of condition. The review has been prepared according to PRISMA 2020 [18] guidelines (see S1 Checklist).

Eligibility criteria

The volume of non-peer-reviewed literature in initial searches proved so great that we took the decision to only include peer-reviewed journal papers featuring original research data. This decision was made subsequent to initial PROSPERO registration, but prior to full text screening. Complete inclusion criteria were:

  • Focused on gender dysphoria or transgenderism;
  • Includes data on adolescents (aged 12–17 years inclusive);
  • Includes original data (not review paper or opinion piece);
  • Peer-reviewed publication (not theses or conference proceedings);
  • In English language.

Information sources

We searched PROSPERO and the Cochrane library for existing systematic reviews. We searched Ovid Medline 1946–October week 4 2020, Embase 1947–present (updated daily), CINAHL 1983–2020, and PsycInfo 1914–2020. After selecting the final sample of articles, the first author used their reference lists as a secondary data source.


The final search was carried out on the 2nd November 2020 using a core strategy which was adapted according to the structure of each database. The core strategy included search terms for ‘adolescence’ and ‘gender dysphoria’. The specific search strategies employed in each database are detailed in Table 1.

Study selection

The study selection process is illustrated in Fig 1.

In the first stage of screening, papers were excluded based on their title or abstract if they did not clearly report on gender dysphoria or transgenderism or if they were focused on adult populations. In the second stage of screening, papers were excluded on the basis of title and abstract if they did not include original data (epidemiological, clinical, or survey) on adolescents (aged at least 12 and under 18 years). At both stages papers were retained if there was insufficient information to exclude them.

Full-text files were obtained for the remaining records.

Papers were rejected at this stage if they:

  • Contained no original data (including literature and clinical reviews, journalistic / editorial pieces, letters and commentaries);
  • Included only case studies or selected case series;
  • Pertained to conditions other than GD (e.g., disorders of sexual development or HIV);
  • Did not include clinically-identified GD (e.g., survey where participants self-identify, with no clinical contact);
  • Pertained to populations other than those with GD (e.g., LGBTQ more broadly);
  • Pertained to populations including or restricted to those aged 18 years or older. This included papers where adolescents and adults were included in the same sample, but adolescents were not separately reported (in many cases age range was not reported and so a ‘balance of probabilities’ assessment had to be made based on the reported mean age);
  • Pertained to populations restricted to those aged under 12 years of age. This included papers where adolescents and children were included in the same sample, but the majority of participants were clearly under 12 (based on mean or median age);
  • Involved practitioners, not patients, as participants;
  • Referred only to conference proceedings;
  • Were written in a non-European language (e.g., Turkish);
  • Could not be obtained (including due to being published in non-English language journals, or in theses).

Following initial full text screening, all remaining papers were assessed by a second reviewer to reduce the risk of inclusion bias. Where reviewers reached a different conclusion, discussion took place to reach consensus. If agreement could not be reached, a third reviewer was consulted, and discussion used to reach consensus amongst all three reviewers.

Data extracted from eligible papers were tabulated and used in the quantitative and qualitative synthesis. The following information was recorded: sample size; natal sex; age (years); dates of data collection; study design; study location. Given the limited number of specialist treatment centres globally, we assessed how many of the included papers featured the same or overlapping samples.

Quality assessment

All papers were rated by two reviewers using the Crowe Critical Appraisal Tool v1·4 (CCAT [19]). CCAT is suitable for a range of methodological approaches, assessing papers in terms of eight categories: Preliminaries (overall clarity and quality); Introduction; Design; Sampling; Data collection; Ethical matters; Results; Discussion. Each category is rated out of 5 and all eight categories summed to give a total out of 40 (converted to a percentage). In the present review, each paper was then assigned to one of five categories, based on the average rating of the reviewers, where a rating of 0–20% was coded 1 (poorest quality), and 81–100% coded 5 (highest quality). Inter-rater reliability was shown to be very good (k = 0·93, SE = 0·05).
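The scoring arithmetic described above can be illustrated with a minimal Python sketch. It is not the authors' own tooling. It assumes each category is scored from 0 to 5, and that the three intermediate bands (which the text does not spell out) are equal 20-point steps between the stated endpoints of 0–20% and 81–100%; an averaged percentage that falls between two bands (for example 80·5%) is assigned to the higher band, which is also an assumption. The function names and the example scores are hypothetical.

```python
import math

CCAT_CATEGORIES = (
    "Preliminaries", "Introduction", "Design", "Sampling",
    "Data collection", "Ethical matters", "Results", "Discussion",
)


def ccat_percent(scores):
    """Sum the eight category scores and express the total out of 40 as a percentage."""
    if set(scores) != set(CCAT_CATEGORIES):
        raise ValueError("expected exactly the eight CCAT categories")
    if any(not 0 <= value <= 5 for value in scores.values()):
        raise ValueError("each category score must lie between 0 and 5")  # assumed range
    return 100 * sum(scores.values()) / 40


def quality_band(percent):
    """Map a percentage to a 1-5 band, assuming equal 20-point bands."""
    return max(1, math.ceil(percent / 20))


def paper_band(reviewer_a, reviewer_b):
    """Average the two reviewers' percentages, then assign the band."""
    mean_percent = (ccat_percent(reviewer_a) + ccat_percent(reviewer_b)) / 2
    return mean_percent, quality_band(mean_percent)


# Hypothetical ratings for one paper (not taken from the review).
reviewer_a = dict.fromkeys(CCAT_CATEGORIES, 4)   # total 32/40
reviewer_b = {**reviewer_a, "Design": 5}         # total 33/40
print(paper_band(reviewer_a, reviewer_b))
```

Under these assumptions a paper at the lowest reported rating (45%) would fall in band 3 and one at the highest (96%) in band 5.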

Data collection process

Data were extracted from the papers using the CCAT form by two reviewers per paper and compiled by the first author (LT). Any data missing from forms were extracted by LT. Once compiled, instances of overlap between papers (i.e., if the same sample was described in two papers) were identified and tabulated, and the final sample for each question defined.
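As an illustration only, the overlap check described above could be sketched in Python as follows. This is a simplification and not the procedure the authors report: it flags two samples as potentially overlapping when they come from the same centre and their data collection periods intersect, and it ignores any other variables a reviewer might weigh. The `Sample` structure, the paper labels and the centre names are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Sample:
    paper: str        # hypothetical label for the source paper
    centre: str       # treatment centre the sample was drawn from
    start_year: int   # first year of data collection
    end_year: int     # last year of data collection


def may_overlap(a, b):
    """True when both samples share a centre and their date ranges intersect."""
    return (
        a.centre == b.centre
        and a.start_year <= b.end_year
        and b.start_year <= a.end_year
    )


def flag_overlaps(samples):
    """Return every pair of paper labels whose samples may overlap."""
    return [
        (a.paper, b.paper)
        for a, b in combinations(samples, 2)
        if may_overlap(a, b)
    ]


# Hypothetical samples, for illustration only.
samples = [
    Sample("Paper A", "Centre X", 2000, 2010),
    Sample("Paper B", "Centre X", 2008, 2015),
    Sample("Paper C", "Centre Y", 2008, 2015),
    Sample("Paper D", "Centre X", 2016, 2018),
]
print(flag_overlaps(samples))
```

A flagged pair is only a candidate for overlap; deciding whether two papers truly describe the same participants still requires reading both.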


Number of studies included, retained and excluded

The PRISMA diagram in Fig 1 provides details of the screening and exclusion process. The searches returned 8655 results, reduced to 6202 following de-duplication. Titles and abstracts were screened by one reviewer (LT): 4659 records were excluded after initial screening and a further 699 on second-stage title / abstract screening. This left 553 eligible for full text screening. An initial screening (LT) of full texts reduced the number of records to 155. Forty-eight papers were included in the final dataset, of which 38 included data for the present paper. Full characteristics of included studies are provided in Table 2.

The largest numbers of samples were from the Netherlands (n = 10) and the USA (n = 10), followed by the UK (n = 7), Canada (n = 5), Belgium (n = 2), Finland (n = 2), Germany (n = 1), Israel (n = 1), Australia (n = 1), Italy (n = 1), Switzerland (n = 1) and Turkey (n = 1) (note two papers together described six samples, hence the total is 42). The Netherlands data all pertained to the same centre and research group. All seven of the UK samples came from the same Gender Identity Development Service (GIDS: Tavistock & Portman NHS Trust) in London, and three of the Canadian papers came from the same Transgender Youth Clinic in Toronto. Accordingly, not all 42 samples are necessarily mutually exclusive. Overlapping samples were not always acknowledged, and so where overlap may have occurred (based on location, setting, age and date variables) this has been noted and taken into account in any analysis. Fig 2 provides a graphical representation of overlap between samples and indicates which papers contributed data to which analyses. Based on the reported information, we estimate that in total between 3000 and 4000 adolescents assessed at specialist centres for GD between 1980 and 2019 were included in the 38 papers.
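The sample tally by country given above can be checked with a few lines of Python. The counts are copied from the text; the variable names are illustrative.

```python
# Number of samples per country, as reported in the text above.
samples_by_country = {
    "Netherlands": 10, "USA": 10, "UK": 7, "Canada": 5,
    "Belgium": 2, "Finland": 2, "Germany": 1, "Israel": 1,
    "Australia": 1, "Italy": 1, "Switzerland": 1, "Turkey": 1,
}

total_samples = sum(samples_by_country.values())
print(total_samples)  # 42

# 38 papers, two of which together described six samples:
# 36 single-sample papers plus 6 samples gives the same total.
assert total_samples == (38 - 2) + 6
```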

Fig 2. Overlap between included samples.

Key -//-: dates truncated; *: year of publication; NL: Netherlands; CA: Canada; UK: United Kingdom; AU: Australia; BE: Belgium; FI: Finland; DE: Germany; IL: Israel; IT: Italy; CH: Switzerland; TR: Turkey; US: United States.

Most studies were observational cohort studies, usually using retrospective record review (n = 26). A few studies included comparison to a normative sample or given population norms (n = 6). All but one of the papers were published within the past ten years (2011 or later) and all but seven in the past five years (2016 or later). Only five papers explicitly included data from before 2000 (a further six may have included pre-2000 data but did not report dates). All papers included both NM and NF participants and reported the proportion of each in their sample, and most included age data (with age at assessment being the most widely reported) (see Table 2 and Fig 2).

Twenty-four samples were reported to have met clinical diagnostic criteria for GD / GID, usually using one of the DSM manuals (5/28 did not state which criteria were applied). The remaining 18 samples did not report whether participants met diagnostic criteria, but were included on the basis of being established patients within a specialist treatment centre, either in active assessment or treatment (n = 14) or were the result of secondary data mining where ICD 9/10 codes and appropriate keywords were used to establish likely GD (n = 4).

A substantial group of papers narrowly missed inclusion criteria, mostly on the age criterion and some on the verified GD criterion, and were not included in the final sample of reviewed papers. We documented characteristics of all studies excluded at the final full text screen in Table 5.

Overall findings based on included studies

1. What is the prevalence of GD in adolescence?

It is not possible to address this question from the existing literature. Whilst a number of surveys exist that would allow one to make estimates of prevalence, none were conducted using whole population samples. Further, we chose to focus this review only on papers where GD had been clinically verified as we were interested in adolescents seeking and considered eligible for intervention. Given that the Amsterdam and London centres are the only specialist centres in their respective countries, both of which have state-funded health systems, it would not be unreasonable to use their data as a likely indication of incidence / prevalence. The figures reported in de Graaf et al. [20] give the largest and most recent sample from these two locations (252 and 610 respectively), but these are sub-samples of the clinic population for whom data were available, and they do not comment on prevalence as a proportion of the population. It is possible to say there has been an increase in adolescents presenting for treatment in recent years. For example, Chen et al. [21] report that the majority of their sample (73·6%) presented for treatment in the final two years of a 13-year period (2002–2015).

2. What are the proportions of natal males / females with GD in adolescence and has this changed over time?

All included studies featured data on participants’ natal sex, usually at the time of first being assessed by a specialist gender clinic. A simple pooling of proportions from all the papers indicates 36% were natal males and 64% natal females. Restricting our analysis only to those studies we could be certain had distinct samples (and aiming to select the largest / widest date range within, see Fig 2), the proportions remained similar at 37% natal male and 63% natal female.
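The text does not specify how the proportions were pooled, so the following Python sketch shows two plausible readings side by side: pooling the raw counts (which weights each study by its size) and averaging the per-study proportions (which weights each study equally). The study counts are invented for illustration and are not data from the review; the function names are hypothetical.

```python
def pooled_share(studies):
    """Count-weighted pooling: total natal males divided by total participants."""
    natal_males = sum(study["nm"] for study in studies)
    natal_females = sum(study["nf"] for study in studies)
    return natal_males / (natal_males + natal_females)


def mean_of_shares(studies):
    """Unweighted pooling: the average of each study's natal male proportion."""
    shares = [study["nm"] / (study["nm"] + study["nf"]) for study in studies]
    return sum(shares) / len(shares)


# Hypothetical counts of natal males (nm) and natal females (nf).
studies = [
    {"id": "Study 1", "nm": 40, "nf": 60},
    {"id": "Study 2", "nm": 30, "nf": 70},
    {"id": "Study 3", "nm": 100, "nf": 100},
]

print(f"count-weighted: {pooled_share(studies):.1%}")
print(f"unweighted:     {mean_of_shares(studies):.1%}")
```

The two approaches diverge when study sizes differ, and overlapping samples would be double-counted under either, which is why restricting the calculation to distinct samples serves as a useful sensitivity check.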

One paper addressed the question of a recent shift in natal sex ratio directly. Chiniara et al. [22] conducted a within-sample analysis of their 2014–2016 referred participants in Toronto and found no change in that short time period. They also compared the natal sex ratio in their sample to those previously published and found a shift in more recent years (1:3 favouring natal females vs 0·8–0·9:1 in earlier studies).

Although only a few papers in our sample addressed the question directly, we used pooled data to explore whether there is evidence of a shift in recent years to more natal boys or girls seeking assessment / treatment. Papers were grouped into three categories according to the date range that samples were assessed: pre-2000; 2001–2010; 2011 onward. This was challenging as most studies were retrospective chart reviews covering wide date ranges from the late 1980s to beyond 2010. However, it is possible to say that the proportion of natal males is slightly lower (30%) in those papers featuring participants assessed only from 2011 onwards (ten papers). These data are summarised in Table 2.

3. What is the pattern of age at (a) onset (b) referral and (c) assessment?

Only one of the included papers focused specifically on age of onset: Matthews et al. [23] reported a mean age of onset of 6·80 years (SD 3·9) (range 1–15) among 168 referrals to the London GIDS. Six papers reported explicitly on age of referral (Costa et al., 2015 [24]; de Graaf et al., 2018 [20]; Heard et al., 2018 [25]; Holt et al., 2016 [26]; Mahfouda et al. [27]; Matthews et al., 2019 [23]), giving a pooled mean age of referral of 13·2 years (SD 0·9) (data from Costa et al., 2015 [24], excluded due to overlap with de Graaf et al., 2018 [20]). One paper (Becerra-Culqui et al., 2018 [28]) used medical records and took participants’ age from ‘the first evidence of transgender and/or gender nonconforming status’ based on the presence of certain keywords in medical notes. Some papers (e.g., Chen et al., 2016 [21]) were not explicit about whether they were reporting age at referral or assessment, but usually it could be inferred that the reported age was at assessment.

Age was most usually reported at point of assessment or intervention (see Table 3). Not all papers reported full age data: a mean and standard deviation was usually given, but not always a range. The pooled mean age of assessment was 15·1 years (SD 1·0) and the range (from fewer papers) was 6·0–18·0 years.
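How the pooled mean and SD were calculated is not stated, so the Python sketch below is only one possible approach: a sample-size-weighted mean of the study means, alongside the unweighted mean and the standard deviation of the study means themselves. The study means and sample sizes are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def weighted_mean(study_means, sample_sizes):
    """Mean of study means, weighting each study by its sample size."""
    total = sum(m * n for m, n in zip(study_means, sample_sizes))
    return total / sum(sample_sizes)


# Hypothetical mean ages at assessment (years) and sample sizes.
study_means = [14.0, 15.0, 16.0]
sample_sizes = [100, 200, 100]

print(weighted_mean(study_means, sample_sizes))  # size-weighted mean of study means
print(mean(study_means))                         # unweighted mean of study means
print(stdev(study_means))                        # spread between studies, not between individuals
```

An SD calculated across study means describes variation between studies, which is much narrower than the variation in age among individual participants.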

Quality assessment

The CCAT quality ratings ranged from 45% to 96%, with a mean of 78%. Most papers achieved an overall rating of 4 (good) or 5 (very good), with strengths and weaknesses within certain discrete categories; most papers achieved good ratings in the ‘preliminaries’ and ‘introduction’ categories, whereas the ‘ethics’ and ‘discussion’ categories were most likely to include lower ratings: 17 and 16 papers respectively achieved ratings below 4. In total, only one paper was rated as 3 (moderate quality): Cohen-Kettenis & Van Goozen (2002) obtained low ratings across most categories, due to unclear sampling and diagnostic information, lack of information to permit replication, and conclusions which are not supported by the findings. Of the remainder, 16 were rated as high quality (4), and 21 as very high quality (5; see Table 4). There was no relationship between the year of publication and quality rating (r = 0·2).

Table 4. Quality ratings using Crowe Critical Appraisal Tool (CCAT).


This systematic review synthesises the current evidence regarding the age and natal sex of adolescents presenting to specialist services and assessed as having gender dysphoria (GD). Based on 38 papers meeting inclusion criteria, there is evidence of an increase in frequency of presentation to services since 2011, and of a shift in the natal sex of referred cases: those assigned female at birth are now in the majority. Within these samples the average age of referral was 13 years, and the average age of assessment was 15 years. This review is the first of its kind to focus on adolescent samples where diagnostic criteria for GD were met, or significant GD features were clinically verified.

Although other good quality review papers have been published [12, 17, 29–31], they have tended not to apply a systematic review methodology or have taken a broader scope in their inclusion criteria. We believe this is the first systematic review focused only on adolescents aged under 18 years and on clinically-verified samples taking into account likely study overlap.

Due to a lack of population-based research including cases of clinically-verified GD, this review was unable to report overall prevalence of GD (although it would not be unreasonable to use the Amsterdam and London data to make a good estimate). At present, the only means of estimating prevalence is to use population-based survey data, which carries risk of respondent bias (and such papers were excluded from our sample). Some studies used administrative records to ascertain samples of adolescents with GD. The reliability of this method is dependent on GD being accurately recorded, and on administrative data systems having universal coverage. These criteria could not be met by the samples included in the present review and should remain a focus for development in future research.

This review confirmed that the increase in referrals and the shift in sex-ratio that has been observed more widely in survey data and referred populations is also present in clinically-verified samples. We were able to report data from relatively few papers, however. Given the size of the literature, it would be useful if more studies clearly reported or clearly differentiated samples according to the stage of identification / referral / assessment participants had reached. It is clear that many of the samples reported in this review began as much larger samples with significant attrition before completing assessment and / or intervention. Not all papers fully reported either attrition figures or reasons for drop-out. Most papers were cross-sectional descriptions of cohorts of clinic patients with no longer term follow-up or comparison with control or normative samples, which limits the external generalisability of the findings.

We were unable to report on age of onset of GD as this was rarely reported. Where it was reported, it was based on patient / parent recall and difficult to verify clinically (e.g., Matthews et al. [23] report a lower age of onset of 1 year). Age of referral also proved difficult to report on as most samples that could be classed as clinically-verified were further along the line in their clinical journey. Whilst some papers reported age at referral, it was not always clear whether ‘referral’ meant the age at which a young person first sought help and was referred to a specialist service, or the age at which they had their first contact with that service. Given that most treatment takes place at national specialist centres, waiting lists may mean that age of first specialist contact may be two or three years beyond age at first referral. This will vary according to the economic basis of national health systems.

Age at assessment data reported in this review suggests that young people may not be receiving specialist support when they need it. The Dutch model, developed by the Center of Expertise on GD in the Netherlands where GD interventions have been pioneered, considers age 12 years to be the lower threshold for puberty suppression treatment, and age 16 as the threshold for cross-sex hormone treatment [6]. Participants in our samples had an average age of 15 years at the time of assessment, and so are likely to have already undergone significant pubertal changes associated with their natal sex. It may be that young people are first presenting to services once pubertal changes have begun and GD has become established, but delays between first presentation to services and being seen in a specialist service may also be important. It is possible that critical windows of opportunity for intervention are being missed. Our third review paper will focus on the data regarding age at intervention and related outcomes in more detail. More transparent reporting of age data and consistent use of terminology would allow us to better understand the clinical landscape and could inform rational service development.

Strengths and limitations

This review has strength in the broad search strategy and thorough hand screening process applied. There are methodological limitations which need to be considered. The application of strict inclusion criteria in a rapidly growing field, with new findings emerging on an almost daily basis, means it is impossible to be completely up to date. For example, Zucker and Aitken (2019) [32] recently confirmed the shift in sex ratio of transgender adolescents in a large meta-analysis, but this was not included as it is, at present, a conference proceeding and not a peer-reviewed paper. The broad initial search criteria led to the need for some narrowing of criteria following initial screening (but prior to full-text screening). The addition of parameters regarding type of publication, upper age of participants, and the clinical verification of GD naturally narrowed the pool of papers and therefore may have meant papers with important findings have been excluded (for example, if a paper included an upper age limit of 21 even though the majority were younger than 18). We recorded all papers that only narrowly missed inclusion on the age criterion (Table 5), but this will naturally have led to important papers in the field being missed from the review. For example, the paper by Aitken et al. (2015) had to be excluded as in Sample 1 (Canadian) the eldest participant was 19 years old, and in Sample 2 (Netherlands) diagnostic status was unclear. Alberse et al. (2019) [33] had an upper age range of 18·03, requiring its exclusion. These were otherwise good papers that we would like to have been able to include. However, it was important that we consistently apply our a priori criteria at every stage of screening, even when it meant that important papers only very narrowly missing inclusion had to be excluded. To apply flexibility only at the final screen presented a risk to the integrity of screening at earlier stages. We do not wish to give the impression that the importance of these papers has not been considered. There are several non-systematic reviews available that will include these high-profile papers. Our objective was to take in the totality of the literature and then examine the state of the evidence once strict criteria were applied.

Table 5. Papers excluded at second full text screen (i.e., closely missed meeting inclusion criteria).

Due to the scale of the overall review, no formal hand searching was included, although we did check whether any relevant texts cited in our review of papers had been included in our sample, and they had (either excluded at an earlier stage or included in the final sample). We opted to use a quality assessment tool for studies of diverse designs (CCAT). This allowed all papers to be rated using the same system, but also involved reviewers having to make subjective ratings rather than apply a strictly quantifiable checklist. As a result, some quality issues, such as over-statement of the significance of findings, may not have been given sufficient prominence. The quality of the literature is mixed, and we were unable to clearly answer all the research questions. Although this presents a limitation to this review, it also constitutes an important finding in and of itself.

Although we were able to include 38 papers from a range of countries in this review, almost half arose from two well-established treatment centres: those in Amsterdam and London. The Amsterdam team has led the way in developing assessment and treatment protocols for GD and provides a wealth of data over a long period (since 1996 within the included papers), and the London GIDS is a hub for the whole of the UK, now dealing with hundreds of referrals per year. This presents the advantage of being able to observe the adolescent GD population over a long period of time, assessed using the same or similar tools, and within a relatively stable social context. It is not clear, however, what proportion of young people experiencing GD have access to these national specialist centres and how many may be accessing private facilities or self-medicating with hormones obtained via other routes: we do not know how representative these samples are. Another disadvantage is that most of the papers included in this review are likely to include data from the same samples of participants, which also limits generalisability. The overlap between samples was rarely overtly stated, and there is a risk that readers may give greater weight to collective findings than is warranted. The samples included here may also represent those most severely affected not only by GD but by poor mental wellbeing more generally. The second paper in this series will focus specifically on the evidence regarding psychological distress / psychiatric comorbidity in young people with GD.


GD is an area of growing prominence and therefore generates a growing literature. The observed increase in referrals, particularly in NF adolescents, warrants further investigation. Whilst improvements in availability of services and diagnostic practices are likely to have contributed to this, there has undoubtedly been a shift in cultural attitudes leading to gender non-conformity being more acceptable. The role of de-stigmatisation in the experiences of young people with GD and their decisions to seek support should be explored within the context of mental wellbeing more broadly. It is clear that this is a particularly vulnerable population often presenting as psychologically complex cases: without good epidemiological data we cannot begin to elucidate the lived reality of GD and ensure that intervention / support is equitable, appropriate and timely, and minimises harm. This review has been limited by heterogeneity in recording and reporting practices, and by limited representation beyond national, publicly-funded clinical services. Clinical research centres should gather data prospectively on all referrals with full informed consent and document their assessment protocols, treatments and outcomes. Whole population studies using administrative datasets reporting on GD / gender non-conformity may be necessary to gain a clear understanding of the epidemiology of clinical GD, along with inter-disciplinary research evaluating the lived experience of adolescents with GD.

Supporting information

S1 Checklist. PRISMA checklist.

From: Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 2021;372:n71. doi: 10.1136/bmj.n71.



Acknowledgments

Special thanks to Ingrid Vinsa, Research Nurse and Administrator at the Gillberg Neuropsychiatry Centre, for her invaluable assistance in obtaining full text papers and assistance to CG in supervision of this piece of work.


  1. American Psychiatric Association. Diagnostic and statistical manual of mental disorders: DSM-5. 5 ed. Washington, DC.: American Psychiatric Publishing; 2013.
  2. Aitken M, Steensma TD, Blanchard R, Vanderlaan DP, Wood H, Fuentes A, et al. Evidence for an altered sex ratio in clinic-referred adolescents with gender dysphoria. Journal of Sexual Medicine. 2015;12(3):756–63. pmid:25612159
  3. Arnoldussen M, Steensma TD, Popma A, van der Miesen AIR, Twisk JWR, de Vries ALC. Re-evaluation of the Dutch approach: are recently referred transgender youth different compared to earlier referrals? European Child and Adolescent Psychiatry. 2019. pmid:31473831
  4. de Graaf NM, Giovanardi G, Zitz C, Carmichael P. Sex ratio in children and adolescents referred to the Gender Identity Development Service in the UK (2009–2016). Archives of Sexual Behavior. 2018;47(5):1301–4. pmid:29696550
  5. Steensma TD, Cohen-Kettenis PT, Zucker KJ. Evidence for a Change in the Sex Ratio of Children Referred for Gender Dysphoria: Data from the Center of Expertise on Gender Dysphoria in Amsterdam (1988–2016). Journal of Sex & Marital Therapy. 2018;44(7):713–5. pmid:29412073
  6. De Vries ALC, McGuire JK, Steensma TD, Wagenaar ECF, Doreleijers TAH, Cohen-Kettenis PT. Young adult psychological outcome after puberty suppression and gender reassignment. Pediatrics. 2014;134(4):696–704. pmid:25201798
  7. di Giacomo E, Krausz M, Colmegna F, Aspesi F, Clerici M. Estimating the Risk of Attempted Suicide Among Sexual Minority Youths: A Systematic Review and Meta-analysis. JAMA Pediatrics. 2018;172(12):1145–52. pmid:30304350
  8. Garcia-Vega E, Camero A, Fernandez M, Villaverde A. Suicidal ideation and suicide attempts in persons with gender dysphoria. Psicothema. 2018;30(3):283–8. pmid:30009750
  9. Dhejne C, Van Vlerken R, Heylens G, Arcelus J. Mental health and gender dysphoria: A review of the literature. International Review of Psychiatry. 2016;28(1):44–57. pmid:26835611
  10. American Psychological Association. Answers to your questions: For a better understanding of sexual orientation and homosexuality. Washington, DC: American Psychological Association, 2008.
  11. Luyckx K, Schwartz SJ, Goossens L, Beyers W, Missotten L. Processes of Personal Identity Formation and Evaluation. In: Schwartz SJ, Luyckx K, Vignoles VL, editors. Handbook of Identity Theory and Research. New York, NY: Springer New York; 2011. p. 77–98.
  12. Kaltiala-Heino R, Bergman H, Työläjärvi M, Frisén L. Gender dysphoria in adolescence: current perspectives. Adolesc Health Med Ther. 2018;9:31–41. pmid:29535563
  13. Griffith R. What is Gillick competence? Hum Vaccin Immunother. 2016;12(1):244–7. pmid:26619366
  14. Limb R, James L. Journal of Medical Ethics blog [Internet].
  15. Thornton J. Court upholds Gillick competence in puberty blockers case. The Lancet. 2021;398(10307):1205–6. pmid:34563272
  16. de Vries ALC, Richards C, Tishelman AC, Motmans J, Hannema SE, Green J, et al. Bell v Tavistock and Portman NHS Foundation Trust [2020] EWHC 3274: Weighing current knowledge and uncertainties in decisions about gender-related treatment for transgender adolescents. International Journal of Transgender Health. 2021:1–8. pmid:33899047
  17. SBU. Gender dysphoria in children and adolescents: an inventory of the literature. A systematic scoping review.
  18. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. pmid:33782057
  19. Crowe M, Sheppard L, Campbell A. Reliability analysis for a proposed critical appraisal tool demonstrated value for diverse research designs. Journal of clinical epidemiology. 2012;65(4):375–83. Epub 2011/11/15. pmid:22078576
  20. de Graaf NM, Cohen-Kettenis PT, Carmichael P, de Vries ALC, Dhondt K, Laridaen J, et al. Psychological functioning in adolescents referred to specialist gender identity clinics across Europe: A clinical comparison study between four clinics. European Child & Adolescent Psychiatry. 2018;27(7):909–19. pmid:29256158
  21. Chen M, Fuqua J, Eugster EA. Characteristics of Referrals for Gender Dysphoria over a 13-Year Period. Journal of Adolescent Health. 2016;58(3):369–71.
  22. Chiniara LN, Bonifacio HJ, Palmert MR. Characteristics of adolescents referred to a gender clinic: Are youth seen now different from those in initial reports? Hormone Research in Paediatrics. 2018;89(6):434–41. pmid:29920505
  23. Matthews T, Holt V, Sahin S, Taylor A, Griksaitis D. Gender Dysphoria in looked-after and adopted young people in a gender identity development service. Clinical Child Psychology & Psychiatry. 2019;24(1):112–28. pmid:30101601
  24. Costa R, Dunsford M, Skagerberg E, Holt V, Carmichael P, Colizzi M. Psychological Support, Puberty Suppression, and Psychosocial Functioning in Adolescents with Gender Dysphoria. Journal of Sexual Medicine. 2015;12(11):2206–14. pmid:60706063
  25. Heard J, Morris A, Kirouac N, Ducharme J, Trepel S, Wicklow B. Gender dysphoria assessment and action for youth: Review of health care services and experiences of trans youth in Manitoba. Paediatrics & Child Health. 2018;23(3):179–84. pmid:29769803
  26. Holt V, Skagerberg E, Dunsford M. Young people with features of gender dysphoria: Demographics and associated difficulties. Clinical child psychology and psychiatry. 2016;21(1):108–18. pmid:25431051
  27. Mahfouda S, Panos C, Whitehouse AJO, Thomas CS, Maybery M, Strauss P, et al. Mental Health Correlates of Autism Spectrum Disorder in Gender Diverse Young People: Evidence from a Specialised Child and Adolescent Gender Clinic in Australia. Journal of Clinical Medicine. 2019;8(10):20. pmid:31547002
  28. Becerra-Culqui TA, Liu Y, Nash R, Cromwell L, Flanders WD, Getahun D, et al. Mental health of transgender and gender nonconforming youth compared with their peers. Pediatrics. 2018;141(5):e20173845. pmid:29661941
  29. Leibowitz S, de Vries ALC. Gender dysphoria in adolescence. International Review of Psychiatry. 2016;28(1):21–35. pmid:26828376
  30. Skordis N, Kyriakou A, Dror S, Mushailov A, Nicolaides NC. Gender dysphoria in children and adolescents: an overview. Hormones (Athens, Greece). 2020;19(3):267–76. Epub 2020/02/06. pmid:32020566
  31. Zucker KJ. Epidemiology of gender dysphoria and transgender identity. Sexual Health. 2017;14(5):404–11. pmid:28838353
  32. Zucker KJ, Aitken M, editors. Sex Ratio of Transgender Adolescents: A Meta-Analysis. 3rd biennial EPATH Conference Inside Matters On Law, Ethics and Religion; 2019; Rome, Italy: EPATH.
  33. Alberse AME, de Vries AL, Elzinga WS, Steensma TD. Self-perception of transgender clinic referred gender diverse children and adolescents. Clinical child psychology and psychiatry. 2019;24(2):388–401. pmid:30672324
  34. Tack LJW, Craen M, Lapauw B, Goemaere S, Toye K, Kaufman JM, et al. Proandrogenic and Antiandrogenic Progestins in Transgender Youth: Differential Effects on Body Composition and Bone Metabolism. Journal of Clinical Endocrinology and Metabolism. 2018;103(6):2147–56. pmid:29672753
  35. Feder S, Isserlin L, Seale E, Hammond N, Norris ML. Exploring the association between eating disorders and gender dysphoria in youth. Eating Disorders. 2017;25(4):310–7. pmid:28281883
  36. Sorbara JC, Chiniara LN, Thompson S, Palmert MR. Mental Health and Timing of Gender-Affirming Care. Pediatrics. 2020;146(4):10. pmid:32958610
  37. Kaltiala-Heino R, Sumia M, Tyolajarvi M, Lindberg N. Two years of gender identity service for minors: Overrepresentation of natal girls with severe problems in adolescent development. Child and Adolescent Psychiatry and Mental Health. 2015;9(1):9. pmid:25657818
  38. Kaltiala-Heino R, Tyolajarvi M, Lindberg N. Sexual experiences of clinically referred adolescents with features of gender dysphoria. Clinical child psychology and psychiatry. 2019;24(2):365–78. pmid:30968725
  39. Levitan N, Barkmann C, Richter-Appelt H, Schulte-Markwort M, Becker-Hebly I. Risk factors for psychological functioning in German adolescents with gender dysphoria: poor peer relations and general family functioning. European Child and Adolescent Psychiatry. 2019. pmid:30877477
  40. Amir H, Yaish I, Oren A, Groutz A, Greenman Y, Azem F. Fertility preservation rates among transgender women compared with transgender men receiving comprehensive fertility counselling. Reproductive Biomedicine Online. 2020;41(3):546–54. pmid:32651108
  41. Fisher AD, Ristori J, Castellini G, Sensi C, Cassioli E, Prunas A, et al. Psychological characteristics of Italian gender dysphoric adolescents: a case-control study. Journal of Endocrinological Investigation. 2017;40(9):953–65. pmid:28357782
  42. de Vries ALC, Steensma TD, Cohen-Kettenis PT, VanderLaan DP, Zucker KJ. Poor peer relations predict parent- and self-reported behavioral and emotional problems of adolescents with gender dysphoria: a cross-national, cross-clinic comparative analysis. European Child and Adolescent Psychiatry. 2016;25(6):579–88. pmid:26373289
  43. Cohen-Kettenis PT, Van Goozen SHM. Adolescents who are eligible for sex reassignment surgery: Parental reports of emotional and behavioural problems. Clinical Child Psychology and Psychiatry. 2002;7(3):412–22. pmid:34886951
  44. de Vries AL, Doreleijers TA, Steensma TD, Cohen-Kettenis PT. Psychiatric comorbidity in gender dysphoric adolescents. Journal of child psychology and psychiatry, and allied disciplines. 2011;52(11):1195–202. pmid:21671938
  45. De Vries AL, Steensma TD, Doreleijers TA, Cohen-Kettenis PT. Puberty suppression in adolescents with gender identity disorder: A prospective follow-up study. Journal of Sexual Medicine. 2011;8(8):2276–83. pmid:20646177
  46. Klaver M, de Mutsert R, Wiepjes CM, Twisk JWR, den Heijer M, Rotteveel J, et al. Early Hormonal Treatment Affects Body Composition and Body Shape in Young Transgender Adolescents. Journal of Sexual Medicine. 2018;15(2):251–60. pmid:29425666
  47. Klaver M, de Mutsert R, van der Loos M, Wiepjes CM, Twisk JWR, den Heijer M, et al. Hormonal Treatment and Cardiovascular Risk Profile in Transgender Adolescents. Pediatrics. 2020;145(3):03. pmid:32102929
  48. Schagen SE, Delemarre-van de Waal HA, Blanchard R, Cohen-Kettenis PT. Sibling sex ratio and birth order in early-onset gender dysphoric adolescents. Archives of sexual behavior. 2012;41(3):541–9. pmid:21674256
  49. van der Miesen AIR, de Vries ALC, Steensma TD, Hartman CA. Autistic Symptoms in Children and Adolescents with Gender Dysphoria. Journal of Autism and Developmental Disorders. 2018;48(5):1537–48. pmid:29189919
  50. Akgul GY, Ayaz AB, Yildirim B, Fis NP. Autistic Traits and Executive Functions in Children and Adolescents With Gender Dysphoria. Journal of sex & marital therapy. 2018;44(7):619–26. pmid:29419374
  51. Joseph T, Ting J, Butler G. The effect of GnRH analogue treatment on bone mineral density in young adolescents with gender dysphoria: findings from a large national cohort. Journal of pediatric endocrinology & metabolism: JPEM. 2019;31.
  52. Russell I, Pearson B, Masic U. A Longitudinal Study of Features Associated with Autism Spectrum in Clinic Referred, Gender Diverse Adolescents Accessing Puberty Suppression Treatment. Journal of Autism and Developmental Disorders. 2020.
  53. Skagerberg E, Davidson S, Carmichael P. Internalizing and externalizing behaviors in a group of young people with gender dysphoria. International Journal of Transgenderism. 2013;14(3):105–12.
  54. Edwards-Leeper L, Feldman HA, Lash BR, Shumer DE, Tishelman AC. Psychological profile of the first sample of transgender youth presenting for medical intervention in a US pediatric gender center. Psychology of Sexual Orientation and Gender Diversity. 2017;4(3):374–82.
  55. Kuper LE, Lindley L, Lopez X. Exploring the gender development histories of children and adolescents presenting for gender affirming medical care. Clinical Practice in Pediatric Psychology. 2019;7(3):217–28.
  56. Kuper LE, Mathews S, Lau M. Baseline Mental Health and Psychosocial Functioning of Transgender Adolescents Seeking Gender-Affirming Hormone Therapy. Journal of developmental and behavioral pediatrics: JDBP. 2019;03. pmid:31166250
  57. Kuper LE, Stewart S, Preston S, Lau M, Lopez X. Body Dissatisfaction and Mental Health Outcomes of Youth on Gender-Affirming Hormone Therapy. Pediatrics. 2020;145(4):04. pmid:32220906
  58. Lopez CM, Solomon D, Boulware SD, Christison-Lagay ER. Trends in the use of puberty blockers among transgender children in the United States. Journal of Pediatric Endocrinology and Metabolism. 2018;31(6):665–70. pmid:29715194
  59. Moyer DN, Connelly KJ, Holley AL. Using the PHQ-9 and GAD-7 to screen for acute distress in transgender youth: findings from a pediatric endocrinology clinic. Journal of Pediatric Endocrinology & Metabolism. 2019;32(1):71–4. pmid:30530884
  60. Nahata L, Quinn GP, Caltabellotta NM, Tishelman AC. Mental Health Concerns and Insurance Denials Among Transgender Adolescents. LGBT health. 2017;4(3):188–93. pmid:28402749
  61. Olson-Kennedy J, Chan YM, Garofalo R, Spack N, Chen D, Clark L, et al. Impact of Early Medical Treatment for Transgender Youth: Protocol for the Longitudinal, Observational Trans Youth Care Study. JMIR Research Protocols. 2019;8(7):e14434. pmid:31290407