Phenogenon: Gene to phenotype associations for rare genetic diseases

  • Nikolas Pontikos ,

    Contributed equally to this work with: Nikolas Pontikos, Cian Murphy, Ismail Moghul

    Roles Conceptualization, Data curation, Investigation, Software, Supervision, Visualization, Writing – original draft, Writing – review & editing

    Affiliations UCL Genetics Institute, University College London, London, United Kingdom, Institute of Ophthalmology, University College London, London, United Kingdom, Moorfields Eye Hospital, London, United Kingdom

  • Cian Murphy ,

    Contributed equally to this work with: Nikolas Pontikos, Cian Murphy, Ismail Moghul

    Roles Formal analysis, Investigation, Writing – original draft, Writing – review & editing

    Affiliations UCL Genetics Institute, University College London, London, United Kingdom, Warwick Medical School, University of Warwick, Coventry, United Kingdom

  • Ismail Moghul ,

    Contributed equally to this work with: Nikolas Pontikos, Cian Murphy, Ismail Moghul

    Roles Formal analysis, Investigation, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliation UCL Cancer Institute, University College London, London, United Kingdom

  • Gavin Arno,

    Roles Writing – review & editing

    Affiliations Institute of Ophthalmology, University College London, London, United Kingdom, Moorfields Eye Hospital, London, United Kingdom, Laboratory of Visual Physiology, Division of Vision Research, National Institute of Sensory Organs, National Hospital Organization Tokyo Medical Center, Tokyo, Japan

  • Kaoru Fujinami,

    Roles Resources

    Affiliations Institute of Ophthalmology, University College London, London, United Kingdom, Moorfields Eye Hospital, London, United Kingdom, Laboratory of Visual Physiology, Division of Vision Research, National Institute of Sensory Organs, National Hospital Organization Tokyo Medical Center, Tokyo, Japan, Department of Ophthalmology, Keio University School of Medicine, Tokyo, Japan

  • Yu Fujinami,

    Roles Resources

    Affiliations Graduate School of Health Management, Keio University, Tokyo, Japan, Division of Public Health, Yokokawa Clinic, Osaka, Japan

  • Dayyanah Sumodhee,

    Roles Writing – review & editing

    Affiliation Queen Mary University, Mile End Road, Bethnal Green, London, United Kingdom

  • Susan Downes,

    Roles Resources

    Affiliations Oxford Eye Hospital, West Wing, John Radcliffe Hospital, Oxford, United Kingdom, Nuffield Department of Clinical Neurosciences, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom

  • Andrew Webster,

    Roles Resources, Writing – review & editing

    Affiliations Institute of Ophthalmology, University College London, London, United Kingdom, Moorfields Eye Hospital, London, United Kingdom

  • Jing Yu ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Nuffield Department of Clinical Neurosciences, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom

  • UK Inherited Retinal Dystrophy Consortium, Phenopolis Consortium

    Membership of the UK Inherited Retinal Dystrophy Consortium and Phenopolis Consortium is provided in the Acknowledgments.

Abstract


As high-throughput sequencing is increasingly applied to the molecular diagnosis of rare Mendelian disorders, a large number of patients with diverse phenotypes have their genetic and phenotypic data pooled together to uncover new gene-phenotype relations. We introduce Phenogenon, a statistical tool that combines Human Phenotype Ontology (HPO) annotated patient phenotypes, gnomAD population allele frequencies, and Combined Annotation Dependent Depletion (CADD) scores for variant pathogenicity, in order to jointly predict the mode of inheritance and gene-phenotype associations. We ran Phenogenon on our cohort of 3,290 patients who had undergone whole exome sequencing. Among the top associations, we recapitulated previously known relations, such as "SRD5A3—Abnormal full-field electroretinogram—recessive" and "GRHL2 –Nail dystrophy—recessive", and discovered one potentially novel relation, “RRAGA–Abnormality of the skin—dominant”. We also developed an interactive web interface to visualise and explore the results.

Introduction


As DNA sequencing cost decreases, whole exome sequencing (WES) has become prevalent in the molecular testing of individuals with rare Mendelian disorders. This has led to the identification of many variants of unknown pathogenicity and clinical significance, with associated difficulty in variant interpretation. A common practice for variant prioritisation is to search for phenotypically similar disease cases with variants in known genes. Conventionally, this is done by searching databases such as dbSNP [1] and ClinVar [2] for genetic variants, Online Mendelian Inheritance in Man (OMIM) for genes, and targeted disease databases such as RetNet [3] for retinal dystrophy. However, when no candidate genes or variants are found in published cases with a known genetic diagnosis, an alternative solution is to group unsolved cases with similar phenotypes to increase the chances of finding shared genetic variations across genes.

The UK Inherited Retinal Dystrophy Consortium (UKIRDC) successfully applied this approach by whole exome sequencing 365 unsolved pre-screened retinal dystrophy patients from London, Leeds, Oxford and Manchester [4–9]. The WES and phenotype data were deposited in our Phenopolis database [10], which itself hosts 5122 exomes (as of 2nd February 2019) from patients with a range of disorders such as dementia, Crohn’s disease, seizures and bone-marrow failure (S1 Table).

This unique dataset provided the ideal opportunity to develop a novel statistical analysis tool, Phenogenon, in order to uncover gene-phenotype associations from large and phenotypically diverse cohorts of patients. The complete workflow of Phenogenon is described in S1 Fig. Phenogenon does not require explicit thresholds for variant filtering, which rely on assumptions of disease prevalence and mode of inheritance, but instead bins genetic variants according to their population frequencies (gnomAD) and predicted pathogenicity (CADD) to produce a two-dimensional heatmap for each gene-phenotype association. The HPO Goodness of Fit (HGF) score is then calculated from each heatmap, which allows genes to be prioritised per phenotype. In addition, the heatmap is used to derive a predicted mode of inheritance (MOI) for a gene-phenotype relation, which is a common use case when a novel gene is under consideration for a patient with unknown family history.

We applied Phenogenon to the Phenopolis exome dataset and were able to recapitulate known gene-phenotype relations, such as "SRD5A3—Abnormal full-field electroretinogram—recessive" and "GRHL2 –Nail dystrophy—recessive". We also discovered a potentially novel relation, "RRAGA–Abnormality of the skin—dominant".

Scripts to perform the Phenogenon analysis and an interactive visualisation tool are both available online.

Materials and methods

Patient phenotyping and selection

This study dataset includes 5122 exomes from the Phenopolis database comprising Mendelian and common disease patients. We used Human Phenotype Ontology [11] (HPO) as the standardised phenotype vocabulary for recording patient phenotypes, which were entered manually from patient notes by medical coders and extracted computationally from patient letters using cTAKES [12]. Patient relatedness was estimated using KING [13], and related individuals were excluded so as not to skew the genetic association tests. This resulted in a subset of 3290 exomes from unrelated individuals (Table 1).
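The exclusion of related individuals can be illustrated with a greedy pruning over pairwise kinship estimates. This is a minimal sketch and not the procedure used in the study: the function name `prune_related`, the input layout and the kinship threshold of 0.0884 (a commonly used cut-off for second-degree relatedness) are all assumptions, since the text does not state which threshold or removal strategy was applied.

```python
from collections import defaultdict


def prune_related(samples, kinship_pairs, threshold=0.0884):
    """Greedily drop samples until no kept pair has kinship >= threshold.

    kinship_pairs: iterable of (sample_a, sample_b, kinship) tuples,
    e.g. parsed from a pairwise relatedness report (hypothetical layout).
    """
    related = defaultdict(set)
    for a, b, kinship in kinship_pairs:
        if kinship >= threshold:
            related[a].add(b)
            related[b].add(a)

    kept = set(samples)
    while kept:
        # Pick the kept sample with the most related partners still kept;
        # ties are broken by sample name so the result is deterministic.
        worst = max(kept, key=lambda s: (len(related[s] & kept), s))
        if not related[worst] & kept:
            break  # no related pairs remain
        kept.remove(worst)
    return sorted(kept)
```

Removing the most connected sample first tends to keep more individuals than dropping one member of each pair arbitrarily, although it is not guaranteed to be optimal.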

Table 1. Total number of 3290 exomes by predominant phenotypes.

Variant calling and filtering

The variant calling and annotation pipeline has been described previously [10]. In brief, exomes were aligned using Novoalign to build GRCh37 of the human genome, and variants were called and filtered using the Genome Analysis Toolkit (GATK) best practices. Variants that did not pass the GATK filters, were not covered in gnomAD or were non-coding (defined as more than 5 base pairs away from the nearest coding region) were filtered out. Variants with a missing rate of 20% or more in our data were also discarded. This left a total of 973,426 variants, which were annotated with gnomAD frequencies [14] and CADD Phred scores [15]. gnomAD was used as it remains the largest resource for population-level variant frequency annotation, and CADD was used due to its popularity, its ability to score indels and its ease of local installation.
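The four filtering criteria above can be written as a single predicate. The following is a minimal sketch rather than the pipeline's actual code: the `Variant` record and its field names are hypothetical, and only the two numeric thresholds (5 base pairs, 20% missingness) come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    variant_id: str
    passed_gatk: bool             # True if the variant passed the GATK filters
    gnomad_af: Optional[float]    # None when the site is not covered in gnomAD
    distance_to_coding: int       # 0 inside a coding region, else bp to the nearest one
    missing_rate: float           # fraction of samples without a genotype call


def keep_variant(v, max_distance=5, max_missing=0.20):
    """Return True if the variant survives all four filters described above."""
    return (
        v.passed_gatk
        and v.gnomad_af is not None
        and v.distance_to_coding <= max_distance   # "non-coding" means > 5 bp away
        and v.missing_rate < max_missing           # discard if missing rate >= 20%
    )
```

A variant with a gnomAD allele frequency of exactly zero is kept here as long as the site is covered, which matches the later discussion of variants with GF equal to 0.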

Scoring “gene—Phenotype—Mode of inheritance” associations

We considered variant frequencies in gnomAD under both modes of inheritance (MOI), dominant or recessive. In the case of dominant inheritance, we defined the variant gnomAD frequency (GF) to be the gnomAD allele frequency, and in the case of recessive inheritance, the estimated homozygote frequency:

Given a gene, HPO term and MOI, variants found in the gene are binned according to their GF and CADD score (Fig 1A and 1B). We selected a bin height of 5 for CADD and a bin width of 1/4000 = 0.00025 for GF. From here on, variants with GF < 0.00025 are considered to be rare variants. Binned variants are then used to identify patient carriers, who are considered to be either cases or controls based on whether they have the selected HPO term or any of its child terms (Fig 1B). A case/control Fisher’s exact test (Fig 1C) is applied to each bin according to the contingency table in S1 Table. The test is repeated for all bins and a heatmap is produced, coloured by the negative logarithm of the p-values. This heatmap is referred to as the Phenogenon profile for a “gene—HPO—MOI” relationship (Fig 1D).

The z scores of the bins are then weighted (w_i), according to S2 Table, and summed using a variation of Stouffer’s Z-score method, in order to obtain an overall Z score for the “gene—HPO—MOI” relationship (Fig 1D and 1E): (1)

where k_rare is the number of non-empty rare bins (GF < 0.00025). The motivation for the scale factor k_rare is explained in S1 File. Finally, the Z score is converted to a p-value assuming a standard normal distribution, and the negative logarithm of the p-value is used to define the HPO Goodness of Fit (HGF) for that “gene—HPO—MOI” relationship: (2)

where ϕ is the cumulative distribution function of the standard normal distribution.
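The binning and per-bin testing steps can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' implementation: the function names and input layout are hypothetical, the 2x2 table layout (carriers versus non-carriers, cases versus controls) is assumed because the contingency table itself is only given in the supplement, and a one-sided test for enrichment of carriers among cases is assumed.

```python
from collections import defaultdict
from math import comb

GF_BIN_WIDTH = 1 / 4000    # 0.00025, as stated in the text
CADD_BIN_HEIGHT = 5        # as stated in the text


def bin_of(gf, cadd):
    """Map a variant to its (GF bin, CADD bin) index; GF bin 0 holds rare variants."""
    return int(gf / GF_BIN_WIDTH), int(cadd / CADD_BIN_HEIGHT)


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    a: carrier cases, b: carrier controls,
    c: non-carrier cases, d: non-carrier controls.
    """
    n_carriers, n_cases, total = a + b, a + c, a + b + c + d
    upper = min(n_carriers, n_cases)
    # Hypergeometric upper tail: probability of seeing >= a carrier cases.
    tail = sum(
        comb(n_cases, x) * comb(total - n_cases, n_carriers - x)
        for x in range(a, upper + 1)
    )
    return tail / comb(total, n_carriers)


def bin_p_values(variants, carriers, cases, all_patients):
    """variants: {variant_id: (gf, cadd)}; carriers: {variant_id: set of patient ids}."""
    carriers_by_bin = defaultdict(set)
    for variant_id, (gf, cadd) in variants.items():
        carriers_by_bin[bin_of(gf, cadd)] |= carriers.get(variant_id, set())

    p_values = {}
    for bin_index, carrier_set in carriers_by_bin.items():
        if not carrier_set:
            continue  # empty bins are skipped
        a = len(carrier_set & cases)
        b = len(carrier_set - cases)
        c = len(cases - carrier_set)
        d = len(all_patients - carrier_set - cases)
        p_values[bin_index] = fisher_enrichment_p(a, b, c, d)
    return p_values
```

Taking the negative logarithm of each value in the returned dictionary gives the cell colours of a heatmap of the kind described above.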

Fig 1. Phenogenon profiling workflow.

A) The distribution of frequency vs CADD Phred score for variants of a single gene is binned according to empirically chosen cut-offs. B) Variants within each binned area are further analysed. Individuals carrying these variants are identified and then filtered on the basis of whether they have a selected HPO term. C) Fisher’s exact test is then used to determine the significance of the gene-phenotype relationship. D) A Phenogenon heatmap is produced using the Fisher’s exact test p-values for each binned area. E) The Fisher’s exact test scores for each of the binned areas in the first column are collapsed into a single HPO goodness of fit score (HGF) using a scaled Stouffer transformation.
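Step E can be illustrated with the textbook weighted Stouffer combination, Z = Σ w_i z_i / sqrt(Σ w_i²), followed by conversion of Z to a negative log p-value. This is a sketch under assumptions and not the authors' exact formula: their variant also applies a scale factor involving k_rare and uses bin weights from S2 Table, neither of which is reproduced here, so the weights below are placeholders and the base-10 logarithm is an assumption.

```python
from math import log10, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def weighted_stouffer_z(p_values, weights):
    """Combine one-sided p-values into a single Z score (standard weighted Stouffer)."""
    if not p_values or len(p_values) != len(weights):
        raise ValueError("need one weight per p-value")
    # Clip so that inv_cdf never receives exactly 0 or 1.
    clipped = [min(max(p, 1e-300), 1 - 1e-16) for p in p_values]
    # A small p-value maps to a large positive z score.
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in clipped]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / sqrt(sum(w * w for w in weights))


def hgf_from_z(z):
    """Negative log10 of the upper-tail p-value, 1 - Phi(z), computed as Phi(-z)."""
    return -log10(max(_STD_NORMAL.cdf(-z), 1e-300))
```

With a single bin the transformation is an identity on the p-value, so `hgf_from_z(weighted_stouffer_z([0.025], [3.0]))` is approximately `-log10(0.025)`. Giving the larger weight to a more significant bin raises the combined Z, which is the behaviour the CADD-based weighting relies on.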

For a given gene and HPO term, HGF scoring can be performed assuming either dominant or recessive mode of inheritance (MOI). When testing for recessive MOI, patients are assumed to be compound heterozygous if they carry a second variant, with a higher or equivalent CADD score and a lower or equivalent GF.
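The compound heterozygosity rule can be expressed as a small predicate over a patient's variant calls within one gene. This is a minimal sketch with hypothetical names (`Call`, `is_recessive_carrier`); treating a homozygous call as a recessive carrier on its own is an assumption added for completeness, and no phasing information is used, as in the rule stated above.

```python
from typing import NamedTuple


class Call(NamedTuple):
    variant_id: str
    gf: float         # gnomAD frequency under the recessive definition
    cadd: float       # CADD Phred score
    alt_copies: int   # 1 = heterozygous, 2 = homozygous


def is_recessive_carrier(target, calls_in_gene):
    """Decide whether a patient counts as a recessive carrier of `target`.

    calls_in_gene: all of the patient's calls in the same gene (may include target).
    """
    if target.alt_copies >= 2:
        return True  # assumption: homozygous calls qualify directly
    # Otherwise require a second variant that is at least as damaging
    # (CADD >= target) and at least as rare (GF <= target).
    return any(
        other.variant_id != target.variant_id
        and other.cadd >= target.cadd
        and other.gf <= target.gf
        for other in calls_in_gene
    )
```

Under this rule, a pair of heterozygous variants is counted once, in the bin of the less extreme variant of the pair, because only that variant has a partner satisfying both inequalities.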

The signal ratio is calculated for each “HPO-gene-MOI” relationship, based on the observation that if the wrong MOI is assumed, the Phenogenon heatmap profile tends to contain more bins with significant p-values among non-rare variants (GF > 0.00025) (S2 Fig).

The signal ratio (SR) is defined as: (3) where k_all represents the total number of non-empty non-rare bins with GF > 0.00025.

The “gene-HPO-MOI” score is then defined as: (4) The MOI with the larger M value is deemed to be the most likely.

Benchmarking Phenogenon

In order to benchmark our method and to choose appropriate cut-offs for the number of affected patients (NP) and for HGF, we selected a list of 12 known gene-HPO-MOI relationships (Table 2). Our list included SCN1A (for dominant MOI) and ABCA4 (for recessive MOI). SCN1A encodes Sodium Voltage-Gated Channel Alpha Subunit 1, mutations of which have been linked to epilepsy with divergent clinical severity [16]. The mutations are either dominantly inherited or arise de novo [17], with the majority of mutations found in the severe form of epilepsy (severe myoclonic epilepsy in infancy; MIM# 607208) arising de novo [16]. ABCA4 encodes ATP Binding Cassette Subfamily A Member 4, and biallelic mutation of the ABCA4 gene leads to a spectrum of retinal diseases including Stargardt macular dystrophy and cone-rod dystrophy [18].

Table 2. Known HPO-gene-MOI relationships used to benchmark Phenogenon.

We also compared the performance of our modified Stouffer’s Z-score method with that of Fisher’s method. Like Stouffer’s Z-score method, Fisher’s method combines p-values to produce an overall p-value. However, it lacks the ability to assign weights, and therefore treats bins with different CADD Phred scores equally. Specifically, Fisher’s method combines the p-values p_i of k bins using the following formula: $X^2 = -2 \sum_{i=1}^{k} \ln(p_i)$, where $X^2$ is a test statistic that follows a chi-squared distribution with 2k degrees of freedom.
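For reference, Fisher's method is short enough to write out in full. The sketch below uses only the standard library and relies on the closed form of the chi-squared survival function for an even number of degrees of freedom; the function name is hypothetical and all p-values are assumed to be strictly positive.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Return (X2 statistic, combined p-value) for independent p-values.

    X2 = -2 * sum(ln p_i) follows a chi-squared distribution with 2k degrees
    of freedom under the null, where k = len(p_values).
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    statistic = -2.0 * sum(log(p) for p in p_values)
    half = statistic / 2.0
    # For 2k degrees of freedom:
    #   P(X2 > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term, series = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        series += term
    return statistic, exp(-half) * series
```

Every bin contributes equally to the statistic, which is the limitation noted above: a bin of high-CADD variants cannot be given more influence than a bin of low-CADD variants.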

For each gene, we determined the MOI (using the M score) for each of the HPO terms with an affected sample size ≥ 60, unless stated otherwise; then, according to the determined MOI, we calculated an HGF score for each of the HPO terms. We calculated the mean and standard deviation of the HGF scores for the gene, and chose HPO terms with an HGF score at least one standard deviation above the mean as positive hits for the gene. We then compared the positive HPO terms with a hand-curated truth set to determine an error rate.
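The selection of positive hits per gene amounts to a one-line threshold. In the sketch below the function name is hypothetical and the use of the sample standard deviation (rather than the population one) is an assumption, as the text does not specify which was used.

```python
from statistics import mean, stdev


def positive_hits(hgf_by_term):
    """Return HPO terms whose HGF score is at least mean + 1 SD for this gene.

    hgf_by_term: {hpo_term: HGF score}, all computed for a single gene.
    """
    scores = list(hgf_by_term.values())
    if len(scores) < 2:
        return []  # a standard deviation needs at least two scores
    cutoff = mean(scores) + stdev(scores)
    return sorted(term for term, score in hgf_by_term.items() if score >= cutoff)
```

Because the cut-off is relative to the gene's own score distribution, every gene with varying scores yields at least one candidate term, so the hits still need to be compared against a truth set as described.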

We benchmarked Phenogenon on predictions for the HPO terms and the MOI for each gene. A gene-HPO relation is deemed true if the relation is supported by the Human Phenotype Ontology.

We surmised that Phenogenon would not perform well for HPO terms that are too specific or too general. Specific HPO terms have a small number of affected patients (NP), which limits the power of any association analysis. On the other hand, general HPO terms, such as ‘Phenotypic abnormality’ (HP:0000118) and ‘All’ (HP:0000001), include almost all of the samples tested, and limit the power of the analysis in a similar way. To find the optimal sample sizes for prediction, we tested a range of NP cut-offs, each time predicting only HPO terms with an NP equal to or higher than the cut-off.

We surmised that MOI prediction works best for gene-HPO relations supported by a high HGF score. To assess MOI predictions, we first chose an HGF cut-off, and then benchmarked MOI prediction on gene-HPO relations with an HGF score higher than the cut-off. For comparison, we also predicted the MOI using the HGF score alone, so that: where HGF_d and HGF_r are the HGF scores assuming dominant and recessive MOI, respectively.

To demonstrate the benefit of using the estimated homozygote frequency rather than the allele frequency for association analyses when assuming recessive MOI, we also included, for comparison, predictions in which the M and HGF scores for recessive relations were computed from the allele frequency.

Phenogenon on a large patient cohort

Following the benchmarking, we applied Phenogenon to all protein coding genes in the Phenopolis dataset (number of unrelated patients: 3290, number of protein coding genes with variants: 21321), under both dominant and recessive inheritance modes. A breakdown of patient phenotypes is shown in Table 1.

Results


Phenogenon made correct predictions on both HPO and MOI in a controlled environment

To benchmark Phenogenon, we selected 12 genes for which mutations have been reported to be causal in the cohort. The HPO term with highest HGF score for each tested gene can be found in Table 2.

As shown in Fig 2A, for both “ABCA4 –Macular dystrophy—recessive” and “SCN1A –Seizures—dominant”, bins showing strong association correctly clustered with rare variants (GF < 0.00025).

Fig 2. Using Phenogenon to predict gene-HPO-mode of inheritance (MOI) relationships for the 12 known genes.

A. Examples of using Phenogenon to profile known relationships: ABCA4—Macular dystrophy (HP:0007754) -recessive, and SCN1A—Seizures (HP:0001250)—dominant. The color scales represent the HGF score. The majority of high-scoring bins are for rare variants (GF < 0.00025). B. Error rate in predicting HPO when the number of patients selected per gene is higher than the ‘HPO NP cut-off’. The lines give the trend of error rates for each prediction model. C. Error rate for MOI when the HGF of the HPO terms selected per gene is higher than the HGF cut-off. The lines give the trend of error rates for each prediction model. Orange line: model using gnomAD allele frequency instead of estimated homozygote frequency for recessive MOI; Red line: model using HGF for both HPO association and MOI prediction; Blue line: model using Fisher’s method to combine p-values; Green line: our current model for Phenogenon.

Phenogenon (green line, Fig 2B) outperformed Fisher (blue line, Fig 2B), demonstrating the benefit of assigning higher weights to bins with higher CADD score.

Phenogenon correctly predicted HPO terms for which there are at least 55 patients affected (NP > 55) (green line, Fig 2B), although, as expected, the error rate increased when including HPO terms seen in fewer patients (NP < 20). Interestingly, the error rate also increased when HPO NP > 100 (Fig 2B), suggesting that there are divergent genetic causes for less specific HPO terms. In addition, Phenogenon made incorrect HPO predictions when the wrong MOI was assumed.

The M score (green line, Fig 2C) was more accurate in predicting the MOI than HGF alone (red line, Fig 2C). Furthermore, as shown in Fig 2B and 2C, defining GF as the gnomAD allele frequency when assuming recessive MOI (orange line) performed worse than using the estimated homozygote frequency (green line) in predicting both HPO and MOI.

Phenogenon found known gene-HPO-MOI relationships in a large patient cohort

We ran Phenogenon on the 3290 unrelated samples of the Phenopolis cohort. As shown in Table 3, of the top 10 relations discovered using Phenogenon, six were known (SCN1A and USH2A are shown in Table 2 instead), and the MOI was predicted correctly for all of them. We were also able to uncover other strong gene-phenotype relationships when including HPO terms with at least 60 individuals affected (Table 3). For instance, GRHL2 (OMIM: 608576), known to cause recessive ectodermal dysplasia/short stature syndrome, which involves nail dystrophy [19], was correctly linked to Nail dystrophy with recessive MOI by Phenogenon (HGF score: 10.54). STAT1 (OMIM: 600555), known to cause dominant or recessive immunodeficiency, was also correctly linked to Severe combined immunodeficiency with dominant MOI (HGF score: 10.38). Other examples include SRD5A3 –Abnormal full-field electroretinogram (HGF: 11.13) with recessive MOI (known to cause recessive congenital disorders of glycosylation, which may involve retinal disorders [20]) and PDE6A –Retinal dystrophy (HGF: 9.40) with recessive MOI (known to cause recessive retinitis pigmentosa [21]).

Among the top 10 findings, four relations were previously unreported. Whilst three of them are likely false positives, we think that the association “RRAGA—abnormality of the skin—dominant” may reflect a novel disease mechanism. RRAGA encodes Ras-related GTP-binding protein A, which activates mTORC [22]; mTORC in turn was found to regulate skin morphogenesis and epidermal barrier formation [23]. Mutations in RRAGA are therefore a possible pathogenic cause of the skin disorders observed in patients in the Phenopolis dataset.

AIP encodes a receptor for aryl hydrocarbons and a ligand-activated transcription factor, and was associated with Dementia by Phenogenon. This is likely a false positive, since all the variants contributing to the high HGF score had low sequencing depths (2 to 7) and were all called as homozygous by GATK. Given that the gnomAD allele frequencies of these variants are zero, the likelihood of observing multiple homozygous carriers in our unrelated samples is low. Considering their low alignment depths, they are likely genotyping errors.

NUP205 encodes a nucleoporin, and was associated with Abnormal electroretinogram by Phenogenon. However, the majority of the variants in the low p-value bins of “NUP205 –Abnormal electroretinogram—dominant” have a GF higher than 0. This contradicts the presumption that most retinal disorders in the Phenopolis dataset are rare Mendelian disorders; we therefore believe that “NUP205 –Abnormal electroretinogram—dominant” is a false positive.

Finally, experts in the consortium have ruled out TTN as a causative gene for retinal disorders, and the reason why Phenogenon associated TTN with Abnormality of the anterior segment of the globe remains unclear.

Table 3. Top-ranked gene-phenotype-MOI relations reported by Phenogenon.

Discussion


Aggregated databases of high throughput sequencing data of large numbers of HPO-annotated patients are indispensable for the genetic diagnosis of rare disease patients.

However, phenotypic and genetic biases are often inherent to these datasets. Phenotypic bias may arise when certain groups of patients, such as eye patients in our dataset, have more HPO terms than other groups, such as neurological patients, who typically have only one HPO term. Genetic bias may be caused by exome capture biases in coverage, which we attempted to control for by imposing strict thresholds on missingness. Despite these phenotypic and genetic biases, using our new tool, Phenogenon, we were able to recapitulate several known gene-phenotype-MOI relationships (Table 3).

Phenogenon can also be applied to a combination of phenotype terms. For example, considering patients affected by both ‘Rod-cone dystrophy’ (HP:0000510) and ‘Hearing impairment’ (HP:0000365), the top two genes predicted are USH2A (HGF: 9.53) and ADGRV1 (HGF: 8.31), both known to cause Usher syndrome, which affects both the visual and hearing systems. However, a caveat of such an approach is the reduced sample size and hence decreased predictive power.
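Defining cases for a combination of terms is a set intersection over the per-term patient sets, which also makes the loss of sample size explicit. The sketch below is illustrative only: the function name and the toy patient identifiers are hypothetical, and the per-term sets are assumed to already include patients annotated with child terms.

```python
def cases_with_all(terms, affected_by_term):
    """Patients affected by every term in `terms`.

    affected_by_term: {hpo_term: set of patient ids}.
    """
    patient_sets = [affected_by_term.get(t, set()) for t in terms]
    if not patient_sets:
        return set()
    return set.intersection(*patient_sets)


# Toy example using the two HPO identifiers mentioned above.
affected = {
    "HP:0000510": {"p1", "p2", "p3"},   # Rod-cone dystrophy
    "HP:0000365": {"p2", "p3", "p4"},   # Hearing impairment
}
combined_cases = cases_with_all(["HP:0000510", "HP:0000365"], affected)
```

The intersection can never be larger than the smallest input set, so a combined analysis is at best as well powered as its rarest term.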

We recognise that our reported novel relations require further scrutiny, particularly in the case of dominant MOI associations, as these results are sensitive to various parameters such as the version of CADD used. For example, we observed CADD score increases for a number of synonymous variants between versions 1.3 and 1.4 of CADD. Furthermore, the association signal can also be driven by uncharacteristic variants with a higher GF and CADD than expected. For instance, in the predicted relation “NUP205—Abnormal electroretinogram—dominant”, around 70% of the enriched rare variants have GF > 0 while having a CADD > 15 (S3 Fig). The “TTN–Abnormality of the anterior segment of the globe—dominant” relation also warrants further investigation, as TTN is a large gene prone to artefacts (S4 Fig). We therefore recommend that these relationships are examined more closely using our interactive webtool.

Until the release of the gnomAD database, there was no reliable source to estimate variant homozygote frequency, and therefore to date, all gene-phenotype association tools have used allele frequency, regardless of the MOI. We argue that using homozygote frequency when assuming recessive MOI improves the model performance.

In conclusion, we have developed a statistical tool, Phenogenon, to detect and visualise “gene—HPO—MOI” relationships. Our approach has suggested some strong candidate relationships and correctly recapitulated existing relationships. The adoption of the HPO nomenclature by large rare disease sequencing projects leads us to believe Phenogenon will be of increasing utility in understanding gene-phenotype-MOI relationships as genetics is phased into routine NHS practice.

Acknowledgments


We thank Lucy Withington, Tom Vulliamy, Stephanie Halford and Suzanne Broadgate for their indispensable help during the development of the work described in this paper.

UK Inherited Retinal Dystrophy Consortium (UKIRDC): Graeme Black (chair), Georgina Hall, Stuart Ingram, Rachel Taylor, Forbes Manson, Panagiotis Sergouniotis, Andrew Webster, Alison Hardcastle, Michel Michaelides, Vincent Plagnol, Nikolas Pontikos, Michael Cheetham, Gavin Arno, Alessia Florentino, Chris Inglehearn, Carmel Toomes, Manir Ali, Martin McKibbin, Claire Smith, Kamron Khan, Susan Downes, Jing Yu, Stephanie Halford, Suzanne Broadgate, Veronica van Heyningen.

Phenopolis consortium: UKIRDC, GOSgene, Pier Lambiase, Petros Syrris, Alison Hardcastle, Andrew Webster, Tom Vulliamy, S Rahman, Simon Mead, Sanjay Sisodiya, Tony Segal, Andrew Smith, David Kelsell, Hywell Williams, Sergei Nejetsev.

References


  1. Sherry ST. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001; pmid:11125122
  2. Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, et al. ClinVar: Public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 2016; pmid:26582918
  3. P Daiger S F. Rossiter B, Greenberg LJ, Christoffels A, Hide W, Daiger S, et al. Data services and software for identifying genes and mutations causing retinal degeneration. Iovs. 1997.
  4. Sergouniotis PI, Chakarova C, Murphy C, Becker M, Lenassi E, Arno G, et al. Biallelic variants in TTLL5, encoding a tubulin glutamylase, cause retinal dystrophy. Am J Hum Genet. 2014; pmid:24791901
  5. Arno G, Carss KJ, Hull S, Zihni C, Robson AG, Fiorentino A, et al. Biallelic Mutation of ARHGEF18, Involved in the Determination of Epithelial Apicobasal Polarity, Causes Adult-Onset Retinal Degeneration. Am J Hum Genet. 2017;100: 334–342. pmid:28132693
  6. Fiorentino A, Fujinami K, Arno G, Robson AG, Pontikos N, Arasanz Armengol M, et al. Missense variants in the X-linked gene PRPS1 cause retinal degeneration in females. Hum Mutat. 2018;39: 80–91. pmid:28967191
  7. Arno G, Agrawal SA, Eblimit A, Bellingham J, Xu M, Wang F, et al. Mutations in REEP6 Cause Autosomal-Recessive Retinitis Pigmentosa. Am J Hum Genet. 2016;99: 1305–1315. pmid:27889058
  8. Khan KN, Robson A, Mahroo OAR, Arno G, Inglehearn CF, Armengol M, et al. A clinical and molecular characterisation of CRB1-associated maculopathy. Eur J Hum Genet. 2018;26: 687–694. pmid:29391521
  9. Taylor RL, Poulter JA, Downes SM, McKibbin M, Khan KN, Inglehearn CF, et al. Loss-of-Function Mutations in the CFH Gene Affecting Alternatively Encoded Factor H-like 1 Protein Cause Dominant Early-Onset Macular Drusen. Ophthalmology. 2019; pmid:30905644
  10. Pontikos N, Yu J, Moghul I, Withington L, Blanco-Kelly F, Vulliamy T, et al. Phenopolis: An open platform for harmonization and analysis of genetic and phenotypic data. Bioinformatics. 2017;33: 2421–2423. pmid:28334266
  11. Köhler S, Vasilevsky NA, Engelstad M, Foster E, McMurry J, Aymé S, et al. The human phenotype ontology in 2017. Nucleic Acids Res. 2017; pmid:27899602
  12. Savova GK, Masanz JJ, Ogren P V., Zheng J, Sohn S, Kipper-Schuler KC, et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): Architecture, component evaluation and applications. J Am Med Informatics Assoc. 2010; pmid:20819853
  13. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010; pmid:20926424
  14. Lek M, Karczewski KJ, Minikel E V., Samocha KE, Banks E, Fennell T, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016; pmid:27535533
  15. Kircher M, Witten DM, Jain P, O’roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014; pmid:24487276
  16. Gambardella A, Marini C. Clinical spectrum of SCN1A mutations. Epilepsia. 2009;50 Suppl 5: 20–3. pmid:19469841
  17. Miller IO, Sotero de Menezes MA. SCN1A-Related Seizure Disorders [Internet]. GeneReviews®. 1993. Available:
  18. Burke TR, Tsang SH. Allelic and phenotypic heterogeneity in ABCA4 mutations. Ophthalmic Genet. 2011; pmid:21510770
  19. Petrof G, Nanda A, Howden J, Takeichi T, McMillan JR, Aristodemou S, et al. Mutations in GRHL2 result in an autosomal-recessive ectodermal dysplasia syndrome. Am J Hum Genet. 2014; pmid:25152456
  20. Taylor RL, Arno G, Poulter JA, Khan KN, Morarji J, Hull S, et al. Association of steroid 5α-reductase type 3 congenital disorder of glycosylation with early-onset retinal dystrophy. JAMA Ophthalmol. 2017; pmid:28253385
  21. Sothilingam V, Garrido MG, Jiao K, Buena-Atienza E, Sahaboglu A, Trifunović D, et al. Retinitis pigmentosa: Impact of different Pde6a point mutations on the disease phenotype. Hum Mol Genet. 2015; pmid:26188004
  22. Kim E, Goraksha-Hicks P, Li L, Neufeld TP, Guan KL. Regulation of TORC1 by Rag GTPases in nutrient response. Nat Cell Biol. 2008; pmid:18604198
  23. Ding X, Bloch W, Iden S, Rüegg MA, Hall MN, Leptin M, et al. mTORC1 and mTORC2 regulate skin morphogenesis and epidermal barrier formation. Nat Commun. 2016; pmid:27807348
  24. Vierimaa O, Georgitsi M, Lehtonen R, Vahteristo P, Kokko A, Raitila A, et al. Pituitary adenoma predisposition caused by germline mutations in the AIP gene. Science. 2006; pmid:16728643
  25. Braun DA, Sadowski CE, Kohl S, Lovric S, Astrinidis SA, Pabst WL, et al. Mutations in nuclear pore genes NUP93, NUP205 and XPO5 cause steroid-resistant nephrotic syndrome. Nat Genet. 2016; pmid:26878725
  26. Chen W, Shimane T, Kawano S, Alshaikh A, Kim SY, Chung SH, et al. Human Papillomavirus 16 E6 Induces FoxM1B in Oral Keratinocytes through GRHL2. J Dent Res. 2018; pmid:29443638
  27. Hartono SP, Vargas-Hernández A, Ponsford MJ, Chinn IK, Jolles S, Wilson K, et al. Novel STAT1 Gain-of-Function Mutation Presenting as Combined Immunodeficiency. Journal of Clinical Immunology. 2018. pmid:30317461
  28. Ovadia A, Sharfe N, Hawkins C, Laughlin S, Roifman CM. Two different STAT1 gain-of-function mutations lead to diverse IFN-γ-mediated gene expression. npj Genomic Med. 2018; pmid:30131873
  29. Freiburg A, Gautel M. A molecular map of the interactions between titin and myosin-binding protein C Implications for sarcomeric assembly in familial hypertrophic cardiomyopathy. Eur J Biochem. 1996; pmid:8631348
  30. Akle S, Chun S, Jordan DM, Cassa CA. Mitigating False-Positive Associations in Rare Disease Gene Discovery. Hum Mutat. 2015; pmid:26378430