Fig 1.
a. Diagrammatic representation of the conserved core sequence that comprises linked gene families found in the human chr22 LCR22A and D segmental duplications; these are: the FAM230 lincRNA gene family (highlighted in yellow), the clincRNA gene family (highlighted in green), a spacer sequence (highlighted in gray) and the GGT gene family (highlighted in red). b. Schematic of the linked gene segment FAM230B—LOC105372935—GGT2. Diagrams are approximate and not drawn to scale.
Table 1.
GGT-linked genes present in human LCR22s*.
Fig 2.
A schematic of segmental duplications found in the 22q11.2 region of human chr22.
A-H represent the eight LCR22s. The GGT-linked gene segments (in red) are represented by the GGT-related symbols in the drawing.
Fig 3.
A segment of the alignment of four FAM230-clincRNA-GGT linked genes (Table 1) with two chimpanzee sequences.
The complete sequence alignment is in S1 Fig. The yellow highlight denotes sequences of the FAM230 genes and the green highlight denotes the clincRNA genes; the FAM230B—LOC105372935—GGT2 coordinates are used as guideposts. The figure displays the junction of the 3’ end of the FAM230B gene with LOC105372935. The two chimpanzee sequences are from: Pan troglodytes isolate Yerkes chimp pedigree #C0471 (Clint) chromosome 22, Clint_PTRv2, NCBI Reference Sequence: NC_036901.1 and chimp.54546-135457.revcompl. from Pan.troglodytes.clone.rp43-41g5.GenBank:AC099533.36. The human sequences are from Homo sapiens chromosome 22, GRCh38.p12 Primary Assembly NCBI Reference Sequence: NC_000022.
Fig 4.
Left panel: GGT and the associated genes in LCR22s of chr22. The end chromosomal coordinates are shown in parentheses. The gene arrangement diagrams are directly from the NCBI website: https://www.ncbi.nlm.nih.gov/gene (21). Right panel: Schematic representation of GGT-linked genes (not drawn to scale). GGTLC5P and its associated genes (bottom figure) are presented in the reverse orientation. Note: the FAM230A gene has a 50 kbp sequence gap; sequences from both Ensembl and NCBI were therefore used for alignments to obtain more complete identity values. In addition, only Ensembl has annotated the clincRNA gene, AC023490.3.
Table 2.
Percent identity of FAM230-linked genes relative to FAM230B-LOC105372935-GGT2.
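As an illustration of how a percent identity value of this kind can be computed from a pairwise alignment, the following is a minimal sketch. It is not the method used to produce the table, which relied on alignment programs; the function name `percent_identity` and the convention chosen here (columns gapped in both sequences are skipped, a gap opposite a base counts as a mismatch) are assumptions, and other tools may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two already-aligned, equal-length sequences.

    Assumed convention (hypothetical, for illustration only):
    - columns where both sequences have a gap are ignored;
    - a gap opposite a nucleotide counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # shared gap column carries no information
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: six columns, one of them a single-nucleotide indel.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

Under this convention an insertion or deletion lowers the identity in the same way as a point mutation, so the value reflects both kinds of difference between two aligned segments.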
Fig 5.
A section of the alignment of FAM230B-LOC105372935-GGT2 and FAM230E-LOC105377182—GGT3P sequences with yellow highlighted sequences showing differences (point mutations, deletions/insertions) between the two FAM230 genes.
Fig 6.
Schematic representation of GGT-linked genes in LCR22E and H and comparisons with FAM230B-LOC105372935-GGT2.
a. |xxx| represents the absence of parts of the LOC105372935 clincRNA and the entire spacer sequence. The percent identities relative to nt positions of FAM230B-LOC105372935-GGT2 are: POM121L1P, nt positions 28645–30907, 96%; GGTLC2, nt positions 40838–43793, 96%. b. Percent identity of nt positions 28900–40312 of FAM230B-LOC105372935-GGT2 with POM121L10P—BCRP3—spacer, 96%; with GGT1, nt positions 40295–57246, 97%. The lengths of genes in the figure are not to scale.
Table 3.
Percent identity of GGT5 and GGTLC4P with GGT2 and the clincRNA gene LOC105372935.
Fig 7.
Phylogram shows the phylogenetic branch relationships between GGT genes and the LOC105372935 lincRNA gene.
The gorilla GGT1 and GGT5 sequences were from Gorilla gorilla (western gorilla) chromosome 22, gorGor4, NCBI Reference Sequence: NC_018446.2, locus: NC_018446 [20]. The Rhesus monkey GGT5 sequence (annotated as LOC720345) was from the NCBI sequence of Macaca mulatta isolate AG07107 chromosome 10, Mmul_10, whole genome shotgun sequence, ACCESSION: NC_041763, REGION: complement(28391649..28419652). The phylogram was obtained using the EBI Clustal Omega sequence alignment and phylogeny programs.
Fig 8.
A phylogram determined from an alignment of primate USP18 sequences with human clincRNA LOC105372935 and two FAM230 sequences.
The phylogram was obtained using the EBI Clustal Omega sequence alignment and phylogeny programs.