Transfer RNAs (tRNAs) are intermediate-sized non-coding RNAs found in all organisms that help translate messenger RNA into protein. Recently, the number of sequenced plant genomes has increased dramatically. The availability of this extensive data greatly accelerates the study of tRNAs on a large scale. Here, 8,768,261 scaffolds/chromosomes containing 229,093 giga-base pairs representing whole-genome sequences of 256 plant species were analyzed to identify tRNA genes. As a result, 331,242 nuclear, 3,216 chloroplast, and 1,467 mitochondrial tRNA genes were identified. The nuclear tRNA genes include 275,134 tRNAs decoding the 20 standard amino acids, 1,325 suppressor tRNAs, 6,273 tRNAs with unknown isotypes, 48,475 predicted pseudogenes, and 37,873 tRNAs with introns. We also created PltRNAdb (https://bioinformatics.um6p.ma/PltRNAdb/index.php), a data source for tRNA genes from 256 plant species. The PltRNAdb website allows researchers to search, browse, visualize, BLAST, and download predicted tRNA genes. PltRNAdb will help improve our understanding of plant tRNAs and open the door to discovering the unknown regulatory roles of tRNAs in plant genomes.
Citation: Mokhtar MM, EL Allali A (2022) PltRNAdb: Plant transfer RNA database. PLoS ONE 17(5): e0268904. https://doi.org/10.1371/journal.pone.0268904
Editor: Xiang Jia Min, Youngstown State University, UNITED STATES
Received: February 18, 2022; Accepted: May 10, 2022; Published: May 23, 2022
Copyright: © 2022 Mokhtar, Allali. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Transfer RNAs (tRNAs) are intermediate-sized non-coding RNAs found in all organisms that help translate mRNA into protein [1]. tRNAs are found in all types of cells and organelles and are involved in several cellular processes, including viral replication, amino acid biosynthesis, and cell wall remodeling [2, 3]. In plants, tRNA undergoes post-transcriptional processing to obtain the mature form required for its function [4]. Recently, Hummel et al. [5] reported a variety of cell biological processes that are affected by the organization, expression, and modification of tRNA genes. These modifications are a source of novel biological functions of tRNAs in plants [6].
In the last decade, the number of sequenced plant genomes has increased with advances in sequencing technology [7]. It is therefore critically important to keep predicting tRNA genes as new genomes are sequenced. Thanks to advances in bioinformatics, several tools have been developed to predict tRNA genes. These tools include tRNAscan-SE [8], ARAGORN [9], DOGMA [10], ARWEN [11], MiTFi [12], TFAM [13], tRNAfinder [14], and SPLITS [15]. tRNAscan-SE is the most widely used tool for detecting and annotating tRNA genes in sequenced genomes. Both computational prediction tools and tRNA databases are used to identify tRNA genes of specific plant species. There are several tRNA databases, such as tRNAdb (http://trna.bioinf.uni-leipzig.de/DataOutput/) [16], GtRNAdb (http://gtrnadb.ucsc.edu/) [17], tRNADB-CE (http://trna.ie.niigata-u.ac.jp/cgi-bin/trnadb/index.cgi) [18], and PlantRNA (http://plantrna.ibmp.cnrs.fr/) [19]. Unfortunately, these databases are out of date, as they do not include recently sequenced plant genomes (Table 1).
Due to the increasing number of plant genomes, we have developed PltRNAdb, a freely available database of tRNA genes from 256 plant species. For each nuclear genome and the available organellar genomes, PltRNAdb provides the following details of the identified tRNAs: 1) tRNA sequences, 2) tRNA secondary structure visualization, 3) tRNA upstream and downstream sequences, 4) tRNAs with introns, 5) tRNAs decoding the 20 standard amino acids, 6) possible suppressor tRNAs, 7) tRNAs with unknown isotypes, and 8) predicted tRNA pseudogenes. We hope that by pooling such extensive data into one database, we can improve our understanding of plant tRNAs and open the door to discovering the unknown regulatory roles of tRNAs in plant genomes.
Materials and methods
We retrieved FASTA files of sequenced and annotated nuclear genomes for 256 plant species from the NCBI database (https://www.ncbi.nlm.nih.gov/). In addition, we retrieved the available organellar genomes of those species, including 100 chloroplast and 52 mitochondrial genomes. The 256 plant genomes include 229 Streptophyta, 24 Chlorophyta, and 3 Rhodophyta (Table 2). The details of the studied species are listed in S1 Table, including plant name, NCBI taxid, assembly type and level, sequence representation and coverage, and sequence category and accession numbers.
Prediction of tRNA genes
tRNAscan-SE v.2.0.9 [21] was used to predict tRNAs in the studied plant genomes. For nuclear genomes, the parameters were set as follows: Search Mode: Eukaryotic; Searching with: Infernal first pass; Isotype-specific model scan: yes; Covariance model: TRNAinf-euk.cm; Infernal first pass cutoff score: 10; and Temporary directory: tmp. For chloroplast and mitochondrial genomes, the parameters were set as follows: Search Mode: Organellar; Searching with: Infernal single-pass scan, Maximum sensitivity mode; Covariance model: TRNAinf-1415.cm; and Cutoff score: 15.
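The settings above are reported as they appear in the tRNAscan-SE run summary, not as command-line options. The following Python sketch shows one plausible way to assemble matching command lines. The flag mapping (`-E` for eukaryotic mode, `-O` for organellar mode, `-X` for the cutoff score, and `-o`, `-f`, `-m`, `-a` for the output files) is our reading of the tRNAscan-SE 2.0 options and should be checked against `tRNAscan-SE -h`; the function name, file names, and output prefixes are hypothetical.

```python
import shlex


def build_trnascan_command(genome_type, fasta_path, out_prefix):
    """Return an argv list for one tRNAscan-SE run (assumed flag mapping)."""
    if genome_type == "nuclear":
        # -E selects the eukaryotic search mode; the Infernal first pass and
        # the isotype-specific model scan are assumed to be its defaults.
        mode_flags = ["-E"]
    elif genome_type in ("chloroplast", "mitochondrial"):
        # -O selects the organellar search mode; -X sets the cutoff score.
        # Whether an extra flag is needed to reproduce the "Maximum
        # sensitivity mode" line of the run summary is not stated in the text.
        mode_flags = ["-O", "-X", "15"]
    else:
        raise ValueError(f"unknown genome type: {genome_type}")

    output_flags = [
        "-o", f"{out_prefix}.out",    # tabular results
        "-f", f"{out_prefix}.ss",     # secondary structures
        "-m", f"{out_prefix}.stats",  # summary statistics
        "-a", f"{out_prefix}.fa",     # predicted tRNA sequences (FASTA)
    ]
    return ["tRNAscan-SE", *mode_flags, *output_flags, fasta_path]


if __name__ == "__main__":
    # Hypothetical input files; the commands are printed, not executed.
    for kind, fasta in [("nuclear", "genome.fa"), ("chloroplast", "cp.fa")]:
        print(shlex.join(build_trnascan_command(kind, fasta, f"results_{kind}")))
```

Building the argument list separately from running it keeps the sketch testable on a machine without tRNAscan-SE installed; the list can be passed to `subprocess.run` once the flags have been verified.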
Results and discussion
Recently, the number of sequenced plant genomes has increased due to advances in genome sequencing. This large number of sequenced genomes requires bioinformatics tools to extract various features and make them available in various databases. For plants, almost half of the sequenced genomes have not yet been subjected to tRNA prediction using the available tools [8–15]. Consequently, the recently sequenced plant genomes are not included in the current tRNA databases [16–19].
Here, 8,768,261 scaffolds/chromosomes with a total length of 229,093 giga base pairs representing nuclear, chloroplast and mitochondrial genome sequences of the studied plant species were analyzed to identify the tRNA genes. As a result, 331,242, 3,216, and 1,467 tRNA genes were identified from nuclear, chloroplast and mitochondrial genomes, respectively (Table 3). Fig 2 shows bar charts for the total number of nuclear tRNAs decoding 20 standard amino acids, suppressor tRNAs, tRNAs with unknown isotypes, and predicted pseudo-genes for each genome.
The inner circle represents 256 plant species, and the outer circle includes tRNA decoding the standard 20 amino acids (purple color), predicted pseudogenes (red color), possible suppressor tRNAs (blue color), and tRNA with undetermined isotypes (green color).
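The per-genome categories summarized above (standard isotypes, suppressors, unknown isotypes, pseudogenes, and tRNAs with introns) can in principle be tallied from the tabular tRNAscan-SE output. The sketch below is a minimal illustration and not the pipeline used to build PltRNAdb. It assumes a tab-separated layout of sequence name, tRNA number, begin, end, type, anticodon, intron begin, intron end, score, and note; it assumes that pseudogenes carry `pseudo` in the note column and that `Sup`, `Undet`, and `SeC` label suppressor, undetermined, and selenocysteine tRNAs. Counting pseudogenes before the other categories is also an assumption. The sample rows are invented.

```python
from collections import Counter


def classify_trnascan_rows(lines):
    """Tally tRNA categories from tab-separated tRNAscan-SE result lines."""
    counts = Counter()
    for line in lines:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        # Skip headers, separators, and blank lines: real rows have a
        # numeric start coordinate in the third column.
        if len(fields) < 9 or not fields[2].isdigit():
            continue
        trna_type = fields[4]
        intron_begin = fields[6]
        note = fields[9] if len(fields) > 9 else ""

        counts["total"] += 1
        if intron_begin != "0":
            counts["with_intron"] += 1  # counted independently of isotype
        if "pseudo" in note:
            counts["pseudogene"] += 1
        elif trna_type == "Sup":
            counts["suppressor"] += 1
        elif trna_type == "Undet":
            counts["unknown_isotype"] += 1
        elif trna_type == "SeC":
            counts["selenocysteine"] += 1
        else:
            counts["standard"] += 1
    return counts


# Invented rows in the assumed column layout.
SAMPLE = [
    "Name\ttRNA #\tBegin\tEnd\tType\tCodon\tBegin\tEnd\tScore\tNote",
    "chr1\t1\t100\t172\tAla\tAGC\t0\t0\t70.1\t",
    "chr1\t2\t500\t590\tTyr\tGTA\t537\t548\t65.0\t",
    "chr2\t1\t900\t830\tSup\tCTA\t0\t0\t40.2\t",
    "chr2\t2\t1200\t1272\tUndet\tNNN\t0\t0\t25.3\t",
    "chr3\t1\t300\t371\tGly\tGCC\t0\t0\t21.0\tpseudo",
]

if __name__ == "__main__":
    print(dict(classify_trnascan_rows(SAMPLE)))
```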
To date, several efforts have been made to predict plant tRNA genes and make them available through web databases. Several databases have been created using tRNAscan-SE, including GtRNAdb [17], tRNAdb [16], tRNADB-CE [18], and PlantRNA [19]. GtRNAdb contains 30,061 predicted tRNA genes derived from 15 plant species, whereas tRNAdb contains 702 tRNA genes derived from 58 plant species. In addition, tRNADB-CE contains 1,352 tRNA genes derived from 2 plant species, while the PlantRNA database contains 66,686 genes derived from 47 plant species. Table 4 compares our database with two previously developed databases (GtRNAdb and PlantRNA). This comparison includes only the sequenced and annotated plant species (38 species) shared by the compared databases, and covers the number of predicted tRNA genes and the number of tRNAs with introns. The species name, total number of predicted tRNA genes, and number of tRNAs with introns for the 218 plant species available only through the current database are listed in S2 Table.
The PltRNAdb database
The PltRNAdb search page offers researchers the ability to dive deep into the database and retrieve tRNA data in two steps. The first step is to select the plant species and the second step is to select the nuclear, chloroplast, or mitochondrial genome and the tRNA type. The tRNA types include Ala, Gly, Pro, Thr, Val, Arg, Leu, Phe, Asn, Asp, Glu, His, Ile, Met/iMet, Tyr, Supres, Cys, Ser, Trp, SelCys, Gln, Lys, and Undet. The results are displayed on the new page with the available details of the tRNA genes. The results page is divided into two subsections. The first is used to display statistical plots of the identified tRNAs in the species searched. The second section contains details such as tRNA sequence ID, chromosome/scaffold accession number, sequence start and end within the chromosome/scaffold, tRNA sequence, tRNA secondary structure visualization, tRNA type, anticodon, intron start, intron end, score, and notes. The tRNA secondary structure button leads to a separate page with details of the selected tRNA, including tRNA secondary structure image, tRNA type, anticodon, tRNA length, upstream and downstream sequence, and tRNA sequence. The results can be downloaded using the Download button at the top of the Results page.
The general statistics page of PltRNAdb offers researchers the ability to take a close look at all available statistics for their selected species. Researchers can select plant species from the drop-down menu using the scientific name of the plant. Summary statistics for the selected species include statistical charts of identified tRNAs and the summary table for the nuclear genome. In addition, the statistics of chloroplast and mitochondrial tRNAs are shown when available. On the Bulk Download page, researchers can download all data for selected plant species. They can download the data in different formats, including the FASTA file of tRNAs, the identified tRNA details in a tabular format, and the statistics file for each genome separately (nuclear, chloroplast, or mitochondrial genome) (S1 Fig).
BLASTN is embedded in PltRNAdb for comparing DNA sequences against tRNA genes. BLASTN allows researchers to quickly align their sequence to the tRNA sequences of 256 plant species. Researchers can BLAST their FASTA sequence against one of the 256 plant species. The results table includes the subject ID, query ID, identity, length, mismatch, gaps, query start and end, subject start and end, E-value, and BLAST score (S2 Fig).
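A results table with these fields can be post-processed offline, for example to keep only strong hits. The sketch below assumes the table has been saved as tab-separated text with twelve columns in the order commonly used by BLAST tabular output (query ID first); the export format and column order actually used by PltRNAdb are not stated here, so `BLAST_COLUMNS` may need reordering. The thresholds, function names, and sample hits are hypothetical.

```python
import csv
import io

# Assumed column order; adjust to match the actual exported table.
BLAST_COLUMNS = [
    "query_id", "subject_id", "identity", "length", "mismatch", "gaps",
    "q_start", "q_end", "s_start", "s_end", "evalue", "score",
]
INT_FIELDS = ("length", "mismatch", "gaps", "q_start", "q_end", "s_start", "s_end")
FLOAT_FIELDS = ("identity", "evalue", "score")


def parse_blast_table(text):
    """Parse tab-separated BLAST hits into a list of dictionaries."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


def best_hit(hits, min_identity=90.0, max_evalue=1e-5):
    """Return the highest-scoring hit that passes both filters, or None."""
    kept = [h for h in hits
            if h["identity"] >= min_identity and h["evalue"] <= max_evalue]
    return max(kept, key=lambda h: h["score"], default=None)


# Invented hits for illustration only.
SAMPLE_HITS = (
    "query1\tchr1.trna5\t100.000\t72\t0\t0\t1\t72\t1\t72\t2.1e-33\t134\n"
    "query1\tchr3.trna12\t93.056\t72\t5\t0\t1\t72\t1\t72\t4.0e-25\t106\n"
    "query1\tchr5.trna40\t85.000\t40\t6\t0\t10\t49\t12\t51\t0.002\t32.8\n"
)

if __name__ == "__main__":
    print(best_hit(parse_blast_table(SAMPLE_HITS)))
```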
Case study: Arabidopsis thaliana tRNA genes
In the present study, we selected Arabidopsis thaliana to show users how to navigate PltRNAdb. Due to the high quality of the genome sequence and annotation of Arabidopsis thaliana, this case study also serves as a comparison between the current findings and the genome annotation provided by NCBI. In PltRNAdb, 642, 33, and 36 tRNA genes were detected in the nuclear, chloroplast, and mitochondrial genomes of Arabidopsis thaliana, respectively. In the NCBI genome database, a total of 623 tRNA genes were found in the annotation of Arabidopsis thaliana. Based on the location of the tRNA genes in the genome sequence, the 623 NCBI tRNA genes were compared with the 642 tRNA genes identified in the current study. Of the 623 NCBI tRNA genes, 622 match the current findings and only one NCBI tRNA gene has no match. The 20 tRNA genes from the current findings that were not present among the NCBI tRNA genes were 2 Lys, 2 Leu, 2 Glu, 1 Tyr, 1 Cys, 1 Arg, 1 Met, 1 Asp, 2 undetermined, and 7 pseudogenes. The common tRNA genes, the tRNA genes unique to NCBI, and the tRNA genes unique to the current analysis are listed in S3 Table.
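A location-based comparison of this kind can be sketched as an interval-overlap test between two gene lists. The exact matching criterion used in the study is not given, so the sketch below assumes that two genes match when they lie on the same chromosome or scaffold and their coordinate ranges overlap by at least one base, regardless of strand. The function names and the coordinates in the example are hypothetical.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position.

    Coordinates may be given in either order (reverse-strand genes
    often have start > end).
    """
    a_lo, a_hi = sorted((a_start, a_end))
    b_lo, b_hi = sorted((b_start, b_end))
    return a_lo <= b_hi and b_lo <= a_hi


def compare_by_location(reference, predicted):
    """Compare two lists of (seq_id, start, end) gene tuples.

    Returns (matched_reference, reference_only, predicted_only).
    """
    by_seq = defaultdict(list)
    for gene in predicted:
        by_seq[gene[0]].append(gene)

    matched_ref, ref_only, matched_pred = [], [], set()
    for ref in reference:
        hits = [p for p in by_seq.get(ref[0], [])
                if overlaps(ref[1], ref[2], p[1], p[2])]
        if hits:
            matched_ref.append(ref)
            matched_pred.update(hits)
        else:
            ref_only.append(ref)
    pred_only = [p for p in predicted if p not in matched_pred]
    return matched_ref, ref_only, pred_only


# Invented coordinates for illustration only.
REFERENCE = [("chr1", 100, 172), ("chr1", 5000, 5072), ("chr2", 900, 830)]
PREDICTED = [("chr1", 100, 172), ("chr2", 830, 900),
             ("chr2", 4000, 4073), ("chr3", 10, 82)]

if __name__ == "__main__":
    common, ref_only, pred_only = compare_by_location(REFERENCE, PREDICTED)
    print(len(common), len(ref_only), len(pred_only))
```

Applied to the case study, the three returned lists would correspond to the 622 common genes, the single gene unique to NCBI, and the 642 − 622 = 20 genes unique to the current analysis, provided each NCBI gene matches exactly one prediction.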
PltRNAdb includes searching, browsing, visualization, and downloading functionalities. The search page can be accessed via Database Search in the top bar of each page. First, select Arabidopsis thaliana from the plant species drop-down menu and click the Search button. The statistical charts of Arabidopsis thaliana tRNA genes are displayed on the same page. Second, select the nuclear, chloroplast, or mitochondrial genome and the tRNA type from the genome and tRNA Type drop-down menus (Fig 3).
A) Select the plant species dropdown menu, B) bar charts represent statistics of all data available for Arabidopsis thaliana in PltRNAdb, C) Genome and tRNA dropdown menus for Arabidopsis thaliana search in PltRNAdb.
The search results are displayed on a separate page with the statistical charts of the tRNA genes matching the search parameters and a table with details about those tRNA genes. Users can download the search results using the Download button at the top of the results table or download only the FASTA sequence for a tRNA gene by clicking the Download button in the tRNA Sequence column (Fig 4). Users can also access the details of the selected tRNA by clicking the View button in the Secondary Structure column. This page displays the details and image of the secondary structure of the selected tRNA as well as a download button (Fig 5).
A) Visualize statistics subsection, B) button to download results, C) summary of search results with a hyperlink button for three separate pages to retrieve tRNA FASTA sequence from NCBI, download tRNA FASTA sequence from PltRNAdb, and retrieve details of the selected tRNA.
A view page of the selected tRNA contains A) a download button, B) summary statistics, C) visualization of tRNA secondary structure, D) tRNA start and end, tRNA length, tRNA sequence, and tRNA upstream and downstream sequences.
The general statistics page can be accessed by clicking the General Statistics button in the top bar of any page. This page is divided into three subsections. The first is for selecting Arabidopsis thaliana from the plant species dropdown menu. The second section contains the statistical charts of the total tRNA genes of Arabidopsis thaliana (nuclear, chloroplast, mitochondria) and a bar chart for the tRNA types. The third section contains statistical tables with nuclear, chloroplast, and mitochondrial values. The statistical tables include the total number of predicted tRNA genes, tRNAs decoding 20 standard amino acids, selenocysteine tRNAs (TCA), possible suppressor tRNAs, tRNAs with unknown isotypes, predicted pseudogenes, tRNAs with introns, and the number of isotypes/anticodons (Fig 6).
Conclusion and future work
PltRNAdb is a database of tRNA genes, predicted by tRNAscan-SE, for 256 plant species. Various tools and programming languages were used to visualize tRNA secondary structure and build the database. PltRNAdb will be regularly updated with newly annotated genomes, and its tools will be improved to serve its purpose. Although PltRNAdb focuses on the prediction of tRNA genes in fully sequenced and annotated genomes, we plan to add a subsection for incomplete/unannotated plant genomes to bring all available species together in one database. PltRNAdb will be an excellent resource for researchers interested in tRNA research. We hope that PltRNAdb will improve our understanding of plant tRNAs and open the door to discovering the unknown regulatory roles of tRNAs in plant genomes.
S1 Table. List of plant names, NCBI taxid, assembly type and level, sequence representation, coverage, sequence category and accession numbers for all species examined.
S2 Table. Total number of tRNAs and number of tRNAs with introns for all species unique to PltRNAdb.
S3 Table. Comparison between tRNAs in PltRNAdb and tRNAs provided by NCBI for Arabidopsis thaliana.
S1 Fig. An example of the bulk download page.
The first subsection is used to select the plant species, and the second subsection is used to select the type of data to download.
The authors acknowledge the African Supercomputing Center at Mohamed VI Polytechnic University for the supercomputing resources (https://ascc.um6p.ma/) made available for conducting the research reported in this paper.
- 1. Hurto RL. Unexpected Functions of tRNA and tRNA Processing Enzymes. In: Collins LJ, editor. RNA Infrastructure and Networks. New York, NY: Springer New York; 2011. pp. 137–155. https://doi.org/10.1007/978-1-4614-0332-6_9 pmid:21915787
- 2. Banerjee R, Chen S, Dare K, Gilreath M, Praetorius-Ibba M, Raina M, et al. tRNAs: Cellular barcodes for amino acids. FEBS Lett. 2010;584: 387–395. pmid:19903480
- 3. Phizicky EM, Hopper AK. tRNA biology charges to the front. Genes Dev. 2010;24: 1832–1860. pmid:20810645
- 4. Yukawa Y, Akama K, Noguchi K, Komiya M, Sugiura M. The context of transcription start site regions is crucial for transcription of a plant tRNALys(UUU) gene group both in vitro and in vivo. Gene. 2013;512: 286–293. https://doi.org/10.1016/j.gene.2012.10.022 pmid:23103832
- 5. Hummel G, Warren J, Drouard L. The multi-faceted regulation of nuclear tRNA gene transcription. IUBMB Life. 2019;71: 1099–1108. https://doi.org/10.1002/iub.2097 pmid:31241827
- 6. Warren JM, Salinas-Giegé T, Hummel G, Coots NL, Svendsen JM, Brown KC, et al. Combining tRNA sequencing methods to characterize plant tRNA expression and post-transcriptional modification. RNA Biol. Taylor & Francis; 2021;18: 64–78. pmid:32715941
- 7. Mokhtar MM, El Allali A, Hegazy M-EF, Atia MAM. PlantPathMarks (PPMdb): an interactive hub for pathways-based markers in plant genomes. Sci Rep. 2021;11: 21300. pmid:34716373
- 8. Lowe TM, Eddy SR. tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucleic Acids Res. 1997;25: 955–964. pmid:9023104
- 9. Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32: 11–16. pmid:14704338
- 10. Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20: 3252–3255. pmid:15180927
- 11. Laslett D, Canbäck B. ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics. 2007;24: 172–175. pmid:18033792
- 12. Jühling F, Pütz J, Bernt M, Donath A, Middendorf M, Florentz C, et al. Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements. Nucleic Acids Res. 2011;40: 2833–2845. pmid:22139921
- 13. Tåquist H, Cui Y, Ardell DH. TFAM 1.0: an online tRNA function classifier. Nucleic Acids Res. Oxford University Press; 2007;35: W350–W353. pmid:17591612
- 14. Kinouchi M, Kurokawa K. tRNAfinder: a software system to find all tRNA genes in the DNA sequence based on the cloverleaf secondary structure. J Comput Aided Chem. 2006;7: 116–126.
- 15. Fujishima K, Sugahara J, Kikuta K, Hirano R, Sato A, Tomita M, et al. Tri-split tRNA is a transfer RNA made from 3 transcripts that provides insight into the evolution of fragmented tRNAs in archaea. Proc Natl Acad Sci. National Academy of Sciences; 2009;106: 2683–2687. pmid:19190180
- 16. Jühling F, Mörl M, Hartmann RK, Sprinzl M, Stadler PF, Pütz J. tRNAdb 2009: compilation of tRNA sequences and tRNA genes. Nucleic Acids Res. 2008;37: D159–D162. pmid:18957446
- 17. Chan PP, Lowe TM. GtRNAdb 2.0: an expanded database of transfer RNA genes identified in complete and draft genomes. Nucleic Acids Res. 2015;44: D184–D189. pmid:26673694
- 18. Abe T, Ikemura T, Sugahara J, Kanai A, Ohara Y, Uehara H, et al. tRNADB-CE 2011: tRNA gene database curated manually by experts. Nucleic Acids Res. 2010;39: D210–D213. pmid:21071414
- 19. Cognat V, Pawlak G, Duchêne A-M, Daujat M, Gigant A, Salinas T, et al. PlantRNA, a database for tRNAs of photosynthetic eukaryotes. Nucleic Acids Res. 2012;41: D273–D279. pmid:23066098
- 20. Das D, Zahra S, Singh A, Kumar S. PtRNAdb: A web resource of Plant tRNA genes from a wide range of plant species. bioRxiv. Cold Spring Harbor Laboratory; 2022;
- 21. Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021;49: 9077–9096. pmid:34417604