A comprehensive overview and evaluation of circular RNA detection tools

Circular RNA (circRNA) is mainly generated by the splice donor of a downstream exon joining to an upstream splice acceptor, a phenomenon known as backsplicing. It has been reported that circRNAs can function as microRNA (miRNA) sponges, transcriptional regulators, or potential biomarkers. The availability of massive non-polyadenylated transcriptome data has facilitated the genome-wide identification of thousands of circRNAs. Several circRNA detection tools or pipelines have recently been developed, and users need practical guidelines on these pipelines, including a comprehensive and unbiased comparison. Here, we provide an improved and easy-to-use circRNA read simulator that can produce simulated backsplicing reads supporting circRNAs deposited in CircBase. Moreover, we compared the performance of 11 circRNA detection tools on both simulated and real datasets. We assessed their performance regarding metrics such as precision, sensitivity, F1 score, and area under the curve (AUC). We conclude that no single method dominated on all of these metrics. Among all of the state-of-the-art tools, CIRI, CIRCexplorer, and KNIFE, which achieved better balanced performance between their precision and sensitivity, compared favorably to the other methods.


Introduction
Circular RNA (circRNA) is a class of noncoding RNA that was discovered decades ago [1]; however, its abundance and ubiquity in eukaryotes were only recognized recently [2][3][4][5] because of the advance of next-generation RNA sequencing (RNA-Seq). Although it appears that many circRNAs remain to be discovered, ongoing studies continue to demonstrate important functions of circRNA in cell physiology [6][7][8][9][10][11]. For instance, the well-known circular RNA sponge for miR-7 (ciRS-7), which originates from the vertebrate cerebellar degeneration-related 1 (CDR1) antisense transcript, has the capacity to serve as a microRNA (miRNA) sponge. It is highly expressed in human and mouse brain cells [2,12]. Given that it possesses more than 60 miR-7 binding sites [2,13], it is suggested to inhibit the binding of miR-7 to its target mRNAs. Another circRNA that can potentially also act as an miRNA sponge is derived from the murine sex-determining region Y (Sry) gene; it is a testis-specific circRNA, possessing 16 target sites for miR-138 in mouse [13]. Another function proposed for circRNAs is that they affect gene regulation by competing with linear splicing for the usage of splice sites during the cotranscription process, leading to a change in the level of gene expression [14,15]. Although most circRNAs are from exons, there are also intron-containing circRNAs. It has been reported that they are largely concentrated in the nucleus [16,17]. Intriguingly, evidence suggests that these intron-containing circRNAs enable the regulation of gene transcription in cis. Specifically, they can enhance the RNA polymerase II (Pol II) transcription activity of their parental genes, although the underlying mechanism is still not fully understood [16,17]. CircRNAs are characterized by their noncollinearity, in which a splice donor attacks an upstream acceptor, forming a covalently closed circular structure [1,2,4,12,18,19,20].
This characteristic endows them with the ability to escape exonuclease digestion, enabling them to persist longer in the cell than their linear counterparts [3,21,22]. This feature in combination with their ubiquity in cancer tissues [23,24], saliva [25,26], blood [27][28][29], and exosomes [30,31] suggests that circRNAs are promising as biomarkers for diseases. On the other hand, their circular structure also serves as a key element for the detection of circRNAs. Specifically, the backsplicing junction reads within the structure presented in RNA-Seq data facilitate the genome-wide identification of this RNA species, although other mechanisms such as genomic tandem duplication, template switching during PCR amplification, or trans-splicing between precursor mRNAs (pre-mRNAs) can also potentially generate such reads [7,19] and complicate the detection process. To resolve this issue, Jeck et al. [3] developed a biochemical strategy termed CircleSeq that involves treating samples with an exonuclease that digests linear RNAs but preserves circRNAs (RNase R). However, it has been asserted that RNase R resistance alone cannot be used to determine whether an isoform is circular or not, because some circRNAs were found to be susceptible to this exonuclease [3,19,32,33]. CircRNA expression is reported to be specific to different tissues/cell lines and developmental stages [32][33][34][35]. Despite the fact that some circRNAs have been experimentally verified to be abundantly expressed, even more highly than their linear counterparts, the vast majority of them are usually expressed at low levels [3,32,36]. This not only constitutes another challenge for their identification, but also raises doubts about their functions, indicating that the majority of them may be inert byproducts of noncanonical pre-mRNA splicing [3,32].
The advent of high-throughput next-generation sequencing technology has enabled the sequencing of hundreds of millions of short reads, and its single-base-pair resolution provides a precise and efficient way to identify circRNAs. The detection of circRNAs from RNA-Seq data can be achieved using various software packages. Approximately 11 different tools have been developed for this purpose. However, despite the development of this range of computational tools, no systematic evaluation of their performance has been performed. Although some attempts have been made to compare several of these packages [7,37] and some comparisons were included in papers by those who developed these tools [33,38,39,40,41,42], different conclusions were drawn with regard to their performance, owing to different subsets of tools being compared, different filtering strategies being applied, or diverse datasets being utilized, among others. The fact that some circRNAs are susceptible to exonucleases and most of them are expressed at low levels means that there is an inherent bias in filtering for reliable circRNA candidates based on resistance to RNase R and/or the selection of backsplicing junction reads above a specific abundance threshold [33,43]. For example, recently, Hansen et al. found that, when focusing on the top 100 most highly expressed candidates detected by one tool alone, a large number (77%-88%) of the candidates predicted by 3 of the 5 tools evaluated would be classified as artefacts, based on the criteria of RNase R resistance [37].
In this study, we perform a comprehensive evaluation of 11 different circRNA detection tools, with the aim of providing useful guidelines for researchers engaged in this field. These tools have been run and compared on 4 different datasets: (1) positive dataset: a dataset of simulated reads, encompassing a total of 14,689 circRNAs detected in HeLa cells from CircBase [44]; (2) background dataset: a large negative dataset comprised of reads generated from mRNA sequences deposited in the NCBI Reference Sequence (RefSeq) database; (3) mixed dataset, generated by combining the positive and background datasets together; and (4) real datasets. These real datasets were established by downloading 6 runs of rRNA-depleted RNA-Seq data from NCBI Sequence Read Archive (SRA), including 4 runs of RNA-Seq data from the HeLa cell line and 2 runs from an immortalized human fibroblast cell line (Hs68), of which 2 runs of RNA-Seq data from the HeLa cell line and 1 run from Hs68 were further treated with RNase R enzyme during sample preparation. The performance of the software packages was evaluated based on metrics such as sensitivity, precision, F1 measure, Area under Precision-Recall Curve (AUC), memory (i.e., Random Access Memory [RAM]) consumption, running time, and physical disk space utilized. Notably, a striking difference in the use of physical disk space was observed among these tools, which is an important factor that should not be overlooked when running software on a large dataset or several moderately sized datasets in parallel.

Evaluation with the positive dataset
As stated above, the positive dataset comprises 14,689 circRNAs derived from those detected in HeLa cells deposited in CircBase, with the number of supporting read pairs ranging from 2 to 24 and circle size varying from 51 to 846,530 base pairs (bps). We applied the 11 circRNA detection tools to identify circRNAs on this dataset. Table 1 shows that most tools achieved high precision (>94%) and varying sensitivity (52%-93%). As the F1 score (i.e., F1 = (2 × precision × sensitivity)/(precision + sensitivity)) weights precision and sensitivity equally and serves as a good metric to indicate whether a tool achieves favorable precision and sensitivity simultaneously, the performance of each method in terms of F1 was also included in Table 1. In summary, regarding the F1 measure, KNIFE, CIRI, PTESFinder, Segemehl, and CIRCexplorer were the top 5 performers on this dataset, with an F1 score above 0.85. Moreover, the effect of filtering for reliable circRNAs by increasing supporting read counts is shown in Fig 1A. In general, we observed that the precision of each tool increased with thresholds for read counts, but some highly expressed false positives were also reported by several tools, leading to a fluctuation of precision at the end. Also, we could observe that NCLScan consistently dominated other tools regarding the precision measure. Meanwhile, KNIFE, Segemehl, CIRI, PTESFinder, and CIRCexplorer achieved the best sensitivity. Consistent with the F1 measure, the same 5 methods still performed best in terms of AUC.
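As an illustration of the metrics used here, the following minimal sketch computes precision, sensitivity, and F1 for one tool on a simulated dataset where the true circRNAs are known. The function name `evaluate_calls`, the coordinate-tuple representation of a circRNA, and the toy data are assumptions made for this example; they are not taken from the evaluation scripts of this study.

```python
def evaluate_calls(predicted, truth):
    """Return (precision, sensitivity, F1) for a set of predicted circRNAs."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    denominator = precision + sensitivity
    # F1 = (2 x precision x sensitivity) / (precision + sensitivity)
    f1 = 2 * precision * sensitivity / denominator if denominator else 0.0
    return precision, sensitivity, f1


# Hypothetical toy data: each circRNA is keyed by (chromosome, start, end).
truth = {("chr1", 100, 500), ("chr1", 900, 1500), ("chr2", 50, 400), ("chr3", 10, 300)}
calls = {("chr1", 100, 500), ("chr2", 50, 400), ("chr5", 1, 99)}

precision, sensitivity, f1 = evaluate_calls(calls, truth)
print(f"precision={precision:.3f} sensitivity={sensitivity:.3f} F1={f1:.3f}")
```

Matching candidates by exact coordinates is a simplification; a real comparison may need to tolerate small differences in reported junction positions.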

Evaluation with the background dataset
The background dataset contained only reads from linear RefSeq mRNAs; therefore, the number of candidates and supporting read counts reported, which could be directly accessed from each tool's output, served as an indicator of false positive rate (Table 2). Here, NCLScan, MapSplice, CIRCexplorer, DCC, and PTESFinder tended to have a low false-positive rate, whereas Segemehl, find_circ and UROBORUS yielded the worst performance.

Evaluation with the mixed dataset
In the mixed dataset, we have 14,689 true positives. Similar to the positive dataset, metrics like precision, sensitivity, F1 measure, and AUC can be applied to evaluate the tools' performance on this dataset. The results are presented in Table 1. We can see that NCLScan maintains the highest precision, while KNIFE, CIRI, PTESFinder, CIRCexplorer, and Segemehl perform best with regard to the F1 measure. Compared with the findings on the positive dataset, as shown in Fig 2, considerable drops in precision (−7.61%, −6.62%, and −6.23%) were observed for Segemehl, find_circ, and UROBORUS, respectively, indicating that their performance was vulnerable to background noise. Meanwhile, KNIFE, CIRI, and circRNA_finder also suffered a minor loss of precision (−3.39%, −1.21%, and −0.56%, respectively). On the other hand, small decreases in sensitivity (−4.46%, −2.90%, and −0.87%) were only observed for UROBORUS, Segemehl, and KNIFE, respectively. Notably, NCLScan, CIRCexplorer, DCC, MapSplice, and PTESFinder were robust to background noise, showing no pronounced reductions in precision or sensitivity. Fig 1B shows the influence on the performance of each method when increasing the threshold for supporting read counts on this dataset. In general, NCLScan and CIRCexplorer dominated other tools regarding the precision measure, while KNIFE, CIRI, Segemehl, PTESFinder, and CIRCexplorer continued to be more sensitive than the rest of the tools. Of special note, except for NCLScan, DCC, and MapSplice, the precision of all the other tools dropped to 0 in the end, because of the highly expressed false positives reported. The highest AUC on this dataset was achieved by KNIFE (0.87), followed by CIRI (0.85), PTESFinder (0.83), Segemehl (0.80), and CIRCexplorer (0.78).
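To make the threshold analysis concrete, the sketch below recomputes precision and recall while raising the minimum number of supporting reads. It also reproduces, on toy data, the effect noted above: a single highly expressed false positive drives precision to 0 at the strictest threshold. The function `pr_curve`, the candidate identifiers, and the read counts are hypothetical and only serve this illustration.

```python
def pr_curve(read_counts, truth, thresholds):
    """Return a list of (threshold, precision, recall) tuples.

    read_counts maps each reported candidate to its supporting read count;
    truth is the set of simulated (true) circRNAs.
    """
    points = []
    for threshold in thresholds:
        kept = {cand for cand, n in read_counts.items() if n >= threshold}
        true_positives = len(kept & truth)
        # Precision is undefined when no candidate survives the threshold.
        precision = true_positives / len(kept) if kept else None
        recall = true_positives / len(truth)
        points.append((threshold, precision, recall))
    return points


# Hypothetical toy data: "fp2" is a highly expressed false positive.
truth = {"circ_a", "circ_b", "circ_c", "circ_d"}
read_counts = {"circ_a": 2, "circ_b": 5, "circ_c": 10, "fp1": 3, "fp2": 40}

points = pr_curve(read_counts, truth, thresholds=[2, 4, 8, 16])
for threshold, precision, recall in points:
    print(threshold, precision, recall)
```

An area under the precision-recall curve could be derived from such points, but the exact integration scheme used in this study is not specified here, so it is left out of the sketch.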

Evaluation with the real datasets
For the real datasets, we found that the number of circRNA candidates a tool detected correlated with the types of candidates it was able to detect. Generally, methods that were able to detect exonic, intronic, and intergenic circRNAs reported more candidates than tools that were limited to exonic and intronic circRNAs, and the latter in turn predicted more candidates than tools that only reported exonic circRNAs, with the exception that KNIFE and PTESFinder tended to be sensitive at detecting exonic circRNAs (Table 3). Since we had no information about the true or false circRNA candidates detected in these samples, we mainly assessed each method's performance from 4 perspectives. First, we calculated and compared the percentage of circRNA candidates detected by each method that were not depleted after RNase R treatment; though true circRNAs may not be enriched after RNase R treatment, we assumed that the higher the percentage a method achieves, the more reliable the method is. Second, for a specific sample, we calculated the proportions of circRNA candidates detected by a particular method that were also detected by every other method. Third, we assessed the sensitivity of each method at reads level, i.e., the number of backspliced junction reads per circRNA each method can recover. Fourth, we manually compiled a list of 282 circRNAs validated from 17 studies and checked how many of these verified circRNAs were detected by each method.
Percentage of circRNA candidates that were not depleted after RNase R treatment. After filtering for circRNA candidates with ≥2 supporting read counts, we normalized the backspliced junction read counts by sequencing depth [28,39]. Similar to Hansen et al. [37], the ratio of normalized read counts between RNase R-treated and untreated samples was calculated. As shown in Table 3, with approximately equal sequencing depth, RNase R treatment indeed enabled the detection of many more candidates on the Hs68 samples. This is also confirmed on the HeLa samples. Although the RNase R-treated HeLa sample had less than half the sequencing depth of the RNase R-untreated sample, a much larger number of candidates were detected on it by all the tools except PTESFinder, Segemehl, and UROBORUS. In addition, we found that, on both HeLa and Hs68 samples, while MapSplice was capable of recovering the largest proportion of "not depleted" candidates, CIRI was much more sensitive to detecting such candidates and ranked second regarding the proportion of "not depleted" candidates. CIRCexplorer also exhibited decent performance and was ranked third in this analysis. When we focused on the top 100 most highly expressed candidates, as in Hansen et al. [37], we found that, while find_circ, UROBORUS, and Segemehl exhibited a relatively poor performance, most tools performed similarly, with a close percentage of "not depleted" candidates detected on HeLa (65%-75%) and Hs68 (72%-80%) samples.
[Fig 1. Counts of candidates and true positives were calculated at increasing thresholds for the number of supporting reads, and the precision-recall results for each tool were further computed and depicted in the figures above. (a) Positive dataset (Inset: precision above 0.99 was detailed). (b) Mixed dataset (Inset: precision above 0.97 was detailed). https://doi.org/10.1371/journal.pcbi.1005420.g001]
Proportions of circRNA candidates detected by a specific method that were also detected by every other method. Mathematically, let $M$ be the set of all methods. For all $i, j \in M$, let $N_i$ and $N_j$ be the total numbers of candidates detected by methods $i$ and $j$, respectively, and let $C(i,j)$ be the number of common candidates detected by both methods. Then, for method $i$, the proportion of common candidates is $P(i,j) = C(i,j)/N_i$, while for method $j$, the proportion of common candidates is $P(j,i) = C(i,j)/N_j$. If a large proportion of candidates detected by one method are often detected by the other methods (i.e., $\exists\, i \in M$ such that $\forall\, j \in M \setminus \{i\}$ we have $P(i,j) \geq P_{threshold}$, e.g., $P_{threshold} = 0.5$), then the method would tend to have high precision. On the other hand, if candidates detected by one method frequently overlap with a large proportion of candidates detected by the other methods (i.e., $\exists\, i \in M$ such that $\forall\, j \in M \setminus \{i\}$ we have $P(j,i) \geq P_{threshold}$, e.g., $P_{threshold} = 0.5$), then the method is sensitive and probably includes many true positives.
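The pairwise proportions P(i,j) = C(i,j)/N_i defined above can be sketched as follows. The method names `tool_A`, `tool_B`, and `tool_C`, the integer candidate identifiers, and the helper `overlap_proportions` are hypothetical; the threshold of 0.5 follows the example value given in the text.

```python
def overlap_proportions(calls):
    """Return {(i, j): P(i, j)} with P(i, j) = C(i, j) / N_i for all i != j.

    calls maps a method name to the set of candidates it reported.
    """
    proportions = {}
    for i, set_i in calls.items():
        for j, set_j in calls.items():
            if i == j:
                continue
            common = len(set_i & set_j)  # C(i, j)
            proportions[(i, j)] = common / len(set_i) if set_i else 0.0
    return proportions


# Hypothetical toy call sets (candidate IDs are arbitrary integers).
calls = {
    "tool_A": {1, 2},
    "tool_B": {1, 2, 3, 4},
    "tool_C": {1, 2, 3, 9, 10, 11, 12, 13},
}
P = overlap_proportions(calls)
P_THRESHOLD = 0.5

# A method tends toward high precision if most of its calls are shared with every other method.
conservative = [i for i in calls
                if all(P[(i, j)] >= P_THRESHOLD for j in calls if j != i)]
# A method is sensitive if it recovers most of the calls made by every other method.
sensitive = [i for i in calls
             if all(P[(j, i)] >= P_THRESHOLD for j in calls if j != i)]
print(conservative, sensitive)
```

Note that P(i,j) and P(j,i) share the numerator C(i,j) but differ in the denominator, which is why a small call set can look precise while a large one looks sensitive.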
After filtering for candidates with ≥2 backspliced junction reads on HeLa and Hs68 RNase R-treated sample data, we assessed the proportion of candidates detected by a specific method that were also detected by every other method. We found that NCLScan was a conservative method in that a high proportion of candidates detected by this method were frequently detected by other methods, while CIRI and Segemehl were sensitive methods in that a large percentage of candidates detected by other methods were frequently detected by these 2 methods, but Segemehl tended to sacrifice much more precision, because a relatively large portion of candidates detected by this method were frequently missed by the other methods. In the case of UROBORUS, its behavior seemed to depend on the dataset. The number of candidates detected on the Hs68 RNase R-treated sample by this method was markedly smaller than by the other methods; as a consequence, a large proportion of candidates detected by the other methods were left out by this method.
Sensitivity at reads level. CircRNAs that are relatively abundant under a specific condition or at a specific developmental stage may possess important functions, and the number of backspliced junction reads is often used to quantify the expression level of circRNAs [3,24,27,34,35,36,38]. Therefore, the tools that are able to recover more of such reads will better serve this purpose. Of special note, for paired-end data, when both mates from a read pair span the same backspliced junction, which can be the case for small circRNAs, different tools use different counting methods. To our knowledge, CIRI, circRNA_finder, CIRCexplorer, KNIFE, DCC, and NCLScan count such a pair once, while find_circ, MapSplice, PTESFinder, Segemehl, and UROBORUS count it twice. In our analysis, we focused on circRNA candidates detected from RNase R-treated samples. As shown in Fig 4A and 4B, consistent results were achieved on these 2 samples. Basically, these tools can be clustered into 4 groups according to their sensitivity. MapSplice, CIRI, and PTESFinder were in the most sensitive group, followed by the group of KNIFE, find_circ, and Segemehl. CIRCexplorer, circRNA_finder, DCC (all of which are based on the STAR aligner), and NCLScan formed the third group, followed by the outlier of UROBORUS. In addition, the result was corroborated by similar analysis on the positive dataset (Fig 4C). Of note, probably due to the relatively high error rate (1%) introduced into the synthetic reads on the positive dataset and the strict filtering step applied by PTESFinder (no mismatches or indels within "n" nucleotides either side of the junction position, n = 10 in this case), PTESFinder was not as sensitive on this dataset as suggested by the real datasets. This indicates that its performance was vulnerable to sequencing error and depended on the sequencing depth.
Sensitivity for compiled experimentally validated circRNAs. We performed a broad survey of the published circRNA-related articles, from which we compiled a total of 282 experimentally verified circRNAs from 17 studies. Notably, they included circRNAs detected in various tissues or cell lines at different developmental stages, so many of them may not be expressed in HeLa or Hs68 samples. The detection of these validated circRNAs by each method on HeLa and Hs68 RNase R-treated samples is shown in Fig 5. It shows that varying numbers of validated circRNAs were lost upon filtering for candidates with ≥2 supporting reads. In addition, CIRI was found to be the most sensitive method when we only considered circRNA candidates with ≥2 supporting backspliced junction reads.
[Table note: Having obtained candidates with ≥2 supporting reads, we normalized the supporting read counts with sequencing depth and defined a candidate as "not depleted" or "significantly enriched" if its normalized read counts were not reduced or had ≥5 folds of enrichment after RNase R treatment, respectively. The percentage of "not depleted" candidates in the RNase R-untreated sample is provided. Among the top 10 and top 100 most highly expressed candidates, the number of candidates "significantly enriched" or "not depleted" after RNase R treatment are also provided.]
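The "not depleted" and "significantly enriched" labels used in the RNase R analysis depend only on depth-normalized read counts, so the rule can be sketched in a few lines. The function `classify_candidate`, its parameter names, and the example depths are hypothetical. Applying the ≥2-read filter to the untreated sample is an assumption of this sketch, as is treating "not reduced" as a treated/untreated ratio of at least 1.

```python
def classify_candidate(untreated_reads, treated_reads,
                       untreated_depth, treated_depth,
                       min_reads=2, enrich_fold=5):
    """Label one circRNA candidate by its response to RNase R treatment.

    Depths are total sequenced reads per sample. Returns None when the
    candidate fails the supporting-read filter (assumed here to apply to
    the untreated sample).
    """
    if untreated_reads < min_reads:
        return None
    # Ratio of depth-normalized counts (treated / untreated), written as a
    # single division so that the normalization constants cancel.
    ratio = (treated_reads * untreated_depth) / (untreated_reads * treated_depth)
    if ratio >= enrich_fold:
        return "significantly enriched"  # also counts as "not depleted"
    if ratio >= 1:
        return "not depleted"
    return "depleted"


# Hypothetical example: the treated sample has half the depth of the untreated one.
UNTREATED_DEPTH, TREATED_DEPTH = 40_000_000, 20_000_000
labels = [
    classify_candidate(4, 10, UNTREATED_DEPTH, TREATED_DEPTH),
    classify_candidate(6, 3, UNTREATED_DEPTH, TREATED_DEPTH),
    classify_candidate(10, 1, UNTREATED_DEPTH, TREATED_DEPTH),
    classify_candidate(1, 8, UNTREATED_DEPTH, TREATED_DEPTH),
]
print(labels)
```

The second example shows why normalization matters: the raw count halves after treatment, but so does the sequencing depth, so the candidate is not depleted.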

Computational cost overview
We evaluated the computational efficiency of each software package using the metrics of runtime, RAM consumption, and physical disk space. We found that the computational cost not only correlated with the sequencing depth, but was also affected by the abundance of circRNA candidates detected in each sample. As Fig 6A shows, when running on large datasets (i.e., Hs68_RNaseR− and Hs68_RNaseR+) with an equal number of 3 threads allocated, only CIRCexplorer, circRNA_finder, DCC, CIRI, and find_circ could finish within a day or so, while MapSplice took about 13 days and a month to finish Hs68_RNaseR− and Hs68_RNaseR+, respectively. The relatively long runtime of MapSplice was also confirmed by the study of Hansen et al. [37]. For Central Processing Unit (CPU)-intensive tools such as MapSplice, we recommend that users run these tools on servers with adequate processors allocated to reduce running time. Regarding memory consumption, only UROBORUS and find_circ were capable of processing large datasets on a standard PC equipped with 8 gigabytes (GB) of RAM. NCLScan consistently required approximately 10 GB, while CIRCexplorer, circRNA_finder, and DCC needed about 27 GB to run the underlying STAR aligner. Segemehl was the least memory efficient, consistently demanding about 50 GB. For other software packages, a moderate or sharp increase in memory consumption was observed when the dataset shifted from moderate to large size (Fig 6B). In addition, MapSplice, PTESFinder, KNIFE, Segemehl, CIRI, and NCLScan were found to be the 6 least efficient software packages regarding physical disk space usage (Fig 6C), indicating that users should prepare adequate computational resources before running these pipelines on large datasets.

Discussion
The global and accurate identification of circRNAs from RNA-Seq data serves as a fundamental step towards revealing their biogenesis and functions. Here, we provided an improved and easy-to-use circRNA read simulator that we believe will benefit the circRNA research community. Besides, we performed a comprehensive evaluation of 11 circRNA detection tools using synthetic and real datasets based on multiple metrics such as sensitivity, precision, F1 measure, AUC, RAM consumption, running time, and physical disk space used.
Taken together, we observed concordant results on the synthetic and real datasets. Generally, CIRI, CIRCexplorer, and KNIFE, which achieved better balanced performance between their precision and sensitivity, compared favorably to the other methods, whereas NCLScan and MapSplice were conservative methods with comparable precision but less favorable sensitivity. Conversely, Segemehl was sensitive but suffered from the presence of many false positives in the output. Together with find_circ and UROBORUS, these 3 methods exhibited the worst precision based on our comparisons with the background and real datasets. The performance of PTESFinder was noticeable on the synthetic dataset but less pronounced on the real datasets. Its performance tended to be variable depending on the dataset. Also, we found that CIRI and MapSplice were the most sensitive methods to recover backspliced junction reads for candidates detected. For the positive dataset used, it should be noted that we might have introduced bias into this dataset when we generated backspliced junction reads for candidates deposited in CircBase. Specifically, the HeLa circRNAs used were reported in the Salzman 2013 study [36], so the dataset may be biased in favor of KNIFE. But the high recall rate achieved by KNIFE on this dataset could also be attributed to its high sensitivity, as demonstrated in Szabo et al. [33]. For KNIFE, it should also be pointed out that the output from its de novo module was not incorporated in our study; thus, the sensitivity of this tool may be underestimated in our analysis on real datasets. In addition to the performance factor, practical issues may also affect the choice of an optimal tool. For instance, Hansen et al. failed to run KNIFE and Segemehl in their study [37]. In our experience, the installation process is generally more complicated for tools with more dependencies (Table 4).
Also, great differences in computational cost were observed among these tools (Fig 6); practitioners need to pay attention to this when analyzing large datasets. Finally, a comprehensive manual and user-friendly output would be beneficial to the users. Though most tools included detailed user guides, some only provided limited descriptions (e.g., circRNA_finder and PTESFinder). Besides, the backspliced junction reads provided critical information for further scrutinizing the authenticity of a candidate of interest; however, only some of the tools (e.g., CIRI and KNIFE) included the identifiers of such reads in the output.
Of all the methods evaluated here, no single tool dominates on all the metrics used, and there is still much space for further improvement regarding methods for the global detection of these noncollinear molecules from RNA-Seq data. For example, all of them were originally designed to identify circRNAs originating from the same gene locus, but a recent report [45] pointed out that circRNAs could derive from gene fusion events and potentially play a critical role in cancer pathogenesis. These fusion circRNAs are overlooked by the current methods, underscoring the complexity of the RNA world and the need to refine the existing methods or develop new ones for circRNA detection.

Software packages for detecting circRNAs
To our knowledge, about 11 tools are now available for the detection of circRNAs from RNA-Seq data and can be broadly divided into two groups according to the underlying strategies to detect circRNAs [7,10,19] (Table 4). For instance, KNIFE, NCLScan, and PTESFinder all require putative circRNA sequences, constructed with gene annotation information, in order to detect circRNAs. This strategy was called "pseudo-reference-based" in [7,10] or the "candidate-based" approach in [19]. However, the difference is that KNIFE directly constructs all the potential out-of-order exon-exon junction sequences from gene annotation information before alignment, while NCLScan and PTESFinder create the putative circRNA sequences by leveraging the mapping information of the segmented anchors obtained after alignment to the genome or transcriptome. The strategy used by the other tools was called "fragmented-based" in [7,10] or the "segmented read approach" in [19]; it identifies backsplicing junctions from the mapping information of a multiple-split read's alignment to the genome. Under this category, circRNA_finder, CIRCexplorer, DCC, MapSplice, and Segemehl can be assigned to a subgroup, because they use spliced alignment algorithms to detect and parse the backsplicing events, whereas find_circ and UROBORUS can be grouped together, as they both gather the unmapped reads after mapping them to the genome, extract the first and last 20 bp anchors from the unmapped reads, and then derive the backsplicing events from the mapping information of these anchors. Finally, CIRI is unique. It detects the paired chiastic clipping (PCC) signals from the mapping information of reads by local alignment with BWA-MEM [46] and combines this with systematic filtering steps to remove potential false positives.
For evaluation of their performance, these tools and associated software packages were deployed on an Ubuntu 10.04 server, equipped with 2 Intel(R) Xeon(R) E5530 CPUs and 102 GB of RAM. We followed the instructions and recommendations provided in their manuals and focused on output circRNAs with ≥2 backspliced junction reads. Here, we provide a brief summary of these software packages. For details of the algorithms underlying each tool, users can refer to the papers introducing each method.
circRNA_finder [34] requires paired-end sequencing data and relies on the RNA-Seq spliced alignment software STAR [49]. After read alignment, the output putative chimeric junction reads are filtered and collapsed into a set of putative circularization junctions based on the following restrictions: (1) At most, 3 mismatches are allowed, and only uniquely mapped reads are used. (2) The distance between the splice donor and acceptor should be less than 100 kilobases (kbs). (3) One read in a pair should span the backsplicing junction site, while the other should be mapped within the interval between the splice donor and acceptor. In this study, neither circRNA candidates without GT/AG splice sites nor those derived from mitochondria were taken into consideration.
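The circRNA_finder restrictions listed above, together with the GT/AG and mitochondrial exclusions applied in this study, amount to a simple predicate over candidate junctions. The sketch below is an illustration only and is not circRNA_finder's code: the `Junction` record, its field names, and the mitochondrial chromosome labels `chrM`/`MT` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Junction:
    """Hypothetical summary of one chimeric junction read pair."""
    chrom: str
    donor: int                  # genomic position of the splice donor
    acceptor: int               # genomic position of the splice acceptor
    mismatches: int
    uniquely_mapped: bool
    mate_within_interval: bool  # mate maps between donor and acceptor
    motif: str                  # splice signal, e.g. "GT/AG"


def passes_filters(j, max_mismatches=3, max_span=100_000):
    """Apply restrictions (1)-(3) plus the GT/AG and mitochondrial exclusions."""
    return (
        j.mismatches <= max_mismatches            # (1) at most 3 mismatches
        and j.uniquely_mapped                     # (1) uniquely mapped reads only
        and abs(j.donor - j.acceptor) < max_span  # (2) span below 100 kb
        and j.mate_within_interval                # (3) mate inside the circle
        and j.motif == "GT/AG"                    # canonical splice sites only
        and j.chrom not in {"chrM", "MT"}         # assumed mitochondrial labels
    )


good = Junction("chr1", 20_500, 10_000, 1, True, True, "GT/AG")
too_long = Junction("chr1", 160_000, 10_000, 0, True, True, "GT/AG")
mito = Junction("chrM", 3_000, 1_000, 0, True, True, "GT/AG")
print(passes_filters(good), passes_filters(too_long), passes_filters(mito))
```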
CIRCexplorer [15] is a Python-based tool, providing user-friendly circRNA detection output. Initially, it uses TopHat [50] to do the spliced alignment of reads to genome, then extracts the unmapped reads to detect backsplicing events by alignment with TopHat-Fusion [51]. Reads that are split and mapped to the same chromosome but in reverse order are candidate backspliced junction reads. The mapping positions of these reads are realigned and adjusted if needed, in order that the donor and acceptor splice sites derived are consistent with canonical splice sites from known gene annotation. Currently, it also supports parsing the STAR spliced alignment output. Since TopHat and TopHat-Fusion require much more time to complete the alignment step, in this study, we took the intermediate STAR alignment output generated during running circRNA_finder as the input for CIRCexplorer. As a consequence, the computational costs (i.e., RAM, running time, and physical disk space) of these 2 pipelines were almost equivalent because the computational resources required by them were negligible during the circRNA detection phase compared with that in the process of read alignment.
Another software package that utilized STAR alignment software and is evaluated in this study is DCC [42]. However, to improve the detection of small circRNAs, in addition to the usual alignment of read pairs from paired-end data as a whole, the DCC pipeline also recommends that users run an alignment of each segment from read pairs separately. This almost doubles the computational cost during the alignment phase. During the process of detection, several filtering steps are applied: (1) If paired-end data are being dealt with, mapping of mates should match with the relevant circRNA. (2) If biological replicates are available, filtering for common circRNAs detected by these replicates would be allowed. (3) The canonical GT/AG splicing signal should be presented in the backsplicing junction borders. (4) Backsplicing events from mitochondria are discarded. (5) Candidate circRNAs from repetitive or homologous regions are removed.
Find_circ [2] is one of the 5 tools evaluated in this study that utilizes Bowtie2 [52] and/or Bowtie [52] to perform read alignment. In short, first, it collects the unmapped reads generated during the first round of alignment to the genome. Second, it extracts the first and last 20-bp anchors from each unmapped read to perform the second alignment. If the 2 anchors are mapped to positions within spliced exons in an opposite orientation, it indicates circRNA splicing. Third, it extends the anchors' alignment, collects and outputs the identified splice junctions, and keeps those junction-spanning reads. Finally, it applies a series of filtering steps to check and report reliable circRNA candidates.
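The anchor step described above can be sketched as follows. This is a simplified illustration and not find_circ's implementation: the functions `split_anchors` and `looks_backspliced`, the `(chromosome, strand, position)` tuple for an anchor alignment, and the rule that a reversed anchor order on the same chromosome and strand indicates a backsplice are all assumptions made for this example.

```python
ANCHOR_LEN = 20  # anchor length stated in the text


def split_anchors(read_seq, anchor_len=ANCHOR_LEN):
    """Return (head, tail) anchors of an unmapped read, or None if it is too short."""
    if len(read_seq) < 2 * anchor_len:
        return None
    return read_seq[:anchor_len], read_seq[-anchor_len:]


def looks_backspliced(head_hit, tail_hit):
    """Check whether two anchor alignments are in reversed genomic order.

    Each hit is a hypothetical (chromosome, strand, position) tuple.
    """
    head_chrom, head_strand, head_pos = head_hit
    tail_chrom, tail_strand, tail_pos = tail_hit
    if head_chrom != tail_chrom or head_strand != tail_strand:
        return False
    if head_strand == "+":
        # Linear splicing on the + strand places the head upstream of the tail.
        return head_pos > tail_pos
    # On the - strand the expected linear order is mirrored.
    return head_pos < tail_pos


anchors = split_anchors("A" * 25 + "C" * 25)
print(anchors)
print(looks_backspliced(("chr1", "+", 5000), ("chr1", "+", 1000)))
```

In the actual pipeline the anchor alignments are then extended to locate the exact junction, and further filters are applied before a candidate is reported.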
UROBORUS [38] is also a circRNA-detection pipeline based on the Bowtie RNA-Seq alignment tool. First, it employs TopHat to perform splice alignment. Second, it collects the first and last 20 bp of an unmapped read as anchors and realigns these anchors using TopHat to gather balanced mapped junctions and unbalanced mapped junctions. Third, these 2 types of junction-spanning anchors are separately handled to infer the potential backspliced junction reads. In the end, the reads obtained above are further aligned to the genome using Bowtie; those that map to the same chromosome but in reverse orientation are annotated as candidate circRNA-derived reads.
PTESFinder [41] employs both Bowtie and Bowtie2 to perform read alignment. It only detects backsplicing junctions stemming from known splice sites. Intriguingly, it does not make use of the paired-end information, even if it is available. The detection process can be summarized as follows. First, it extracts the first and last 20-bp anchors from a read and aligns them to transcriptome reference sequences. Second, it exploits the anchors' mapping information to detect exon-shuffling events and generates the putative circRNA sequences flanking the backsplicing junction sites. Third, it aligns the original reads to the putative circRNA sequences, the genome, and the transcriptome. Finally, to eliminate potential false positives, it requires that a read's mapping score against the putative circRNA sequences be greater than its scores against the genome or transcriptome, and that user-adjustable criteria on the mapped reads be satisfied before a putative circRNA sequence is considered supported.
KNIFE [33] starts by mapping reads to the genome, rRNA sequences, transcriptome, and customized linear and backspliced junction databases separately, with the help of Bowtie2. It discards possible backspliced junction reads when they also map with high scores to the other databases mentioned above. For the remaining backspliced junction-spanning reads, it further categorizes them into circRNA and decoy reads based on the mapping information of the mate when paired-end data are available. Finally, for reads aligned to none of the databases mentioned above, it falls back on a de novo analysis module to detect circRNAs derived from unannotated splice sites. However, the break points of these inferred circRNAs are window- or bin-based; in other words, they are not exact break points, so we did not incorporate these circRNA candidates into this study. The major advantage of KNIFE over other tools, according to its developers, is its method of filtering for high-confidence circRNAs. It employs a statistical framework to obtain the posterior probability of every circular read collected and subsequently predicts whether it is a true or false positive.
Unlike the above circRNA detection tools based on Bowtie, which require extracting a fixed size of anchors from the unmapped reads to identify potential backspliced junctions, the underlying BWA-MEM aligner of CIRI [39] can automatically determine the break points of query reads derived from circRNAs. After BWA-MEM alignment, CIRI scans the alignment results twice. Briefly, during the first scan, it collects the PCC signals supporting the backsplicing junctions and appropriate paired-end mapping signals consistent with the circRNA templates. Then, it checks and filters for those junction signals with canonical GT/AG splice sites (if a gene annotation file is provided, other possible splice-site signals flanking exon boundaries in this file will also be considered). During the second scan, it further clusters the unbalanced junction reads missed during the first scan by applying a dynamic programming alignment algorithm and also filters out those potential false positive junctions derived from repetitive or homologous regions.
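To make the idea of a junction signal from a local aligner concrete, the sketch below checks whether two local alignments of one read form a complementary, reverse-ordered pair. It rests on several assumptions that are not stated in the text: that a junction read is reported as two SAM records whose CIGAR strings have the shapes `xMyS` and `xSyM` (soft or hard clipped), that the clipped and matched lengths are exactly complementary, and that each alignment is given as a hypothetical `(chrom, strand, pos, cigar)` tuple. It is not CIRI's algorithm, which applies further checks.

```python
import re

_CIGAR_OP = re.compile(r"(\d+)([MSH])")


def parse_simple_cigar(cigar):
    """Parse a CIGAR made only of M/S/H operations into [(length, op), ...].
    Returns None for anything else (indels, skips, empty strings)."""
    ops = _CIGAR_OP.findall(cigar)
    if not ops or "".join(n + op for n, op in ops) != cigar:
        return None
    return [(int(n), op) for n, op in ops]


def is_reverse_ordered_pair(aln_a, aln_b):
    """Return True if the two alignments look like a backsplice junction.

    SAM records are expressed in forward genomic orientation, so the segment
    that comes first in the stored read (xM then clip) must map downstream
    of the segment that comes second (clip then xM), on the same chromosome
    and strand.
    """
    head = tail = None
    for aln in (aln_a, aln_b):
        ops = parse_simple_cigar(aln[3])
        if ops is None or len(ops) != 2:
            return False
        (n1, op1), (n2, op2) = ops
        if op1 == "M" and op2 in "SH":
            head = (aln, n1, n2)   # matched length, clipped length
        elif op1 in "SH" and op2 == "M":
            tail = (aln, n2, n1)   # matched length, clipped length
    if head is None or tail is None:
        return False
    (h_aln, h_match, h_clip), (t_aln, t_match, t_clip) = head, tail
    if h_aln[0] != t_aln[0] or h_aln[1] != t_aln[1]:
        return False
    if h_match != t_clip or h_clip != t_match:
        return False               # segments must tile the read exactly
    return h_aln[2] > t_aln[2]     # first segment maps downstream
```

Requiring exact complementarity is a simplification chosen for this sketch; real alignments can overlap by a few bases at the junction.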
MapSplice [48] is one of the 3 software packages evaluated in this study that are able to identify multiple types of splice junction events. Specifically, it is a de novo splice-mapping program that segments reads into multiple anchors to detect canonical and noncanonical junctions in RNA-Seq data. This algorithm was applied in the study by Jeck et al. [3] and detected more than 25,000 distinct circRNA species in human fibroblasts that were resistant to RNase R. This tool is memory efficient when running on an RNA-Seq dataset with a regular sequencing depth but takes longer to run than all of the other packages presented here when equivalent numbers of threads are allocated.
Segemehl [47] is also a multi-split RNA-Seq mapping tool that can identify circRNA, canonical splicing, trans-splicing, and gene fusion events. It is claimed to be more sensitive than its counterparts at detecting these events. Notably, Segemehl consumes a large amount of RAM when the reference genome is large, for example, approximately 50 GB for the human genome. In such cases, runs on computers with a small memory allocation will fail.
NCLscan [40] is another RNA-Seq analysis tool that is claimed to be accurate at identifying noncollinear transcripts such as trans-splicing, fusion, and circRNAs from transcriptome data. One of the key steps in this pipeline is to construct the putative noncollinear references with gene annotation information and BLAT alignment output of the concatenated sequences from unmapped read pairs. To eliminate false positives, it undertakes several stepwise alignments and filtering, integrating different aligners such as BWA [53], BLAT [54], and license-required Novoalign (www.novocraft.com).

Datasets used
Table 5 provides a summary of all of the datasets used. Detailed descriptions of these datasets are provided below.
Positive dataset. The positive dataset contains a total of 1,071,113 pairs of synthetic reads, with a read length of 101 bp and an insert size of 350 bp. These synthetic data encompass a total of 14,689 circRNA species, which were produced from those reportedly detected in HeLa cells. The number of backspliced junction read pairs supporting each circRNA ranges from 2 to 24, while the size of the circRNAs varies from 51 to 846,530 bp. This simulated positive dataset was generated by an improved version of the circRNA read simulation tool CIRI-simulator [39], which was originally released with CIRI. To generate realistic circRNA reads, we overhauled this simulation tool. It now supports generating reads from circRNAs deposited in CircBase, which are far more appropriate circRNA candidates than those generated by joining 2 randomly chosen out-of-order exons. This accurate and easy-to-use simulation tool, which we believe will benefit the circRNA research community, can be accessed at: https://github.com/linatbeishan/circRNA_detection_review.
Background dataset. Simulated paired-end RNA-Seq data were generated with a widely used read simulator, ART [55]. Briefly, the RefSeq mRNA sequences were first downloaded from the UCSC Genome Browser, then the simulator was executed using the downloaded sequences as the input database. Indel and substitution variants were introduced into the generated reads. Specifically, to take the influence of poor sequencing quality into consideration, we shifted down the quality scores of the reads to increase substitution sequencing errors. Finally, a large negative dataset with a read length of 101 bp and an insert size of 350 ± 10 bp was generated (the command used: art_illumina -ss HS25 -d simulate -na -i refMrna.fa -o simulate -l 101 -f 200 -p -m 350 -s 10 -sp -rs 20160830 -qs -13 -qs2 -13).
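For the positive dataset described above, the defining property of a backspliced junction read is that it is sampled from a circular template and therefore runs past the 3' end of the circRNA sequence back into its 5' end. The sketch below illustrates that wrap-around sampling only. It is not the code of CIRI-simulator; the function names and 0-based coordinates are assumptions made for the example, and sequencing errors and mate pairs are ignored.

```python
def circular_read(circ_seq, start, read_len):
    """Sample read_len bases from a circular template, starting at the
    0-based offset `start` and wrapping around the junction as needed.
    A read longer than the template wraps more than once."""
    n = len(circ_seq)
    return "".join(circ_seq[(start + i) % n] for i in range(read_len))


def spans_junction(circ_len, start, read_len):
    """Return True if a read starting at `start` crosses the backsplice
    junction, i.e. continues from the last base of the linearized circRNA
    sequence into its first base."""
    return (start % circ_len) + read_len > circ_len
```

With the figures quoted in the text, a 101-bp read drawn from the smallest circRNA (51 bp) necessarily crosses the junction at least once, which is why the modulo indexing allows multiple wraps.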
Mixed dataset. The mixed dataset was generated by combining the background and positive datasets to further evaluate the performance of each method.
Real datasets. We included 6 runs of real datasets produced in 2 independent circRNA-related studies [3,39]. The first is the HeLa cell-line dataset, which was also used in Chen et al. [7]. This dataset comprises 4 runs of rRNA-depleted RNA-Seq libraries downloaded from the NCBI Sequence Read Archive (accession numbers SRR1636985, SRR1636986, SRR1637089, and SRR1637090). Specifically, SRR1636985 and SRR1636986 are from samples further treated with RNase R enzyme after rRNAs had been depleted. Therefore, we combined SRR1636985 and SRR1636986 as a HeLa_RNaseR+ sample and SRR1637089 and SRR1637090 as a HeLa_RNaseR− sample. After cleaning the raw data, approximately 80.5 million and 36.8 million PE101 read pairs remained for HeLa_RNaseR− and HeLa_RNaseR+, respectively. Furthermore, to eliminate possible bias from data generated by a single group and to assess the performance of each method on large datasets, we incorporated another 2 runs of deep-sequencing PE100 RNA-Seq data derived from Hs68 cells (accession numbers SRR444975 and SRR445016), which were also used in [37]. Both runs were rRNA-depleted, but SRR445016 was from samples additionally treated with RNase R enzyme. After cleaning, the numbers of remaining read pairs for SRR444975 and SRR445016 were approximately 202.5 million and 196.4 million, respectively.