A PCR-free rapid protocol for one-pot construction of highly diverse genetic libraries

Abstract

In vitro protein display methods can access extensive libraries (e.g., 10^12–10^14 variants) and play an increasingly important role in protein engineering. However, the preparation of large libraries remains a laborious and time-consuming process. Here we report an efficient one-pot ligation & elongation (L&E) method for sizeable synthetic library preparation free of PCR amplification or any purification steps. As a proof of concept, we constructed an ankyrin repeat protein templated synthetic library with 10^11 variants in 150 μL volume. The entire process from the oligos to DNA template ready for transcription is linearly scalable and took merely 90 minutes. We believe this L&E method can significantly simplify the preparation of synthetic libraries and accelerate in vitro protein display experiments.

Introduction

Protein therapeutics have a potentially limitless ability to improve how we treat diseases. However, developing new therapeutic proteins remains a long and challenging process. One of the most effective protein engineering strategies currently is directed evolution. This process mimics the natural evolution and enriches protein variants with the desired functions from large libraries of variants [1, 2]. Since the likelihood of identifying a desired variant is directly proportional to the size of the library, large protein libraries are highly desirable in directed evolution.

A prerequisite of directed evolution is protein display, a process that couples phenotype (protein function) with genotype (DNA/RNA). Many display technologies have been developed to provide this linkage. The library size afforded by cell-based protein display technologies such as phage, E. coli, and yeast display is limited by the cell transformation efficiency and is typically <10^10, a library size that is relatively straightforward to produce with PCR-based methods [3–9]. By eliminating the transformation step, cell-free protein display technologies such as mRNA [10] and ribosome [11] display can screen much larger libraries with 10^12–10^14 different variants and are potentially superior to cell-based methods. However, preparation of libraries with 10^12–10^14 genetic diversity is both time-consuming and laborious [12]. For a 500 bp gene, a typical PCR reaction (e.g., 50 μL) using 10 ng of DNA template has a maximum diversity of approximately 2 × 10^10 molecules (10 ng / (500 bp × 660 g/mol/bp) × 6 × 10^23 molecules/mol). Thus, a library with 10^12–10^14 variants would require 2.5–250 mL of PCR reaction. The complexity in library preparation is further compounded in the case of antibody or scaffold libraries whose variable regions are scattered throughout the gene, necessitating multiple rounds of PCR reactions followed by gel purification and subsequent ligation of these PCR products to reconstitute the entire gene. For example, Kondo, et al. recently reported the synthesis of a monobody library with 10^13 variants that required multiple 15 mL PCR reactions followed by digestion, ligation, and a final amplification in a 60 mL PCR reaction [13, 14]. According to the authors, the entire process took six months to complete [13].
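The diversity estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the approximate value of 660 g/mol per base pair used in the text; the function names are illustrative and not part of the original work.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 660.0    # approximate molar mass of one dsDNA base pair


def template_molecules(mass_ng: float, length_bp: int) -> float:
    """Number of dsDNA molecules in `mass_ng` of a template `length_bp` long."""
    moles = (mass_ng * 1e-9) / (length_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


def pcr_volume_ml(target_diversity: float, per_reaction: float,
                  reaction_ul: float = 50.0) -> float:
    """Total PCR volume (mL) needed if each reaction carries `per_reaction` molecules."""
    return target_diversity / per_reaction * reaction_ul / 1000.0


# 10 ng of a 500 bp template: about 1.8e10 molecules, rounded to 2e10 in the text.
per_reaction = template_molecules(10, 500)

# Using the rounded figure of 2e10 molecules per 50 uL reaction:
low_ml = pcr_volume_ml(1e12, 2e10)    # volume for 10^12 variants
high_ml = pcr_volume_ml(1e14, 2e10)   # volume for 10^14 variants
print(f"{per_reaction:.2e} molecules per reaction; {low_ml}-{high_ml} mL of PCR")
```

The calculation treats every template molecule as a distinct variant, so it gives an upper bound on diversity rather than a measured value.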

To simplify the preparation of large libraries for in vitro protein display, we here report an alternative Ligation & Elongation (L&E) method. To demonstrate this method, we used the Regulatory Factor X-associated Ankyrin-containing protein (RFXANK) [15, 16], a human ankyrin protein (named HARPin here), as a model binder scaffold. HARPin shares high structural homology to the designed ankyrin repeat protein (DARPin) previously developed by Pluckthun and co-workers [17, 18] and consists of 3 ankyrin repeat domains sandwiched between N- and C-capping domains (Fig 1A). The entire HARPin gene has 516 bp. We used a ligation-mediated strategy similar to that described previously [19, 20], which utilizes short antisense splint oligos that anneal at the junctions of adjacent forward strand fragments to generate the full-length gene. Nine sense oligos spanning the entire length of the HARPin gene and the T7 promoter are synthesized (i.e., NF1-9) and evenly divided into three groups (Fig 1B). The three oligos within each group are first ligated together with the help of short splint oligos (i.e., SP1-8, Fig 1C, step 1). All sense oligos, except for NF1, are phosphorylated at the 5’ end, while all splint oligos contain 3’ phosphate groups to prevent their ligation. In designing the oligos, we used the following criteria: i) total length <90 nucleotides, ii) splint oligo Tm ~40°C for the annealing region with each sense oligo, and iii) splint oligos bind the framework region of the binding scaffold and include no randomized codons. Next, the three ligation reactions are combined, and additional splint oligos are added to promote the formation of a long single-stranded DNA representing the entire gene (Fig 1C, step 2). Finally, primer CR, which anneals to the 3’ end of the target gene, and a polymerase with strong strand displacement activity (i.e., Bst 2.0) are added to generate the full-length double-stranded gene (Fig 1C, step 3). Although many partial gene fragments exist in the final reaction mixture, only the full-length gene contains the antisense strand of the T7 promoter, enabling the reaction product to be used directly for mRNA synthesis without any purification (Fig 1C, step 4).
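The two-stage grouping described above can be sketched in code. This is an illustrative sketch only: it assumes that splint SPj bridges the junction between NFj and NF(j+1), a convention inferred from the group listings in Fig 1 rather than stated explicitly, and the function name plan_assembly is hypothetical.

```python
def plan_assembly(n_sense: int = 9, group_size: int = 3):
    """Assign splints to the in-group ligation (Step 1) or the joining ligation (Step 2).

    Assumption: splint SPj spans the junction between sense oligos NFj and NF(j+1).
    """
    sense = [f"NF{i}" for i in range(1, n_sense + 1)]
    groups = [sense[i:i + group_size] for i in range(0, n_sense, group_size)]
    step1 = [[] for _ in groups]   # splints used inside each group
    step2 = []                     # splints that join neighbouring groups
    for j in range(1, n_sense):    # junction j joins NFj to NF(j+1)
        splint = f"SP{j}"
        if j % group_size == 0:
            step2.append(splint)   # junction falls on a group boundary
        else:
            step1[(j - 1) // group_size].append(splint)
    return groups, step1, step2


groups, step1, step2 = plan_assembly()
for members, splints in zip(groups, step1):
    print("Step 1:", members, "+", splints)
print("Step 2 adds:", step2)
```

Under that assumption the plan matches the workflow in the text: six splints act within the three groups in Step 1, and the two remaining splints are added in Step 2 to join the groups.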

Fig 1. Overview of Ligation & Elongation (L&E) library preparation method.

(A) Structural alignment of HARPin (pdb: 3v30, yellow) and DARPin (pdb: 4j8y, blue). Residues in red and green color are randomized in the HARPin and DARPin libraries, respectively. (B) Schematic of a HARPin library and the oligos. Red x’s mark locations of randomized residues, which correspond to red residues in panel A. Green circles indicate the presence of 5’ phosphate moiety; red arrows indicate 3’ phosphate moiety. AR: ankyrin repeat. (C) L&E library construction workflow. The groups in Step 1 are as follows: Group 1: NF1, NF2, NF3, SP1, and SP2; Group 2: NF4, NF5, NF6, SP4, and SP5; and Group 3: NF7, NF8, NF9, SP7, and SP8. NF: N-terminal forward, SP: Splint, CR: C-terminal reverse.

https://doi.org/10.1371/journal.pone.0276338.g001

To demonstrate the L&E method, a sizeable HARPin library was prepared. In Step 1, three 30 μL ligation reactions, each containing 15 pmoles of the respective sense and splint oligos, were carried out at 37°C for 30 minutes, resulting in a dominant single-stranded DNA product corresponding to the ligation of the three sense oligos within each group (Fig 2A, yellow arrows). Next, in Step 2, the three ligation reactions were combined along with additional splint oligos in 100 μL reaction volume to generate the single-stranded full-length gene product. After a 30-minute incubation at 37°C, multiple new bands were visible on the gel. The largest product is presumed to correspond to the ligation product of all the sense oligos (620 nt), which includes the regions for the T7 promoter and ribosome binding site (RBS) (Fig 2A, green arrow). The final ligation reaction also contained multiple smaller DNA products from incomplete ligation of multiple sense oligos. In Step 3, 15 pmoles of primer CR, which contains a region necessary for hybridizing the puromycin-containing oligo, and 48 units of Bst 2.0, a polymerase with strong strand displacement activity, were added together with dNTPs. The reaction (150 μL) was carried out at 65°C for 30 minutes to generate the full-length double-stranded gene (693 bp). Step 3 produced a smear of DNA products ranging from approximately 100 bp to 700 bp (Fig 2B, lane 1). This is unsurprising given that the product of Step 2 contained multiple DNA fragments as well as excess NF oligos that can anneal to each other to yield an array of dsDNA of varying size.

Fig 2. Library preparation results.

(A) Single-stranded DNA products of Steps 1 and 2 visualized on a 15% TBE-Urea gel stained with SYBR Green. Yellow arrows mark the three-way ligation products of Step 1, and the green arrow marks the presumed complete ligation product after combining all three groups. (B) Double-stranded DNA products from Step 3 visualized on a 1% agarose gel stained with ethidium bromide. Lanes 1 and 2 used DNA template of Ligation 2 (Step 2) produced in the presence and absence of ligase, respectively. Lane 3 (PC) contains gel-purified full-length HARPin double-stranded DNA (693 bp) amplified by PCR as a positive control. (C) mRNA product from Step 4 visualized on a denaturing agarose gel (1.5% agarose, 1% bleach) stained with EtBr. Lanes 1–3 contained the corresponding transcription product from panel B. The full-length mRNA contains 669 nt. The gels are representative of three independent experiments.

https://doi.org/10.1371/journal.pone.0276338.g002

To demonstrate the utility of the DNA library without the need for purification, five microliters of the reaction product of Step 3 was used directly for mRNA transcription (Step 4). This in vitro transcription reaction routinely yielded 108–217 μg mRNA in 20 μL reaction volume. As anticipated, and despite the DNA template exhibiting a smearing pattern, the in vitro transcription reaction produced a single RNA product of the expected size (669 nt, Fig 2C, lane 1), supporting our hypothesis that only the full-length gene contains the antisense DNA of the T7 promoter and that the nonspecific and shortened DNA fragments produced during library creation do not interfere with mRNA transcription. Thus, transcribed mRNA can be purified using a commercial RNA clean-up kit (e.g., Zymo Research RNA Clean & Concentrator kit) and be eluted in a buffer appropriate for the subsequent in vitro protein synthesis reaction.

The amount of the full-length double-stranded DNA product from Step 3 was quantified by qPCR (Fig 3). Starting from 15 pmoles (or 9 × 10^12 molecules) of each sense oligo in Step 1, we routinely obtained 4.25 × 10^11–1.07 × 10^12 molecules of the double-stranded full-length gene product in the 150 μL reaction volume of Step 3, corresponding to an overall full-length gene ligation yield of 4.7–11.8%. Since the overall ligation efficiency decreases with the number of ligation events, increasing the length, and thus reducing the number, of sense oligos may lead to a higher overall ligation efficiency. Importantly, the entire process of assembling the DNA library took one and a half hours and the reaction volume can be easily scaled up linearly for larger libraries. Moreover, it is conceivable that higher concentrations of oligos can be used during the ligation, making it possible to further reduce the reaction volume for larger libraries. This contrasts with the conventional PCR-based method used, for example, by Kondo, et al., which took months to create a monobody library with 10^13 variants [13]. In addition, unlike the conventional PCR-based library creation methods that are at risk of introducing amplification bias and skewing the library diversity, our L&E method is PCR-free, giving it the potential to preserve more faithfully the DNA library diversity afforded by chemical synthesis.
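The yield figures above follow directly from the input and product copy numbers. The sketch below recomputes them and, as an additional illustration that is not part of the original analysis, estimates the average efficiency per junction under the simplifying assumption that all eight ligation junctions are equally efficient and independent (the elongation step is folded into that figure).

```python
AVOGADRO = 6.022e23  # molecules per mole


def pmol_to_molecules(pmol: float) -> float:
    """Convert picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO


def overall_yield(product_molecules: float, input_molecules: float) -> float:
    """Fraction of input sense oligos recovered as full-length product."""
    return product_molecules / input_molecules


def per_junction_efficiency(overall: float, n_junctions: int = 8) -> float:
    """Average per-junction efficiency, ASSUMING equal and independent junctions."""
    return overall ** (1.0 / n_junctions)


input_molecules = pmol_to_molecules(15)             # about 9e12 molecules
yields = [overall_yield(p, input_molecules) for p in (4.25e11, 1.07e12)]
per_junction = [per_junction_efficiency(y) for y in yields]

for y, e in zip(yields, per_junction):
    print(f"overall yield {y:.1%}, implied per-junction efficiency {e:.2f}")
```

Under this assumption, an overall yield of 4.7–11.8% corresponds to an average of roughly 0.68 to 0.77 per junction, which illustrates why fewer, longer sense oligos would be expected to raise the overall yield.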

Fig 3. L&E library quantification.

(A) qPCR amplification plot. Black lines represent the standard curve. Red lines represent the serially diluted library samples (1:50, 1:250, 1:1250). (B) Visualization of the final qPCR products on a 1% agarose gel stained with EtBr. Lanes 1–5 are from reactions containing 9 × 10^9 – 9 × 10^5 molecules of the standard DNA. Lanes 6–9 are from reactions containing 50-, 250- and 1250-fold diluted HARPin library samples. Red arrow indicates the desired amplicon (609 bp). (-): qPCR product from ligase-negative control reaction. NTC: No template control PCR. Middle lane contains GeneRuler 100 bp Plus DNA ladder (Thermo Fisher); the most intense band corresponds to 500 bp. (C) Starting quantity (SQ) calculation based on the qPCR standard curve. Values represent the copy number. Total SQ is the extrapolated total copy number of library in a final 150 μL double-stranded DNA synthesis reaction.

https://doi.org/10.1371/journal.pone.0276338.g003

We employed Next-Generation Sequencing (NGS) to assess the quality of this new library. To simulate the conditions of the library’s use in protein engineering, 500 nanograms of the in vitro transcription mRNA product (from Fig 1, Step 4) was reverse transcribed and then PCR amplified to append the NGS linkers. Approximately 750 nanograms of the PCR product was subjected to the MiSeq Next-Generation Sequencing reaction and yielded a total of 293,067 sequence reads (Fig 4A). After removing duplicate sequences and those of incorrect length, the cleaned dataset contained 162,422 unique reads (Fig 4A). The frequency of each base at the randomized codon positions ranges from 19.3% to 33.7% for the N nucleotides and from 38.4% to 61.1% for the K nucleotides, which is consistent with oligos synthesized with machine mixing (Fig 4B and S1 Fig). The average frequency of each amino acid at the randomized codon positions varies from 1.9% to 10.4%, which is consistent with that afforded by the NNK codon (Fig 4C and S2 Fig). Overall, 51.4% of the full-length reads contain at least one internal stop codon (Fig 4A). This is close to the value predicted for a gene with 20 randomized NNK codons: with a STOP codon frequency of 1/32 at each position (assuming an even distribution of each possible base at the N and K positions), the linear estimate of 1/32 × 20 positions gives 62.5%, and the exact probability of at least one STOP codon, 1 − (31/32)^20, is approximately 47%. The final dataset of full-length sequences contained 78,808 unique variants (Fig 4A). The abundance of unique full-length sequences combined with a lack of noteworthy codon bias validates the use of the L&E method for in vitro library creation.
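The NNK expectations quoted above can be checked by enumerating the 32 NNK codons against the standard genetic code. The sketch below is illustrative and assumes an even base distribution at every N and K position; it also compares the linear estimate of 1/32 × 20 with the probability of at least one STOP codon among 20 independent positions.

```python
from collections import Counter

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: N = A/C/G/T at positions 1 and 2, K = G/T at position 3.
nnk_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]
counts = Counter(CODON_TABLE[codon] for codon in nnk_codons)

# Expected frequency of each amino acid (and of STOP, shown as "*").
expected = {aa: n / len(nnk_codons) for aa, n in counts.items()}

n_positions = 20
p_stop = expected["*"]                                 # 1/32 per position
linear_estimate = p_stop * n_positions                 # expected STOPs per gene
p_at_least_one = 1 - (1 - p_stop) ** n_positions       # genes with >= 1 STOP

for aa, freq in sorted(expected.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {freq:.1%}")
print(f"linear estimate {linear_estimate:.1%}, at least one STOP {p_at_least_one:.1%}")
```

The linear estimate is the expected number of STOP codons per gene, so it overstates the fraction of genes carrying at least one; the two values bracket the 51.4% observed in the sequencing data.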

Fig 4. Next-Generation Sequencing results.

(A) Results of sequencing reads at each step of analysis. (B, C) Frequency analysis of the dataset containing de-duplicated, full-length sequences. (B) Average nucleotide frequency at each position of the NNK randomized codons. (C) Average frequency of each amino acid at the randomized residues. *: STOP codon. Expected amino acid frequency denotes the relative frequency of each amino acid encoded by the 32 possible NNK codons, assuming an even distribution of possible nucleotides at each N and K position; it was calculated by dividing the number of codons for each amino acid by the total number of possible codons encoded by NNK (i.e., 32). Error bars represent standard deviations of the means.

https://doi.org/10.1371/journal.pone.0276338.g004

In conclusion, we report a simple and efficient L&E method for synthetic library creation. A synthetic library with ≥10^11 variants was produced in a 150 μL reaction volume. The entire process, from the commercially available oligonucleotides to a DNA template ready for mRNA transcription, took 1.5 hours. Furthermore, since the in vitro reaction volume is linearly scalable, more extensive libraries can be easily accommodated without increasing the reaction time. Finally, although only the HARPin scaffold was used here, we believe that our method should be extendable to any synthetic library, making L&E an alternative library creation method that has the potential to significantly accelerate the in vitro protein engineering process.
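As a rough illustration of the linear scaling argument, the sketch below estimates the reaction volume needed to reach a target library size from the yields reported for the 150 μL reaction. It assumes that yield scales strictly linearly with volume and that every full-length molecule is a distinct variant; neither assumption was tested here beyond the reported scale, and the function name is hypothetical.

```python
def scaled_volume_ul(target_variants: float, yield_per_batch: float,
                     batch_volume_ul: float = 150.0) -> float:
    """Reaction volume (uL) needed for `target_variants`, ASSUMING linear scaling."""
    return target_variants / yield_per_batch * batch_volume_ul


# Reported range of full-length molecules per 150 uL reaction.
low_yield, high_yield = 4.25e11, 1.07e12

target = 1e13  # library size of the monobody library cited for comparison
worst_case_ul = scaled_volume_ul(target, low_yield)
best_case_ul = scaled_volume_ul(target, high_yield)
print(f"{best_case_ul / 1000:.1f}-{worst_case_ul / 1000:.1f} mL for {target:.0e} variants")
```

Under these assumptions a library of 10^13 variants would need on the order of 1.4 to 3.5 mL of L&E reaction.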

Materials and methods

Oligonucleotides

All oligonucleotides were synthesized by Integrated DNA Technologies (Coralville, IA; S1 Table) and PAGE purified in-house. Briefly, 5 μL of 100 μM stock of each oligo was loaded onto a 5% Urea-PAGE gel and visualized after staining with SYBR Green II nucleic acid stain (Thermo Fisher; Waltham, MA). The appropriate bands were excised, resuspended in 800 μL TE buffer (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA), subjected to three rounds of freeze-thaw to disrupt the gel, and incubated at 37°C for 24 hours with shaking. The supernatant was transferred to fresh tubes the next day, and the DNA was concentrated via ethanol precipitation.

Library preparation

The sense and splint oligos were divided into three groups. Group 1: NF1, NF2, NF3, SP1, and SP2; Group 2: NF4, NF5, NF6, SP4, and SP5; and Group 3: NF7, NF8, NF9, SP7, and SP8. The HARPin library was created in three steps (Fig 1C). In Step 1, each group of oligos (15 pmoles each) was mixed with 2 U of T4 DNA ligase (Lucigen; Madison, WI), Ribo-ATP (1 mM, Thermo Fisher), and PEG-8000 (7.5%, VWR; Wayne, PA) in 30 μL of 1x NEBuffer r2.1 (NEB; Ipswich, MA) and incubated at 37°C for 30 minutes. In Step 2, the three reactions from Step 1 were combined along with oligos SP3 and SP6 (15 pmoles each), 2 U of fresh T4 ligase, Ribo-ATP (final 1 mM), and PEG-8000 (final 7.5%) in 100 μL total reaction volume and incubated for an additional 30 minutes at 37°C to generate single-stranded DNA sense molecules spanning the entire HARPin gene and the T7 promoter. Aliquots of reactions from Steps 1 and 2 were visualized on Novex 15% TBE-Urea gels (Thermo Fisher) after staining with SYBR Green II nucleic acid stain (Thermo Fisher).

In Step 3, to generate the full-length double-stranded DNA, primer CR (15 pmoles) was added to the reaction mixture together with dNTPs (final 1.4 mM each, NEB); 15 μL of 10x isothermal amplification buffer (NEB); MgSO4 (final 8 mM total, NEB); and Bst 2.0 WarmStart polymerase (48 units, NEB) in a final reaction volume of 150 μL. The reaction was carried out at 65°C for 30 minutes, followed by heat inactivation at 80°C for 20 minutes. The double-stranded DNA molecules generated were visualized on 1% agarose gels stained with ethidium bromide (EtBr). GeneRuler 100 bp Plus ladder (Thermo Fisher) was used for comparison.

In vitro transcription

To demonstrate that only the full-length genes contain the antisense strand of the promoter recognized by the T7 RNA polymerase, 5 μL of the product from Step 3 was used directly for an mRNA transcription reaction (20 μL total volume) using the TranscriptAid T7 High Yield Transcription kit (Thermo Fisher). After a 1.5-hour incubation at 37°C, RNase-Free DNase I (2 U, NEB) and DNase I Buffer (NEB) were added to the mixture to remove the DNA template (50 μL final reaction volume). The transcription products were cleaned up using the RNA Clean & Concentrator-25 kit (Zymo Research; Irvine, CA). An aliquot of the synthesized mRNA was visualized on denaturing agarose gels (1.5% agarose and 1% bleach [21]) stained with EtBr. RiboRuler RNA Ladder, High Range (Thermo Scientific) was used for size comparison.

Library quantification

The amount of double-stranded full-length HARPin gene synthesized in Step 3 was quantified via qPCR using primers HARPin_F and HARPin_R (S1 Table). A standard curve was constructed using serially diluted full-length HARPin DNA generated by PCR amplification and purified by agarose gel electrophoresis. Standard curve quantification results are shown in S2 Table. Three different dilutions of the library samples (Step 3) were used in the qPCR reactions to improve the estimation accuracy. All qPCR reactions were carried out in triplicate using Forget-Me-Not EvaGreen qPCR master mix (Biotium; Fremont, CA) and a Bio-Rad CFX96 Real-Time PCR system (Bio-Rad; Hercules, CA). The results were analyzed with Bio-Rad CFX Manager software. All qPCR reactions were also analyzed on 1% agarose gels to ensure that the quantification matches the desired target DNA product.
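The standard-curve quantification can be sketched as a linear fit of Cq against log10 copy number, followed by back-calculation for the diluted library samples. The example below is not the CFX Manager algorithm; the Cq values are synthetic (generated from an assumed slope and intercept purely for illustration), and the template volume per qPCR reaction is a hypothetical parameter not given in the text.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = intercept + slope * log10(copies)."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(cq_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Starting quantity (copies) in the qPCR well for a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


def total_copies(cq, slope, intercept, dilution, template_ul, total_ul=150.0):
    """Extrapolate the copies in the well to the whole Step 3 reaction."""
    return copies_from_cq(cq, slope, intercept) * dilution * total_ul / template_ul


# Standards span 9e9 to 9e5 copies, as in Fig 3B.
standards = [9e9, 9e8, 9e7, 9e6, 9e5]
# SYNTHETIC Cq values from an assumed ideal curve, for illustration only.
ASSUMED_SLOPE, ASSUMED_INTERCEPT = -3.32, 40.0
synthetic_cq = [ASSUMED_INTERCEPT + ASSUMED_SLOPE * math.log10(c) for c in standards]

slope, intercept = fit_standard_curve(standards, synthetic_cq)
efficiency = 10 ** (-1.0 / slope) - 1   # amplification efficiency from the slope

# Hypothetical library sample: 1:50 dilution, 1 uL of diluted template per well.
sample_cq = ASSUMED_INTERCEPT + ASSUMED_SLOPE * math.log10(1e8)
estimate = total_copies(sample_cq, slope, intercept, dilution=50, template_ul=1.0)
print(f"slope {slope:.2f}, efficiency {efficiency:.0%}, total copies {estimate:.2e}")
```

Running several dilutions of the same sample through total_copies and comparing the results is one way to check that the extrapolation is consistent across the range of the standard curve.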

Library diversity analysis

The diversity of our library was analyzed using next generation sequencing (NGS). First, 500 ng of the mRNA library produced in Step 4 (Fig 1C) was reverse transcribed using the SuperScript II reverse transcription kit (Invitrogen) according to the manufacturer’s instructions, with oligo CR (S1 Table) serving as the primer. The resulting cDNA was then PCR amplified with Q5 High-Fidelity Polymerase (NEB) using primers NGS_F and NGS_R (S1 Table), which harbor the Illumina adapter sequences. The PCR product was then gel purified using the ZymoClean Gel DNA Recovery kit (Zymo Research) and subjected to Amplicon-EZ Next Generation Sequencing (Azenta Life Sciences; Chelmsford, MA). Paired reads were assembled using PEAR [22]. The dataset was processed (i.e., de-duplicated and filtered to only include full-length sequences) using Geneious Prime (v.2022.1.1, Biomatters Ltd; San Diego, CA). Frequency grid profiles were constructed using Unipro UGENE [23]. Sequences including stop codons were removed from the dataset using a Biopython script (https://github.com/milesroberts-123/extract-weird-proteins).
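The filtering logic described above (de-duplication, length filtering, and removal of sequences with stop codons) can be sketched in plain Python. This is not the Geneious Prime workflow or the cited Biopython script; it is a minimal stand-in that assumes merged reads are already trimmed to the coding region and in frame, and it uses toy sequences rather than real data.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def clean_reads(reads, expected_length: int):
    """Apply the three filters in order and report the set kept at each stage."""
    unique = set(reads)                                        # de-duplicate
    full_length = {r for r in unique if len(r) == expected_length}
    no_stop = {r for r in full_length if "*" not in translate(r)}
    return unique, full_length, no_stop


# Toy reads (hypothetical): one duplicate, one with a TAG stop, one truncated.
toy_reads = ["ATGGCTTGG", "ATGGCTTGG", "ATGTAGTGG", "ATGGCT"]
unique, full_length, no_stop = clean_reads(toy_reads, expected_length=9)
print(len(toy_reads), len(unique), len(full_length), len(no_stop))
```

The counts printed at each stage correspond to the rows reported in Fig 4A: total reads, unique reads, full-length reads, and full-length reads without internal stop codons.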

Supporting information

S1 Fig. Nucleotide frequency analysis.

Nucleotide frequency at each position of the sequenced amplicon, including reference sequence. Nucleotide numbers refer to position within the complete HARPin gene. Randomized codons are depicted in red. Analysis performed on dataset containing de-duplicated, full-length sequences.

https://doi.org/10.1371/journal.pone.0276338.s001

(PDF)

S2 Fig. Amino acid frequency analysis.

Frequency of each amino acid at each position of the sequenced amplicon, including reference sequence. Residue numbers refer to position within the complete HARPin gene. Randomized residues are depicted in red. Analysis performed on dataset containing de-duplicated, full-length sequences.

https://doi.org/10.1371/journal.pone.0276338.s002

(PDF)

Acknowledgments

Portions of this research were conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing. Graphical abstract created with BioRender.com. We want to thank Benjamin Thomas for helpful discussion during the NGS data analysis.

References

  1. Wang Y, Xue P, Cao M, Yu T, Lane ST, Zhao H. Directed Evolution: Methodologies and Applications. Chem Rev. 2021;121(20):12384–444. pmid:34297541.
  2. McLure RJ, Radford SE, Brockwell DJ. High-throughput directed evolution: a golden era for protein science. Trends Chem. 2022;4(5):378–91. WOS:000793458600005.
  3. Baker M. Protein engineering: navigating between chance and reason. Nat Methods. 2011;8(8):623–6. Epub 20110728. pmid:21799494.
  4. Bornscheuer U, Kazlauskas RJ. Survey of protein engineering strategies. Curr Protoc Protein Sci. 2011;Chapter 26:Unit26 7. pmid:22045562.
  5. Cadwell RC, Joyce GF. Randomization of genes by PCR mutagenesis. PCR Methods Appl. 1992;2(1):28–33. pmid:1490172.
  6. Keefe AD, Szostak JW. Functional proteins from a random-sequence library. Nature. 2001;410(6829):715–8. pmid:11287961; PubMed Central PMCID: PMC4476321.
  7. Simeon R, Chen Z. In vitro-engineered non-antibody protein therapeutics. Protein Cell. 2018;9(1):3–14. Epub 20170307. pmid:28271446; PubMed Central PMCID: PMC5777970.
  8. Stemmer WP. Rapid evolution of a protein in vitro by DNA shuffling. Nature. 1994;370(6488):389–91. pmid:8047147.
  9. Tee KL, Wong TS. Polishing the craft of genetic diversity creation in directed evolution. Biotechnol Adv. 2013;31(8):1707–21. Epub 20130906. pmid:24012599.
  10. Roberts RW, Szostak JW. RNA-peptide fusions for the in vitro selection of peptides and proteins. Proc Natl Acad Sci U S A. 1997;94(23):12297–302. pmid:9356443; PubMed Central PMCID: PMC24913.
  11. Hanes J, Pluckthun A. In vitro selection and evolution of functional proteins by using ribosome display. Proc Natl Acad Sci U S A. 1997;94(10):4937–42. pmid:9144168; PubMed Central PMCID: PMC24609.
  12. Olson CA, Roberts RW. Design, expression, and stability of a diverse protein library based on the human fibronectin type III domain. Protein Sci. 2007;16(3):476–84. pmid:17322532; PubMed Central PMCID: PMC2203324.
  13. Kondo T, Iwatani Y, Matsuoka K, Fujino T, Umemoto S, Yokomaku Y, et al. Antibody-like proteins that capture and neutralize SARS-CoV-2. Sci Adv. 2020;6(42). Epub 20201014. pmid:32948512; PubMed Central PMCID: PMC7556756.
  14. Kondo T, Eguchi M, Tsuzuki N, Murata N, Fujino T, Hayashi G, et al. Construction of a Highly Diverse mRNA Library for in vitro Selection of Monobodies. Bio Protoc. 2021;11(16):e4125. Epub 20210820. pmid:34541043; PubMed Central PMCID: PMC8413553.
  15. Gao J, Xu C. Structural basis for the recognition of RFX7 by ANKRA2 and RFXANK. Biochem Biophys Res Commun. 2020;523(1):263–6. Epub 20191219. pmid:31864703.
  16. Xu C, Jin J, Bian C, Lam R, Tian R, Weist R, et al. Sequence-specific recognition of a PxLPxI/L motif by an ankyrin repeat tumbler lock. Sci Signal. 2012;5(226):ra39. Epub 20120529. pmid:22649097.
  17. Pluckthun A. Designed ankyrin repeat proteins (DARPins): binding proteins for research, diagnostics, and therapy. Annu Rev Pharmacol Toxicol. 2015;55:489–511. Epub 20150107. pmid:25562645.
  18. Binz HK, Stumpp MT, Forrer P, Amstutz P, Pluckthun A. Designing repeat proteins: well-expressed, soluble and stable proteins from combinatorial libraries of consensus ankyrin repeat proteins. J Mol Biol. 2003;332(2):489–503. pmid:12948497.
  19. Chen HB, Weng JM, Jiang K, Bao JS. A new method for the synthesis of a structural gene. Nucleic Acids Res. 1990;18(4):871–8. Epub 19900225. pmid:2179872; PubMed Central PMCID: PMC330339.
  20. Sui Z, An R, Komiyama M, Liang X. Stepwise Strategy for One-Pot Synthesis of Single-Stranded DNA Rings from Multiple Short Fragments. Chembiochem. 2021;22(6):1005–11. Epub 20201031. pmid:33124728.
  21. Aranda PS, LaJoie DM, Jorcyk CL. Bleach gel: a simple agarose gel for analyzing RNA quality. Electrophoresis. 2012;33(2):366–9. pmid:22222980; PubMed Central PMCID: PMC3699176.
  22. Zhang J, Kobert K, Flouri T, Stamatakis A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics. 2014;30(5):614–20. Epub 20131018. pmid:24142950; PubMed Central PMCID: PMC3933873.
  23. Okonechnikov K, Golosova O, Fursov M, team U. Unipro UGENE: a unified bioinformatics toolkit. Bioinformatics. 2012;28(8):1166–7. Epub 20120224. pmid:22368248.