Deep Sequencing of RNA from Three Different Extracellular Vesicle (EV) Subtypes Released from the Human LIM1863 Colon Cancer Cell Line Uncovers Distinct miRNA-Enrichment Signatures

Secreted microRNAs (miRNAs) enclosed within extracellular vesicles (EVs) play a pivotal role in intercellular communication by regulating recipient cell gene expression and affecting target cell function. Here, we report the isolation of three distinct EV subtypes from the human colon carcinoma cell line LIM1863 – shed microvesicles (sMVs) and two exosome populations (immunoaffinity isolated A33-exosomes and EpCAM-exosomes). Deep sequencing of miRNA libraries prepared from parental LIM1863 cells/derived EV subtype RNA yielded 254 miRNA identifications, of which 63 are selectively enriched in the EVs - miR-19a/b-3p, miR-378a/c/d, and miR-577 and members of the let-7 and miR-8 families being the most prominent. Let-7a-3p*, let-7f-1-3p*, miR-451a, miR-574-5p*, miR-4454 and miR-7641 are common to all EV subtypes, and 6 miRNAs (miR-320a/b/c/d, miR-221-3p, and miR-200c-3p) discern LIM1863 exosomes from sMVs; miR-98-5p was selectively represented only in sMVs. Notably, A33-Exos contained the largest number (32) of exclusively-enriched miRNAs; 14 of these miRNAs have not been reported in the context of CRC tissue/biofluid analyses and warrant further examination as potential diagnostic markers of CRC. Surprisingly, miRNA passenger strands (star miRNAs) for miR-3613-3p*, -362-3p*, -625-3p*, -6842-3p* were the dominant strand in A33-Exos, the converse to that observed in parental cells. This finding suggests miRNA biogenesis may be interlinked with endosomal/exosomal processing.


Introduction
Extracellular vesicles (EVs) are nano-membranous particles ranging from 30-2,000 nm in diameter that are released from most cell types into the extracellular environment [1]. EVs are thought to comprise three main classes depending on their origin: exosomes (Exos, 50-150 nm), shed microvesicles (sMVs, 400-1,500 nm), and apoptotic bodies (400-2,500 nm). Although the nomenclature, biogenesis, and biochemical and functional properties of EV subtypes remain under debate, the available evidence suggests that exosomes originate by the inward budding of endosomal compartments called multivesicular bodies (MVBs) and are released from the cell into the microenvironment following fusion of MVBs with the plasma membrane, that sMVs (also termed ectosomes, microvesicles, microparticles or oncosomes) arise by outward budding/blebbing from the plasma membrane, and that apoptotic bodies arise through the process of apoptosis/cell shrinkage/nuclear fragmentation [2]. At both functional and biochemical levels, exosomes have been the most widely studied of the EVs. Exosomes have been shown to contain diverse proteins (including oncoproteins, tumour suppressor proteins, transcriptional regulators and splicing factors) [1,3,4,5,6], lipids [7], and RNAs (mRNAs, microRNAs (miRNAs) and other non-coding RNAs) [8]; exosomal molecular cargo information can be accessed through publicly accessible databases such as ExoCarta [9] and EVPedia [10]. Although exosomes were long regarded as cellular debris, recent studies demonstrate that they have important biological roles in the immune, cardiovascular, and nervous systems and in the pathogenesis of diseases such as cancer [11,12,13]. In the last decade it has been established that EVs play a pivotal role in cancer progression and pre-metastatic niche priming for tumour engraftment [14,15,16,17].
It is well recognized that the tumour microenvironment plays a critical role in cancer initiation, progression and metastasis [18]. Intercellular communication between tumour and stroma can be mediated by soluble factors, including cytokines, chemokines, and growth factors [19]. An emerging concept is that tumour-stroma interactions can also involve the direct exchange of genetic information, mainly in the form of miRNAs, a class of non-coding RNAs (18-25 nucleotides in length) that regulate the expression of multiple target genes by binding to their encoded mRNAs [13,20,21]. This transfer of genetic material can occur when EVs containing miRNA cargo are released by a donor cell into the extracellular environment and are functionally transferred to recipient cells. Transferred miRNAs can be functional both in vitro [8,22,23,24,25] and in vivo [26,27,28]. Studies have begun to examine the association of miRNA-related polymorphisms with cancer incidence and prognosis, the potential of circulating or faecal miRNA expression as a non-invasive early detection biomarker for colorectal cancer [29], and the utility of miRNAs in recurrence, metastasis and therapeutic outcomes [30]. A major advance in our understanding of exosomal miRNA biology was the finding that sumoylated hnRNPA2B1 directs the loading of certain miRNAs into exosomes through recognition of specific short motifs present in miRNAs [31].
Recently, we described the isolation of two populations of exosomes as well as sMVs from the same human colon carcinoma cell line, LIM1863 [4]. The sMVs were prepared by differential centrifugation and the exosomes purified by sequential immunocapture using anti-A33- and anti-EpCAM-coupled magnetic beads. While the exosome populations (A33-Exos and EpCAM-Exos) could not be distinguished using electron microscopy, buoyant density or stereotypical exosome markers (TSG101, Alix and HSP70), protein typing using GeLC-MS/MS [3,6] revealed that their protein compositions were quite distinct: EpCAM-Exos contained classical apical trafficking components, whereas A33-Exos were enriched with basolateral trafficking molecules. The proteome profiles of both exosome populations, in turn, were quite distinct from the profile initially reported for sMVs released into the same culture medium. In order to further define these EV subtypes, we investigated their molecular composition using another omics approach, RNA typing.
In this study, we show using deep sequencing that a total of 254 miRNAs were identified in the four miRNA libraries prepared (A33-Exos, EpCAM-Exos, sMVs and parental LIM1863 cells), of which 63 are highly enriched in EVs. The three LIM1863-derived EV subtypes are enriched with specific miRNA signatures when compared with the parental LIM1863 cell line. In particular, we report that 32, 2 and 4 miRNAs are exclusively enriched in A33-Exos, EpCAM-Exos, and sMVs, respectively, some of which enable exosomes to be distinguished from sMVs. Of the 32 miRNAs selectively enriched in A33-Exos, 13 have not been previously implicated in colorectal cancer (CRC), and we discuss how this information might be used for CRC diagnostics. A notable finding in our study was the enrichment of 'passenger strand' miRNA (miRNA star) sequences in EVs compared to parental LIM1863 cells.

Materials and Methods
Cell culture and isolation of extracellular vesicles
LIM1863 cells [32] were initially cultured to ~80% confluence in a 175-cm² flask in RPMI-1640 medium (Invitrogen, Carlsbad, CA) supplemented with 5% foetal calf serum (FCS), 0.1% insulin-transferrin-selenium (ITS, Invitrogen), 100 U/ml penicillin and 100 µg/ml streptomycin at 37 °C and 5% CO2. LIM1863 cells (~3 × 10⁷ cells) were harvested (140 g, 3 min), suspended in 15 ml phenol red-free RPMI-1640 medium (containing 0.5% ITS, 100 U/ml penicillin and 100 µg/ml streptomycin) and transferred into the Cultivation chamber of a CELLine CL-1000 Bioreactor classic flask (Integra Biosciences); the Nutrient Supply chamber contained 500 ml of RPMI-1640 supplemented with 5% FCS, 100 U/ml penicillin and 100 µg/ml streptomycin. Cells were cultured at 37 °C in a 5% CO2 atmosphere. Culture medium in the Nutrient Supply chamber was replaced twice a week and the cell suspension from the Cultivation chamber was harvested every 48 h. After each collection, the cell suspension was centrifuged at 140 g for 3 min to sediment LIM1863 cell organoids, which were resuspended in 15 ml of cultivation medium and re-seeded into the Cultivation chamber. The supernatant was centrifuged at 2,000 g for 10 min to remove floating cells/cell debris and then centrifuged further at 10,000 g for 30 min at 4 °C to collect shed microvesicles (sMVs). The resulting supernatant was further centrifuged (100,000 g, 1 h) to collect crude exosomes. Crude exosomes were fractionated into two distinct exosome subpopulations (A33-Exos and EpCAM-Exos) by sequential immunocapture using Dynabeads (Invitrogen) loaded with anti-human-A33 monoclonal antibodies [33] in tandem with anti-EpCAM (CD326) antibody-bound magnetic microbeads (Miltenyi Biotec), as described [4].

Transmission electron microscopy (TEM)
A33-Exos and EpCAM-Exos were eluted from their respective magnetic beads with 0.2 M glycine, pH 2.5, and harvested by centrifugation (100,000 g, 1 h). For TEM, samples (sMVs, A33- and EpCAM-Exos, 1 µg/10 µl PBS) were applied for 2 min to 400 mesh copper grids coated with a thin layer of carbon. Excess material was removed by blotting with filter paper, and samples were negatively stained twice with 10 µl of a 2% uranyl acetate solution for 10 min (ProSciTech, Queensland, Australia). Grids were air dried and imaged using a JEOL JEM-2010 transmission electron microscope operated at 80 kV.

Total RNA isolation
Total RNA from LIM1863 cells, sMVs, A33- and EpCAM-Exos was isolated with TRIzol (Life Technologies), according to the manufacturer's instructions. Briefly, samples were lysed in 1 ml TRIzol Reagent by repetitive pipetting for 5 min at room temperature (RT). Chloroform (0.2 ml/ml TRIzol Reagent) was added to solubilized samples and the mixtures were vortexed vigorously for 15 s, incubated at RT for 2-3 min and then centrifuged (12,000 g, 15 min, 4 °C). The aqueous phase was collected, mixed with 5 µg of glycogen (20 mg/ml aqueous glycogen, Invitrogen) and isopropyl alcohol (0.5 ml isopropyl alcohol/1 ml aqueous phase) and incubated for 10 min at RT. Total RNA was recovered by centrifugation at 12,000 g for 10 min at 4 °C. Resultant RNA pellets were washed once with 75% aqueous ethanol, air-dried for 5 min and re-dissolved in RNase-free water. The quantity, quality and composition of RNA samples were evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, USA).

Small RNA library construction and sequencing
Four small RNA libraries (from parental LIM1863 cells and derived sMVs, A33-, and EpCAM-Exos) were constructed and sequenced with Illumina TruSeq deep sequencing technology (Sample Preparation Guide, Part #15004197 Rev.A, Illumina, San Diego, CA). Briefly, total RNA samples were fractionated on a 15% Tris-borate-EDTA (TBE) polyacrylamide gel (Invitrogen), bands corresponding to small RNAs (18-30 nt) were excised, and small RNAs were extracted by centrifugation. After ligation of 5′ (5′-GUUCAGAGUUCUACAGUCCGACGAUC-3′) and 3′ (5′-UGGAAUUCUCGGGUGCCAAGG-3′) adaptors, small RNA molecules were reverse transcribed into cDNA, then amplified using the adaptor primers for 14 cycles, and the fragments (~150 bp) were isolated from a 6% TBE PAGE gel. The purified cDNA was directly used for cluster generation and sequenced using an Illumina HiSeq 2000 platform. Image files generated by the sequencer were processed to produce digital-quality data (raw FASTQ files). FASTQ files for all four small RNA libraries (LIM1863 cells, and derived sMVs, A33-Exos and EpCAM-Exos) have been submitted to the Sequence Read Archive (SRA) of NCBI under the accession number SRA106214.

Quantitative real-time PCR
Total RNA (3 µl containing 12 ng RNA/µl) prepared from LIM1863 cells and derived EVs was reverse transcribed using a TaqMan miRNA Reverse Transcription (RT) Kit from Applied Biosystems/Life Technologies with Megaplex RT Primers (Human Pool A and Pool B, Applied Biosystems). RT reaction conditions, based on the manufacturer's instructions, were: 40 cycles of 16 °C for 2 min, 42 °C for 1 min and 50 °C for 1 s. Resultant Megaplex RT products (2.5 µl) were then mixed with 22.5 µl of Megaplex PreAmp Reaction Mix containing 2.5 µl Megaplex PreAmp Primers Pool A and B (Applied Biosystems). Preamplification cycling conditions were: 95 °C for 10 min, 55 °C for 2 min, 75 °C for 2 min followed by 12 cycles of 95 °C for 15 s and 60 °C for 4 min. After diluting the pre-amplified cDNAs (25 µl) with 75 µl of Tris-EDTA buffer (1 mM Tris buffer containing 0.1 mM EDTA, pH 8.0), 15 µl of diluted cDNA product was mixed with 450 µl of TaqMan Universal PCR Master Mix (Applied Biosystems) and 435 µl nuclease-free water. To validate the deep sequencing data, the mixture was subjected to quantitative real-time PCR (qRT-PCR) analysis using TaqMan low-density array (TLDA) cards (set v3.0) representing a total of 754 assays specific to human miRNAs (present in Sanger miRBase v14). qRT-PCR was performed on a 7900 HT Thermocycler (Applied Biosystems) using the manufacturer's recommended cycling conditions: 50 °C for 2 min, 95 °C for 10 min followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. qRT-PCR data were collected at the end of each cycle. Cycle threshold (Ct) values were calculated using the SDS software v2.4; automatic baseline settings were assigned a minimum Ct threshold value of 0.2. Ct values >35 were considered to be below the assay detection level and excluded from data analysis. Data were analysed using the ΔΔCt method with LIM1863 cell RNA as the reference and global normalization, with U6 snRNA and MammU6 as candidate controls, using ExpressionSuite Software v1.0.3 (Applied Biosystems).
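The comparative Ct calculation named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ExpressionSuite implementation: the function name, the use of a single reference Ct per sample, and the example Ct values in the usage note are hypothetical; only the Ct > 35 exclusion rule and the choice of LIM1863 cell RNA as calibrator come from the text.

```python
from typing import Optional

# Ct values above this limit are treated as undetected (cut-off taken from the text).
CT_DETECTION_LIMIT = 35.0


def relative_quantity(ct_target_sample: float,
                      ct_ref_sample: float,
                      ct_target_calibrator: float,
                      ct_ref_calibrator: float) -> Optional[float]:
    """Return 2^-ddCt for one miRNA in an EV sample relative to the calibrator
    (here assumed to be parental LIM1863 cell RNA).

    Returns None when the target Ct is above the detection limit in either sample.
    """
    if (ct_target_sample > CT_DETECTION_LIMIT
            or ct_target_calibrator > CT_DETECTION_LIMIT):
        return None
    d_ct_sample = ct_target_sample - ct_ref_sample              # dCt in the EV sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt in the cells
    dd_ct = d_ct_sample - d_ct_calibrator                       # ddCt
    return 2.0 ** (-dd_ct)
```

For example, with hypothetical Ct values of 24 (target) and 20 (reference) in an EV sample and 26 and 20 in the cells, ΔΔCt is -2 and the relative quantity is 4, meaning a four-fold higher representation in the EV sample.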

Bioinformatic analyses
A bioinformatics pipeline developed at BGI-Shenzhen was employed to identify miRNAs and other small RNA categories. Briefly, low-quality reads (small RNAs that contain the base "N" (undefined by the sequencer), or more than 6 bases with quality lower than 13, or more than 4 bases with quality lower than 10), adaptors and reads smaller than 18 nt were excluded from the raw data to generate clean reads (18-30 nt). Clean reads were aligned to miRBase (v20, http://www.mirbase.org) using BLAST software from NCBI to identify known miRNAs and generate expression profiles. Fold changes in expression levels (sample group versus control) were calculated for each miRNA as log2 ratios using normalized TPM (transcripts per million reads) values according to the formula: Fold change = log2 (sample group/control). The P-value for each miRNA in each pairwise comparison was calculated using the Poisson test [34]. Clean reads were aligned to the Human Reference Genome (hg19, http://hgdownload.soe.ucsc.edu/) using SOAP2 [35] to classify repeat-associated small RNAs and mRNA degradation fragments. rRNA, tRNA, snRNA, snoRNA and srpRNA were identified by mapping clean reads to the GenBank (NCBI, http://www.ncbi.nlm.nih.gov/) and Rfam (http://rfam.sanger.ac.uk/) databases.
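The read-filtering rules and the fold-change formula above can be sketched as follows. This is an illustrative sketch, not the BGI in-house pipeline: the function names are hypothetical, adaptor trimming is assumed to have been done already, the TPM denominator is assumed to be the total clean-read count of the library, and the small floor used to avoid division by zero is an assumption that the text does not specify.

```python
import math
from typing import Sequence


def is_clean_read(seq: str, quals: Sequence[int],
                  min_len: int = 18, max_len: int = 30) -> bool:
    """Apply the low-quality and length filters described in the text
    to one adaptor-trimmed read with per-base quality scores."""
    if "N" in seq.upper():
        return False                          # undefined base call
    if sum(q < 13 for q in quals) > 6:
        return False                          # more than 6 bases with quality < 13
    if sum(q < 10 for q in quals) > 4:
        return False                          # more than 4 bases with quality < 10
    return min_len <= len(seq) <= max_len     # keep 18-30 nt reads only


def tpm(read_count: int, total_clean_reads: int) -> float:
    """Normalize a miRNA read count to transcripts per million reads."""
    return read_count / total_clean_reads * 1_000_000


def log2_fold_change(sample_tpm: float, control_tpm: float,
                     floor: float = 0.01) -> float:
    """Fold change = log2(sample group / control); `floor` (an assumption)
    replaces zero values so that the ratio stays defined."""
    return math.log2(max(sample_tpm, floor) / max(control_tpm, floor))
```

Under these assumptions, a miRNA at 40 TPM in an EV library and 10 TPM in the cell library has a log2 fold change of 2, that is, a four-fold enrichment.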

Gene Ontology and KEGG Pathway Analysis
TargetScan 6.0 (http://www.targetscan.org) was employed to predict target genes for the miRNA candidates. The potential functions of miRNA target genes were annotated by Gene Ontology and KEGG pathway database. WEGO method [36] was used to present significant GO terms. Two statistical values, P-value (Fisher's exact test) and q-value [37], were calculated to obtain pathways that were significantly enriched and control the false discovery rate.
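As a rough illustration of the enrichment statistics mentioned above, the sketch below computes a one-sided Fisher's exact test P-value for a single pathway (equivalent to a hypergeometric upper tail) and then adjusts a list of P-values. Note the substitution: the Benjamini-Hochberg procedure is used here as a simple stand-in for the q-value method of [37], and all function and parameter names are hypothetical.

```python
from math import comb
from typing import List


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    N: annotated background genes; K: background genes in the pathway;
    n: predicted target genes; k: target genes in the pathway.
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values (stand-in for the q-value of [37])."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A pathway would then be reported as significantly enriched when its adjusted value falls below a chosen false discovery rate; the threshold itself is not stated in the text.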

Presence of RNA in EVs released from the human colon carcinoma cell line LIM1863
Previously, we reported that LIM1863 cells release two EV subtypes, exosomes and shed microvesicles [4], and that within the exosome subtype there are two distinct exosome populations, one enriched for apical surface sorting proteins (EpCAM-Exos), the other for basolateral surface sorting proteins (A33-Exos); all three EV populations have distinct proteome profiles [4]. To assess whether RNAs are specifically sorted into EVs, and whether the repertoires of miRNAs (miRs) in the three populations we isolated differ, we embarked on a large-scale purification of EVs from LIM1863 culture medium (CM). To generate enough EVs for total RNA analysis we employed a continuous cell culture approach using CELLine CL-1000 Bioreactor flasks to generate ~1200 mL LIM1863 CM. sMVs were purified using differential centrifugation and A33-Exos and EpCAM-Exos by sequential immunoaffinity capture (see Figure 1A). The exosome populations (A33-Exos and EpCAM-Exos) could not be distinguished by electron microscopy (50-150 nm diameter), whereas the sMVs were more heterogeneous in size (100-1,500 nm diameter) and consistent with the known morphology (Figure 1A-D); all three EV subpopulations contained stereotypical exosome markers (TSG101, Alix, CD9) (Figure 1E). This approach yielded ~20 mg of sMVs and ~3 mg of A33- and EpCAM-Exos. To determine whether LIM1863 cell EVs contain RNA, total RNA, including the small RNA fraction, was extracted from purified EVs using standard RNA extraction methodology. The quality and quantity of the isolated RNA were determined using an Agilent 2100 Bioanalyzer (Figure 1F). The RNA yields were: A33-Exos, 8.8 mg RNA/~3 mg protein; EpCAM-Exos, 9.2 mg RNA/~3 mg protein; and sMVs, 72 mg RNA/~20 mg protein. Total RNA Bioanalyzer profiles indicated that LIM1863 cell-derived sMVs contained 18S and 28S ribosomal RNA (rRNA), whereas A33- and EpCAM-Exos lacked detectable amounts of these RNA species, in agreement with findings for exosomes derived from other cell lines [22,38,39,40].
A snapshot of small RNA sequencing data
To characterize small RNAs in LIM1863-derived EVs, Illumina HiSeq 2000 high-throughput technology was employed to sequence four small RNA libraries (LIM1863 cells (CL), sMVs, A33-Exos and EpCAM-Exos). Initially, 20330356, 25388242, 22512338, and 24096270 raw reads were produced. After removing low-quality reads, adaptor sequences and reads shorter than 18 nt (BGI in-house software), 18850584, 22762038, 16407260 and 18195289 total clean reads, respectively, were obtained. We next mapped all clean reads to miRBase (v.20) to annotate known miRNAs in each library. The results showed 15367876, 152815949, 12771308, and 13611284 annotated clean reads corresponding to CL, sMVs, A33-Exos, and EpCAM-Exos, respectively; clean reads identified for other small RNA categories (rRNA, tRNA, snRNA, snoRNA, srpRNA, repeat-associated RNAs, mRNA degradation) and unannotated RNAs are shown in Table 1. The percentage of miRNAs in the total RNA isolated from each sample corresponded to 77.84, 74.81, 67.14 and 81.52 for A33-Exos, EpCAM-Exos, sMVs, and CL, respectively.
We next examined the four LIM1863 cell-derived miRNA libraries to ascertain how many of the 2578 known miRNAs in miRBase v20 were detectable. Without a cut-off, 891, 863, 770, and 759 miRNAs were represented in CL, sMVs, A33-Exos and EpCAM-Exos, respectively. However, for this study, we decided to use a more stringent threshold (>5 TPM cut-off) to allow us to focus on highly represented miRNAs. This resulted in a total of 254 miRNAs for further analysis (Table S1), including hierarchical clustering of expression levels (Figure S1).
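The reported miRNA percentages can be checked directly from the read counts, as in the short sketch below. It assumes that each percentage is the number of miRBase-annotated clean reads divided by the total clean reads of the same library; the names are hypothetical. The sMV library is left out because its annotated count as printed is larger than its clean-read total.

```python
# (total clean reads, miRBase-annotated clean reads), as printed in the text.
READ_COUNTS = {
    "CL": (18850584, 15367876),
    "A33-Exos": (16407260, 12771308),
    "EpCAM-Exos": (18195289, 13611284),
}


def mirna_read_percentage(clean_reads: int, annotated_reads: int) -> float:
    """Percentage of clean reads annotated as known miRNAs."""
    return 100.0 * annotated_reads / clean_reads


percentages = {
    library: round(mirna_read_percentage(clean, annotated), 2)
    for library, (clean, annotated) in READ_COUNTS.items()
}
```

Under this assumption the computed values are 81.52 for CL, 77.84 for A33-Exos and 74.81 for EpCAM-Exos, matching the percentages reported for these three libraries.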

LIM1863-derived EVs contain 254 distinct miRNAs
An inspection of Table S1 shows that the 254 miRNAs are represented in all four libraries, albeit at varying levels of enrichment. Significantly, more than 75% of these 254 miRNAs are highly represented in each library (>10 TPM) (Figure 2A), and the top 20 miRNAs in the A33-Exos, EpCAM-Exos, sMVs and CL libraries represent 91.02%, 90.72%, 91.02%, and 91.42% of the corresponding total miRNA reads. Interestingly, the three most highly represented miRNAs identified in LIM1863-derived EVs (miR-192-5p, miR-10a-5p, and miR-191-5p) have been reported previously in tissue and serum of CRC patients as potential diagnostic biomarkers. For example, miR-192 has been observed in tissue [41] and serum/plasma [42] from CRC patients, miR-191 in tissue [41,43] and serum/plasma [44], and miR-10a in tissue [45] and serum/plasma [42] from CRC patients. Moreover, miR-192 is reported to suppress metastasis of CRC [46], and its synthesis, along with that of miR-215 (also highly represented in our 254 miRNA dataset), is induced by p53 and shown to play an important role in regulating genes involved in the TGF-β signalling pathway [47,48].

miRNAs are preferentially enriched in LIM1863-derived EVs
To assess whether some miRNAs are specifically sorted into EVs, we conducted a miRNA-enrichment analysis for A33-Exos, EpCAM-Exos and sMVs. miRNAs with <2-fold changes relative to CL miRNAs were filtered out, yielding 63 miRNAs for comparison (Table 2). miRNA representation was most prominent in purified A33-Exos (56 miRNAs represented, of which 32 are selectively enriched compared to other EVs), followed by EpCAM-Exos (25 miRNAs, 2 selectively enriched) and sMVs (13 miRNAs, 4 selectively enriched) (Table 2, Figure 2D). There are only 6 miRNA sequences common to all three EV subtypes (miR-451a, miR-4454, miR-7641, let-7a-3p*, let-7f-1-3p* and miR-574-5p*), including three 'passenger strand' miRNAs (miRNA* sequences). To date, only miR-451a has previously been observed in EVs (embryonic stem cell-derived EVs [53]). The significance of three miRNA* sequences being common to all three EV subtypes is not clear at this stage and must await analysis of a statistically significant number of EV samples derived from other CRC cell line sources. Overall, in the enriched 63 miRNA dataset we observe 12 miRNA* sequences, three of which (let-7a-3p*, let-7f-1-3p*, miR-574-5p*) are highly represented in all EV subpopulations (Table 2). We next examined the 63 miRNA dataset (Table 2) to ascertain whether any miRNAs enabled distinction between exosomes (A33-Exos and EpCAM-Exos) and sMVs. This analysis revealed 7 miRNAs significantly enriched in exosomes compared to sMVs (hsa-miR-320a, -320b, -320c, -320d, -221-3p, -374-5p, and -200c-3p). Strikingly, miRs-320a/b, -221-3p and -200c-3p had more than 1000 TPM in each exosome library, with 1.18-2.28 log2 ratio fold changes, compared to corresponding values in the range 500-800 TPM (-0.33 to 1.88 log2 fold changes) in the sMV and CL libraries.
Table 1. Summary of small RNA sequencing of LIM1863 cell and extracellular vesicles.
MiR-320a is implicated in CRC due to its ability to suppress cell proliferation by targeting β-catenin [54]. The expression of miR-320a can be used to evaluate the risk of CRC metastasis due to its ability to bind directly to the 3′-UTR of neuropilin-1 (NRP-1), a co-receptor of vascular endothelial growth factor [55]. The presence of miR-320a in plasma has been reported as a potential biomarker for the early detection of CRC [44]. miR-221-3p, along with miR-222, which we also see in the highly expressed 254 miRNA dataset (Table S1), has been validated experimentally to regulate cell proliferation by targeting p27/Kip1, a cell cycle inhibitor and tumour suppressor, to promote tumourigenesis [56,57]. Interestingly, elevated levels of miR-222, along with miR-17-3p, -135b, -92 and -95, have been reported in CRC patient plasma and tumour tissue [58]. miR-200c, which along with miR-141 is a member of the miR-141~200c cluster (cluster 59, Table S2) as well as the miR-8 family (Table S3), targets the transcriptional repressors zinc-finger E-box binding homeobox 1/2 (ZEB1/2) [59] and SIP1 [60], critical inducers of EMT in several cancer types, including colorectal [61]. Circulating miR-141 is a potential biomarker for CRC metastasis [62].
A salient feature of the 63 miRNAs (TPM >5) enriched in LIM1863-derived EVs (Table 2) was the observation that 38 were exclusively represented in one or other of the three EV libraries, relative to the CL library. Foremost was the finding that 32/38 identified miRNAs were exclusive to A33-Exos. Of these, miRs-19a-3p, -19b-3p, the miR-378 family (miRs-378a-3p, -378c and -378d), -107 and miR-320a/b predominate. Although miRs-320a/b are also enriched in EpCAM-Exos, they are more enriched in A33-Exos, especially miR-320a (TPM value 3801.67, 2.28 log2 ratio fold change relative to CL representation). miR-107 is also exclusively represented in A33-Exos, albeit to a lesser extent than miRs-19a/b, -378a/b/c/d and -320a/b.
miRs-19a/b are key oncogenic miRNAs from the miR-17~92a cluster [63]; they are implicated in various cancers and reported to regulate gene expression levels in important cancer pathways and immune modulatory systems [64]. It is thought that miRs-19a/b are induced by Myc, and regulate cell survival by targeting the expression of PTEN [20], Bcl2L11, Prkaa1 and PP2A [64]; both miRs-19a/b have been observed in CRC patient tissue and blood and reported to be potential biomarkers for this disease [42,65]. miRs-378a/b/c/d are reported to influence cell survival, tumour growth and angiogenesis by targeting the expression of SuFu and Fus-1 [66]. miRs-320c/d are reported to inhibit cell proliferation by targeting the transferrin receptor 1 (CD71) [67]; in the case of CRC they are reported to suppress proliferation by targeting β-catenin and the Wnt-signalling pathway [54]. miR-107, which is induced by p53, is reported to inhibit HIF-1 and thereby tumour angiogenesis [68]. Along with miR-103, miR-107 can promote CRC metastasis by targeting the metastatic suppressors DAPK and KLF4 [69].
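The enrichment filter and the grouping of miRNAs into common and subtype-exclusive sets described in this section can be sketched as below. This is a simplified illustration, not the authors' procedure: the function names are hypothetical, a 2-fold change is expressed as a log2 ratio of at least 1 relative to the cell library, and the miRNA names and values in the usage note are invented placeholders rather than data from Table 2.

```python
from typing import Dict, Set, Tuple


def enriched_sets(log2_fc: Dict[str, Dict[str, float]],
                  threshold: float = 1.0) -> Dict[str, Set[str]]:
    """For each EV subtype, keep miRNAs whose log2 fold change versus the
    cell library is at least `threshold` (1.0 corresponds to 2-fold)."""
    return {
        subtype: {mirna for mirna, fc in values.items() if fc >= threshold}
        for subtype, values in log2_fc.items()
    }


def exclusive_and_common(
        sets: Dict[str, Set[str]]) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Split enriched miRNAs into those exclusive to one subtype and
    those common to every subtype."""
    common = set.intersection(*sets.values())
    exclusive = {}
    for name, members in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        exclusive[name] = members - others
    return exclusive, common
```

With placeholder input such as `{"A33-Exos": {"miR-a": 2.3, "miR-b": 1.2, "miR-c": 1.5}, "EpCAM-Exos": {"miR-a": 1.8, "miR-b": 1.1}, "sMVs": {"miR-a": 1.0, "miR-d": 1.4}}`, miR-a is common to all three subtypes, miR-c is exclusive to A33-Exos and miR-d to sMVs, while miR-b is shared by the two exosome populations only, which is the kind of miRNA that would separate exosomes from sMVs.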

qRT-PCR validation for EV-enriched miRNAs
To validate the miRNA expression changes identified using the Illumina HiSeq 2000 platform, we performed qRT-PCR using TaqMan array cards A+B (set v3.0) representing 754 assays specific to human miRNAs (miRBase v14). As shown in Figure 3, 42 of the 63 (66.7%) highly enriched miRNAs seen in our study were identified; the discrepancy is due to the higher coverage of miRNAs in miRBase v20, which was used in our Illumina HiSeq studies. Overall, 33/42 (78.6%) miRNAs detected by qRT-PCR were consistent in expression with the deep sequencing results. For the individual EV datasets, 40 miRNAs (95.2%) in A33-Exos, 35 (83.3%) in EpCAM-Exos and 33 (78.6%) in sMVs showed the same expression trends by qRT-PCR as by deep sequencing.

Gene ontology (GO) and KEGG pathway enrichment analysis
To generate further insights into potential signalling pathway perturbation following EV uptake by recipient cells, we performed GO and KEGG pathway analysis. GO analysis of our enriched 63 miRNA dataset reflects a strong correlation between the target genes of these miRNAs predicted by TargetScan (21 GO annotated terms representing 10% (1265) of target genes, Figure S2) and proteins associated with extracellular matrix, membranes and cancer progression. KEGG pathway analysis (Table S5) of the 63 miRNA dataset suggests that these EV miRNAs may modulate several genes associated with signalling pathways in recipient cells; these include important pathways implicated in cancer such as the Wnt, Ras, TGF-β, and p53 signalling pathways [73,74,75].

Discussion
EVs are nanometer-sized membranous particles (30 nm to 2,000 nm in diameter) released from most cell types under both normal and pathological conditions [1,11]. Through their diverse cargo (proteins, lipids, RNA, DNA) they play a pivotal role in intercellular communication [13], including disease pathogenesis [12] such as driving the formation of a pre-metastatic tumour niche [14,15,16,17]. Two challenges of EV research are the recognition that these extracellular vesicles are heterogeneous, comprising three main subtypes, namely exosomes (50-150 nm), shed microvesicles (sMVs, 400-1,500 nm) and apoptotic bodies (500-2,000 nm) [2], and the technical difficulties involved in purifying them to homogeneity [3]. Recently, we examined different methods for purifying exosomes [3] and sMVs [4] from CRC cell lines. These studies led to the finding, based upon proteome profiling, that two distinct populations of exosomes (A33-Exos and EpCAM-Exos) are released from LIM1863 colon carcinoma cell-derived organoids into cell culture medium, along with sMVs, and that both exosome populations differ significantly at the protein level from sMVs [4]. Notably, the A33-Exos contained proteins consistent with release from the basolateral surface, and the EpCAM-Exos proteins consistent with release from the apical cell surface. It is becoming increasingly clear that a full understanding of EV biology and the physiological role of EVs requires that they be studied using a combination of other omics data, such as lipidomics and RNA biology.
During miRNA biogenesis, the miRNA precursor is generated in the cytoplasm as a double-stranded miRNA duplex [80], and after processing of the precursor RNA duplex there is predominant accumulation of one dominant strand, either the '5p' or '3p' strand, thought to be the mature functional effector miRNA, while the other strand, known as the star strand (miRNA*) or 'passenger' strand, is degraded and typically maintained at lower levels in the cell [81]. While both mature and star strands are generated from a single primary transcript, they have different sequences and therefore target different messenger RNAs [81]. Although it is generally believed that the mature strand is the functional miRNA [82], accumulated data indicate that miRNA* can also exert regulatory effects on gene expression [81,83]. In this study, a total of 58 miRNA* sequences were detected, of which 13 were selectively enriched in EVs (Figure 4A). Interestingly, expression levels of 12 miRNA* sequences (in the 254 miRNA dataset) were greater than those of their corresponding mature miRNAs (annotated by miRBase 20) (Figure 4B). Notably, miR-106b-3p*, miR-126-5p* and miR-355-3p* were detected with higher expression levels than their corresponding mature miRNAs in all four libraries. miR-106b, a member of the miR-106~25 cluster, has been shown to down-regulate the expression levels of TGFBR2, SMAD2 and BMP family genes in CRC [84], miR-126-3p to suppress breast cancer metastasis [85], and miR-126-5p to inhibit the migration and invasiveness of prostate cancer cells [86]; the function of their corresponding miRNA* sequences observed in this study awaits further experimentation. In a number of cases we found miRNA* sequences to be the dominant strand in A33-Exos (miR-3613-3p*, -362-3p*, -625-3p*, -6842-3p*), the converse of that which was dominant in the parent LIM1863 cells (Figure 4B). This surprising finding suggests that miRNA biogenesis may be interlinked with endosomal/exosomal processing and that exosomal miRNA* sequences might affect gene expression in recipient cells in different contexts to mature miRNAs. Alternatively, exosomes may act as a vehicle for removal of miRNA star sequences from the cell in a manner akin to that reported for 'protein waste management' [87].
Figure 5. Candidate circulating miRNA biomarkers associated with extracellular vesicles and colorectal cancer. (A) Three-way Venn diagram depicting miRNAs identified in LIM1863-derived EVs that are associated with published miRNA data for human CRC tissue/blood and faeces [65,91,92,94,97,98]. Six miRNAs are common to all three EVs, while 32, 2, and 4 are selectively represented in A33-Exos, EpCAM-Exos, and sMVs, respectively. miRNAs indicated in black bold are associated with CRC; red represents miRNAs not identified in previous CRC reviews/studies. * denotes miRNA star (miRNA* sequence). (B) Three-way Venn diagram of miRNAs identified in EVs that are associated with published reports of miRNAs from human CRC plasma/serum/faecal samples [65,91,92,94,97,98]. Notably, miR-21, found in our study, has been reported in human CRC plasma, serum and faecal samples. 47, 5, and 16 miRNAs found in our study have been previously reported in CRC plasma, serum, and faecal samples, respectively. Red represents miRNAs in our study that are highly enriched in EVs (63 miRNA dataset); green represents other miRNAs identified in the current study (from our 254 miRNA dataset). doi:10.1371/journal.pone.0110314.g005
In conclusion, we showed significant differences between the miRNA expression profiles of three EV subtypes (two exosome populations and sMVs) secreted from LIM1863 CRC cells. Our findings provide the basis for an in-depth study, using a variety of CRC cell lines that discern the familial archetypes of CRC and accurately predict tumour microsatellite subtype, of the role of certain miRNAs as prospective diagnostic and prognostic clinical markers of this disease, and offer the potential of new pharmaceutical reagents.
Figure S1 Hierarchical clustering of miRNA expression profiles of the highly expressed 254 miRNAs in cells, sMVs, and exosomes (A33-Exos and EpCAM-Exos) reveals a similarity between exosome subpopulations, and extracellular vesicles (Supplemental Table S1). CLUSTER and TREEVIEW programs were employed for hierarchical clustering and visualization of the miRNA expression profiles. Hierarchical clustering was performed with average linkage. (PDF)
Figure S2 Gene Ontology annotation for the target genes predicted by TargetScan of the 63 miRNAs selectively enriched in LIM1863-derived EVs. (PDF)