Circulating microRNA/isomiRs as novel biomarkers of esophageal squamous cell carcinoma

Background MicroRNAs (miRs) are promising diagnostic biomarkers of cancer. Recent studies using next-generation sequencing (NGS) have found that isoforms of microRNA (isomiRs) circulate in the bloodstream similarly to mature microRNAs (miRs). We hypothesized that a combination of circulating miRs and isomiRs detected by NGS could be a powerful cancer biomarker. The present study aimed to investigate their application in esophageal cancer. Methods Serum samples from patients with esophageal squamous cell carcinoma (ESCC) and age- and sex-matched healthy control (HC) individuals were investigated for the expression of miR/isomiRs using NGS. Candidate miR/isomiRs which met the criteria in the 1st group (ESCC = 18 and HC = 12) were validated in the 2nd group (ESCC = 30 and HC = 30). A diagnostic panel was generated using miR/isomiRs that were consistently confirmed in the 1st and 2nd groups. The accuracy of the panel was then tested in the 3rd group (ESCC = 18 and HC = 18). Their use was also investigated in 22 paired samples obtained pre- and post-treatment, and in patients with esophageal adenocarcinoma (EAD) and high-grade dysplasia (HGD). Results Twenty-four miR/isomiRs met the criteria for diagnostic biomarkers in the 1st and 2nd groups. A multiple regression model selected one mature miR (miR-30a-5p) and two isomiRs (isoforms of miR-574-3p and miR-205-5p). The index calculated from the diagnostic panel was significantly higher in ESCC patients than in the HCs (13.3±8.9 vs. 3.1±1.3, p<0.001). The area under the receiver operating characteristic (ROC) curve of the panel index was 0.95. Sensitivity and specificity were 93.8% and 81% in the 1st and 2nd groups, and 88.9% and 72.3% in the 3rd group, respectively. The panel index was significantly lower in patients with EAD (6.2±4.5) and HGD (4.2±1.7) than in those with ESCC and was significantly decreased at post-treatment compared with pre-treatment (6.2±5.6 vs 11.6±11.5, p = 0.03).
Conclusion Our diagnostic panel had high accuracy in the diagnosis of ESCC. MiR/isomiRs detected by NGS could serve as novel biomarkers of ESCC.

Introduction Esophageal cancer is one of the most common cancers worldwide and has high mortality [1,2]. The prognosis of patients with esophageal cancer remains poor despite recent improvements in therapy and perioperative management, and the 5-year survival rate remains about 20%, even in developed countries [3]. One reason for this poor prognosis is that most patients with esophageal cancer are diagnosed at an advanced stage [4]. In contrast, early-stage esophageal cancer, in particular mucosal cancer, can be expected to be cured by endoscopic resection [5,6]. This substantial discrepancy suggests that a specific diagnostic biomarker that could be used for early detection would improve the prognosis of patients with esophageal cancer. While several biochemical markers have been investigated, including squamous cell carcinoma antigen [7], carcinoembryonic antigen [8], and CYFRA 21-1 [9], their sensitivity has not proved consistently satisfactory across the various stages of esophageal cancer.
MicroRNAs (miRs) are small noncoding RNAs (19-25 nucleotides) which regulate the expression of multiple messenger RNAs [10][11][12]. Cancer cells possess miRs which have particular functions in promoting cancer development or weakening cancer suppression. miRs also exist in the bloodstream as inclusions in exosomes. These circulating miRs play a role in intercellular communication in the cancer environment and bring about favorable conditions for cancer invasion and metastasis. Because their expression profiles vary between cancer patients and healthy individuals, circulating miRs can act as powerful biomarkers in the diagnosis of cancer. Indeed, many researchers have reported their usefulness as novel biomarkers for several malignant tumors, including esophageal cancer [13][14][15][16][17].
Recent research using deep sequencing, represented by the next-generation sequencer (NGS), has revealed that miRs are heterogeneous. Isoforms of miR differ slightly from the mature miR in base length and sequence and are referred to as isomiRs. Although the function of isomiRs is not completely understood, they are known to play an important role in cancer development [18,19]. IsomiRs also exist in the blood with high stability, similarly to mature miRs. We hypothesized that a combination of circulating miRs and isomiRs detected by NGS might act as a novel biomarker for malignant tumors. To date, however, few studies have examined the usefulness of miR/isomiRs from blood samples as cancer biomarkers. Here, we aimed to investigate their application in esophageal squamous cell carcinoma (ESCC) using NGS.

Samples
We prospectively collected serum samples from patients treated for esophageal cancer at Hiroshima University Hospital from January 2010 to July 2018. Before April 2016, samples were collected only from patients undergoing surgery, at the time of surgery. Thereafter, samples were collected before treatment from all patients with esophageal cancer, including those treated with endoscopic resection, chemoradiotherapy, neoadjuvant therapy followed by surgery, and palliative chemotherapy. We used 18 consecutive samples from January 2010 to December 2012, 30 from January 2013 to February 2017, and 18 from March 2017 to July 2018 as the first (1st), second (2nd), and third (3rd) groups, respectively. Healthy control (HC) samples were collected over the same period by our laboratory from individuals who were confirmed not to have a medical history of cancer. Among them, 12, 30, and 18 samples were enrolled in the 1st, 2nd, and 3rd groups, respectively, with consideration given to matching sex and age with the ESCC patients. Table 1 summarizes the characteristics of patients and HCs. All patients were histologically diagnosed with squamous cell carcinoma and staged according to the 8th Edition of the TNM Classification of Malignant Tumors [20]. Treatment strategy was determined at our institution according to clinical stage and patient condition as described previously [21,22]. Briefly, mucosal cancer was treated with endoscopic resection; submucosal cancer without lymph node metastasis with initial surgery; and resectable advanced cancer with neoadjuvant therapy followed by surgery if the overall patient condition was good. Patients who did not wish to undergo surgery or were judged unsuitable for resection were treated with definitive chemoradiotherapy, while those with distant metastasis were given palliative chemotherapy.
Among the 66 patients with ESCC sampled before treatment, 22 also provided samples at 1 month after treatment. Serum samples were also collected at the time of recurrence from 4 patients who experienced postoperative recurrence. Furthermore, samples were collected from patients with esophageal adenocarcinoma (EAD; n = 4) and high-grade dysplasia (HGD; n = 4), who were enrolled to assess specificity for ESCC. Fig 1 shows an overview of this study. The study was approved by the Institutional Review Board of Hiroshima University.

RNA extraction from serum samples
After obtaining informed consent, 2 ml of peripheral blood was obtained from each patient before any treatment procedure. Serum was separated by centrifugation at 3000 rpm for 10 min at 4˚C. The supernatant was collected into a new tube and the serum sample was stored at -80˚C. Total RNA was isolated from 200 μl of serum using a miRNeasy mini kit (Qiagen) according to the manufacturer's protocol.

cDNA library for micro RNA sequencing
An Ion Total RNA-Seq Kit v2 was used to construct a cDNA library for small RNA sequencing. The fragment size and concentration of the cDNA library were measured with an Agilent 2100 Bioanalyzer (Agilent Technologies). Preparation steps for deep sequencing, such as emulsion PCR, bead enrichment, and chip loading, were performed automatically on an Ion Chef instrument (Thermo Fisher Scientific). In the final step of sample preparation for sequencing, the chip was loaded with the Ion Sphere Particle (ISP) sequencing reaction mixture. Synthesized templates were sequenced on an Ion S5 XL sequencer (Thermo Fisher Scientific) using an Ion 540 chip.

Data analysis
After the sequencing reaction, the data were checked for quality. We defined acceptable data as an ISP loading density of 70% or more and 60 or more templates per ISP; 30% or more usable reads and 5% or less test fragments per total reads; and 100,000 or more usable reads per sample. Acceptable data were analyzed using CLC Genomics Workbench 7 (CLC bio). Small RNAs were merged by read count and annotated based on miRBase version 21 (http://www.mirbase.org/). IsomiRs were identified by differences such as additions or deletions compared with mature miRs. To compare the read numbers of small RNAs between samples, the total read number of each sample was normalized to 1,000,000 reads; in other words, each small RNA read number was calculated per 1,000,000 reads. Diagnostic biomarkers were identified by comparing normalized read numbers of miR/isomiRs between ESCC and HC using the Student t-test. Diagnostic biomarkers were defined as miR/isomiRs that were detected in over 90% of samples in both the ESCC and HC groups and whose mean read numbers differed significantly by more than 2-fold (p<0.05). Candidate miR/isomiRs which met our criteria for diagnostic biomarkers were entered stepwise into a multiple linear regression model to generate a diagnostic panel for ESCC. The minimum Bayesian Information Criterion (BIC) method was applied to select the best model. A panel index was calculated by substituting the read numbers of the miR/isomiRs selected for the diagnostic panel into the model. Receiver operating characteristic (ROC) curves of the candidate miR/isomiRs and the panel index were generated to predict ESCC patients. The panel index was compared between patients with HGD, EAD, and ESCC using the Student t-test, and between pre- and post-treatment using the paired t-test. Data are presented as numbers (%) or as mean ± standard deviation in normally distributed
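The normalization and the first two candidate criteria described above can be illustrated with a minimal sketch. This is not the authors' pipeline (which used CLC Genomics Workbench); the function names, the small RNA names `miR-A`, `miR-B`, and `other`, and all read counts are hypothetical. The Student t-test step (p<0.05) is deliberately left out, because a p-value requires a t-distribution routine beyond this stdlib-only example.

```python
from statistics import mean


def normalize_to_rpm(counts):
    """Scale raw read counts so that the sample total equals 1,000,000."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {name: n * 1_000_000 / total for name, n in counts.items()}


def detection_rate(samples, name):
    """Fraction of samples in which the small RNA has at least one read."""
    return sum(1 for s in samples if s.get(name, 0) > 0) / len(samples)


def passes_prefilter(escc, hc, name, min_rate=0.9, min_fold=2.0):
    """Check detection in over 90% of both groups and a >2-fold mean difference.

    The significance test from the text is NOT applied here.
    """
    if detection_rate(escc, name) <= min_rate:
        return False
    if detection_rate(hc, name) <= min_rate:
        return False
    mean_escc = mean(s.get(name, 0.0) for s in escc)
    mean_hc = mean(s.get(name, 0.0) for s in hc)
    low, high = sorted((mean_escc, mean_hc))
    if low == 0:
        return False
    return high / low > min_fold


# Hypothetical raw read counts, two samples per group.
raw_escc = [
    {"miR-A": 300, "miR-B": 100, "other": 600},
    {"miR-A": 500, "miR-B": 100, "other": 400},
]
raw_hc = [
    {"miR-A": 100, "miR-B": 100, "other": 800},
    {"miR-A": 100, "miR-B": 120, "other": 780},
]

escc = [normalize_to_rpm(s) for s in raw_escc]
hc = [normalize_to_rpm(s) for s in raw_hc]

for name in ("miR-A", "miR-B"):
    print(name, passes_prefilter(escc, hc, name))
```

In this toy data, `miR-A` differs 4-fold between the group means and passes the pre-filter, whereas `miR-B` differs only 1.1-fold and does not.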

Identification of diagnostic biomarkers for ESCC
In the 1st group, 5451 miR/isomiRs were detected in at least one sample (S1 File). Among these, 303 miR/isomiRs were detected in over 90% of each group. Twenty-eight mature miRs and 60 isomiRs met the criteria for diagnostic biomarkers. These 88 candidates were validated in the 2nd group. The results of sequencing in the 2nd group are shown in S2 File. As a result, 9 mature miRs and 15 isomiRs also met the criteria in the 2nd group. Table 2 shows the profile of these candidate miR/isomiRs, with read numbers for ESCC and HC, fold change, and p-value in the 1st and 2nd groups.

Creation of the diagnostic panel
The 24 candidates which met the criteria for diagnostic biomarkers were entered into a multiple regression model with stepwise selection to generate a diagnostic panel for ESCC. Variables were entered by forward selection, and the model judged a combination of three variables as optimal: one mature miRNA (miR-30a-5p) and two isomiRs [miR-574-3p (3' deletion A) and miR-205-5p (3' deletion G)] (S1 Fig and S3 File). Individual read numbers of miR/isomiRs used in the diagnostic panel are shown in Fig 2, Fig 3). Using the optimal cut-off value of 4.0, sensitivity and specificity were 93.8% and 81%, respectively (Fig 4A).
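How sensitivity and specificity follow from a fixed cut-off can be shown with a short sketch. The index values below are invented for illustration and are not study data; the formula for the panel index itself is not reproduced here. The sketch also assumes that an index at or above the cut-off is called ESCC, a convention the text does not state.

```python
def sensitivity_specificity(case_values, control_values, cutoff):
    """Return (sensitivity, specificity) when values >= cutoff are called positive."""
    true_pos = sum(1 for v in case_values if v >= cutoff)
    true_neg = sum(1 for v in control_values if v < cutoff)
    return true_pos / len(case_values), true_neg / len(control_values)


# Hypothetical panel index values (not taken from the study).
escc_index = [12.5, 8.1, 3.2, 20.4, 5.6]
hc_index = [2.1, 3.3, 4.4, 2.9, 3.0]

sens, spec = sensitivity_specificity(escc_index, hc_index, cutoff=4.0)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these invented values, four of five cases lie at or above 4.0 and four of five controls lie below it, so both measures come out at 0.80.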

Validating the diagnostic panel
To confirm the diagnostic value of our panel for ESCC, we tested it in another independent group (3rd group; S4 File). The mean value of the panel index was 16.8±20.8 and 3.6±1.3 in patients with ESCC and HCs, respectively (p<0.001). Diagnostic sensitivity and specificity using the same cut-off value were 88.9% and 72.3%, respectively (Fig 4B). The AUC of the ROC curve was 0.89 (95%CI, 0.78-1.0, p<0.001; S3 Fig).
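The AUC reported above has a simple interpretation: it is the probability that a randomly chosen patient has a higher index than a randomly chosen control. A minimal sketch of that pairwise (Mann-Whitney) calculation follows; the index values are hypothetical, and the confidence interval reported in the text is not computed here.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs in which the case scores higher.

    Tied pairs count as half a win.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical panel index values (not taken from the study).
escc_index = [12.5, 8.1, 3.2, 20.4, 5.6]
hc_index = [2.1, 3.3, 4.4, 2.9, 3.0]

auc = roc_auc(escc_index, hc_index)
print(f"AUC = {auc:.2f}")
```

Here 23 of the 25 case/control pairs are ordered correctly, giving an AUC of 0.92; this equals the area under the empirical ROC curve obtained by sweeping the cut-off over all observed values.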

Comparison of panel index between patients with ESCC, EAD and HGD
The profiles of miR/isomiRs were also investigated in patients with EAD and HGD (S5 File). The mean panel index of patients with EAD and HGD was 4.2±1.7 and 6.2±4.5, respectively. These values were significantly lower than that of patients with ESCC. In contrast, while they were also higher than in HCs, the difference was not statistically significant (Fig 5). ±28.9, respectively. Patients with stage IV disease tended to have a higher index than those with stage I-III disease, but the difference was not significant. A similar trend was seen by pathological stage (Fig 6B). While patients with clinical stage I disease tended to have a lower index than those with advanced-stage disease, the index was still significantly higher than that in HCs. Diagnostic sensitivity and specificity using the cut-off value of 4.0 were 91.0% and 77.4%, respectively. The AUC of the ROC curve was 0.93 (95%CI, 0.85-1.0, p<0.001; S4 Fig).

Time course of change in panel index of patients with ESCC during treatment and at recurrence
The 22 paired samples at pre- and post-treatment were investigated for the expression of miR/isomiRs, and a panel index was calculated. The mean panel index after treatment was significantly decreased compared with that before treatment (6.2±5.6 vs 11.6±11.5, p = 0.03; Fig 7). Eighteen cases (81.8%) showed a decrease in panel index after treatment compared with before. Mean

Discussion
We aimed to identify the clinical significance of circulating miR/isomiRs detected by NGS in patients with ESCC. We identified 24 miR/isomiRs as diagnostic biomarkers by comparison between ESCC patients and HCs in two different cohorts. The diagnostic panel generated from these candidates had high accuracy in the diagnosis of ESCC. Early detection is important in improving outcomes in patients with ESCC. Endoscopic screening is the standard for detecting superficial ESCC [23]. Although recent advances in diagnostic technology for cancer such as narrow band imaging provide high accuracy, the relatively low incidence of ESCC renders population-based screening ineffective. Endoscopy also causes chest discomfort in all subjects and sometimes has unpleasant adverse effects, such as aspiration pneumonia. Accordingly, screening for ESCC should be limited to individuals at high risk. In fact, screening endoscopy has been proven effective in detecting early-stage ESCC and precancerous lesions in a high-risk region in China [24]. However, regional differences in the occurrence of esophageal cancer are not seen in Japan or Western countries, indicating the need for biomarkers that can detect patients with ESCC. Given the low invasiveness of blood sampling, circulating small RNA might be an ideal biomarker candidate. Indeed, many studies have confirmed the usefulness of circulating miRs in detecting cancer. Theoretically, isomiRs might also be powerful biomarkers, like mature miRs. However, few studies have examined this possibility, primarily because the similarity in the sequences of isomiRs and mature miRs makes it technically difficult to distinguish them by the usual quantitative polymerase chain reaction (qPCR). Recent developments in deep sequencing systems, represented by NGS, allow the detection of even slight differences in small RNAs and the identification of isomiRs. Several researchers have described studies focused on isomiRs from tumors.
Wu et al reported that expression of isomiRs in colorectal tissue differed between normal mucosa, adenoma, and adenocarcinoma [25]. Roberts et al reported that circulating small RNAs, including isomiRs, were associated with colorectal adenoma [26]; and Mjelle et al identified circulating miR/isomiRs associated with metastasis of rectal cancer [27]. However, few studies have examined differences in miR/isomiRs between cancer patients and healthy individuals. To our knowledge, our present study is the first to show the usefulness of a combination of circulating miR/isomiRs detected by NGS in the diagnosis of esophageal cancer.
Our diagnostic panel was generated by comparing patients with ESCC at all stages and HCs. The panel was useful in detecting patients even at stage I, and in distinguishing patients with ESCC from those with HGD and EAD. These findings would also be useful in distinguishing individuals at high risk of ESCC but without significant symptoms, and in population-level endoscopy screening. This panel includes one mature miR and two isomiRs. According to previous reports, miR-30a-5p plays a dual role in different types of cancer as either an oncogene or onco-suppressor [28]. A function of miR-30a-5p as a cancer activator has been reported in pharyngeal cancer [29], ovarian cancer [30] and glioma [31]. Its expression profile also differs between cancer and normal tissue. Kimura et al reported that miR-30a-5p is up-regulated in ESCC, as well as in a head and neck squamous cell carcinoma cell line, compared with normal squamous epithelial cell lines [32]. In contrast, circulating miR-30a-5p is down-regulated in patients with EAD compared with healthy controls [33]. MiR-205-5p also has several functions which appear to depend on cellular context and tumor subtype. It is also reported to have specific features in squamous cell carcinoma, and is a reliable biomarker to distinguish squamous cell carcinoma from other subtypes in non-small cell lung cancer tissue [34][35][36]. Circulating miR-205-5p is up-regulated in patients with lung squamous cell carcinoma [36] and cervical cancer [37]. Moreover, a recent study found that miR-205-5p has different functions in squamous cell carcinoma and adenocarcinoma of the esophagus [38]. MiR-574-3p is up-regulated in hepatocellular carcinoma [39] and prostate cancer [40], and is positively associated with the proliferation of osteosarcoma [41]. Moreover, Krishnan et al described the prognostic impact of miR-574-3p detected by NGS from breast cancer tissue [42].
Of note, these previous reports dealt with the mature miR-205-5p and miR-574-3p, whereas our diagnostic panel included their isomiRs. The two types were previously thought to have a similar function because of their similar sequences, but more recent studies have shown that they have different functions [19,43,44]. In fact, the target messenger RNAs of an isomiR partly overlap with and partly differ from those of the mature miR, in accordance with the sequence difference between them [45]. Further study is therefore needed to identify whether these isomiRs have the same function as the mature miRs.
Although our panel is not aimed at detecting postoperative recurrence, the panel index was decreased after treatment compared with that before treatment in almost all cases, and increased again at recurrence in three of four patients. Some miR/isomiRs likely change as a reflection of tumor volume. Supporting this, Komatsu et al reported that levels of circulating miR-25 changed before and after surgery [14]. Follow-up of certain miR/isomiRs in post-treatment surveillance might be worthwhile.
Several limitations of our study warrant mention. Because few studies have dealt with circulating miR/isomiRs detected by NGS, no clear consensus exists for the normalization of miR/isomiRs, nor is there a consistent method for analyzing data. We normalized read numbers to 1,000,000 reads per sample in accordance with a previous report. If normalization and data analysis methods change, different results may be obtained. Our results were also influenced by the number of samples assigned to each group and the method of statistical analysis. Obtaining repeatable results in future studies therefore requires the establishment of a concrete consensus. External validation would be preferable to confirm the accuracy of our results, but it is difficult because there is no public database containing information on circulating isomiRs in ESCC patients. Therefore, we tested the application of our diagnostic panel using another cohort, but this cohort also consisted of retrospective samples from a single institution. A prospective confirmation study is needed before clinical application. We investigated miR/isomiR profiles from serum samples stored for various periods. Although there was no substantial difference between the storage periods of samples from patients and HCs, the possibility that storage period affected the results cannot be excluded. It remains unclear whether these candidate miR/isomiRs for diagnostic biomarkers differ between normal squamous epithelium and squamous cell carcinoma tissue, as does the function of these candidates in vivo, and further studies are needed to clarify these questions. We focused on miR/isomiRs in the present study, but other small RNAs are abundant in tissue and blood and can be detected by NGS. These small RNAs might include other powerful biomarkers of ESCC.

Conclusion
We focused on circulating miR/isomiR detected by NGS as novel biomarkers of ESCC. Our diagnostic panel had high accuracy in diagnosis and high specificity as a biomarker of ESCC. Although a number of problems must be resolved before clinical application, miR/isomiRs detected by NGS could serve as novel biomarkers of ESCC.