The effect of H1N1 vaccination on serum miRNA expression in children: A tale of caution for microRNA microarray studies

Background MicroRNAs (miRNAs) are a class of small regulatory RNAs around 21–25 nucleotides in length which govern many aspects of immunity including the host innate and adaptive responses to infection. RT-qPCR studies of select microRNAs show that vaccination alters the expression of circulating microRNAs but the effect of vaccination on the global microRNA population (i.e. micronome) has never been studied. Aim To describe vaccine associated changes in the expression of microRNAs 21 days after vaccination in children receiving a pandemic influenza (H1N1) vaccination. Method Serum samples were obtained from children aged 6 months to 12 years enrolled in an open label randomised controlled trial of two pandemic influenza (H1N1) vaccines, in which participants received either ASO3B adjuvanted split virion or a whole virion non-adjuvanted vaccine. MicroRNA expression was profiled in a discovery cohort of participants prior to, and 21 days after, vaccination using an Agilent microarray platform. Findings were followed up by RT-qPCR in the original discovery cohort and then in a validation cohort of participants taken from the same study. Results 44 samples from 22 children were assayed in a discovery cohort. The microarray results revealed that 19 microRNAs were differentially expressed after vaccination after adjustment for multiple testing. The microarray detected ubiquitous expression of several microRNAs which could not be validated by RT-qPCR, many of which have little evidence of existence in publicly available RNA sequencing data. Real time PCR (RT-qPCR) confirmed downregulation of miR-142-3p in the discovery cohort. These findings were not replicated in the subsequent validation cohort (n = 22). Conclusion This is the first study to profile microRNA expression after vaccination. An important feature of this study is that many of the differentially expressed microRNAs could not be detected and validated by RT-qPCR.
This study highlights the care that should be taken when interpreting omics biomarker discovery, highlighting the need for supplementary methods to validate microRNA microarray findings, and emphasises the importance of validation cohorts. Data from similar studies which do not meet these requirements should be interpreted with caution.


Introduction
A greater understanding of the immune response can be derived by identifying changes in gene expression after vaccination [1][2][3]. In order to fully understand immune responses, the regulatory mechanisms behind those changes in gene expression must be investigated. These regulatory mechanisms include the action of non-coding RNAs.
MicroRNAs (miRNAs) are a class of small regulatory non-coding RNAs, around 21-25 nucleotides long, which post-transcriptionally regulate the expression of protein coding genes. They do this by binding to complementary sequences on the 3' untranslated region of mRNA molecules, thereby inhibiting mRNA translation [4]. 2675 human miRNAs are listed in the microRNA registry (miRbase), and are estimated to collectively regulate around 60% of protein-coding genes [5,6]. MiRNAs play a key role in many cell processes, including those involved in host response to infection [7,8]. MicroRNAs target proteins involved in leukocyte development and differentiation, innate and adaptive immune pathways (e.g. leukocyte activation, cytokine release, antibody affinity maturation) and they may even target viral genomes and transcripts [9][10][11][12][13][14][15][16][17][18].
Although miRNAs are primarily intracellular molecules, they are detected in most body fluids: these extracellular miRNAs can exist in extracellular vesicles (exosomes, microvesicles and apoptotic bodies) or can be associated with Argonaute protein and high-density lipoprotein, and, unlike mRNA, show remarkable stability in stored clinical samples [19][20][21][22][23][24]. The biological function of extracellular miRNAs is debated, but an increasingly accepted theory is that they are secreted as intercellular communicators of gene regulation [25]. Over 56 studies show an association between infectious disease and circulating miRNA expression, suggesting that induction of an immune response may alter miRNA release. Given that infection influences circulating miRNA expression, so too may vaccination [26]. The stability of extracellular miRNAs and the ease of sample acquisition in clinic make them attractive biomarkers to provide new correlates of protection or vaccine reactogenicity. Interrogating the functional effects of extracellular miRNAs on immune response could provide new insights into vaccine-mediated immunity which can be exploited to improve vaccine design. RT-qPCR studies of candidate miRNAs have shown that vaccination alters serum miRNAs and that this is associated with vaccine response [27][28][29][30].
To our knowledge, however, no published studies have used a whole miRNA profiling platform to investigate the effect of vaccination on global serum/plasma miRNA expression. The aim of this study was to use a whole miRNA profiling technique to investigate the effect of two pandemic influenza (H1N1) vaccinations (ASO3B adjuvanted split virion and a whole virion non-adjuvanted vaccine) on the serum miRNA expression of children 3 weeks after the completion of their vaccination course compared with baseline.

Materials and methods
Participant samples were obtained from an open label, randomised, parallel group, UK multicentre study assessing the safety and immunogenicity of two novel pandemic influenza (H1N1) vaccines: ASO3B adjuvanted split virion (GlaxoSmithKline Vaccines, Rixensart, Belgium) or a whole virion non-adjuvanted vaccine (Baxter Vaccines, Vienna) (clinical trial registration number: NCT00980850 [31]). Between 26 September and 11 December 2009, during the second wave of the influenza A (H1N1) pandemic in the UK, children aged 6 months to 12 years were enrolled in the trial. Children whose parents provided informed consent and were able to comply with study procedures were included. Key exclusion criteria included prior receipt of a H1N1 vaccination, confirmed pandemic H1N1 influenza infection, severe immunocompromise, or receipt of immunosuppressive treatment. Block randomisation was undertaken stratified by age.
Vaccines were administered by intramuscular injection at enrolment and at day 21 (plus or minus 7 days). Serum samples were collected immediately before vaccination and 21 days (minus 7 days to plus 14 days) after second vaccination (Fig 1). Antibody responses were measured by microneutralisation and haemagglutination inhibition using standard methods at the Centre for Infections, Health Protection Agency [31][32][33]. Seroconversion was defined as a fourfold rise to a titre of ≥1:40 from baseline 3 weeks post 2nd vaccine dose. Full details of the trial can be found in Waddington et al [31]. The trial was approved by the Oxfordshire Research Ethics Committee A (No 09/H0604/107), the UK Medicines and Healthcare Products Regulatory Agency (EUDRACT 2009-014719-11), and local NHS organisations by an expedited process.

Sample cohorts
MiRNA profiling was conducted on pre-vaccine and three weeks post 2nd vaccine plasma samples from a random selection of 22 participants. This comprised the discovery cohort. The validation cohort comprised a second random selection (stratified by type of vaccine received) of 22 participants. A power calculation was conducted to inform the sample size of the validation cohort. The function pwr.t.test in the "pwr" R package was used to perform a power calculation for a one sample, one tailed t-test using the mean and standard deviation of the log2 fold-change derived from the RT-qPCR results in the discovery cohort [1]. A one tailed t-test was used based on the direction of the statistically significant log2 fold-change seen in the discovery cohort. We chose a sample size that would be over 90% powered to identify differential expression of the miRNA of interest at a significance level of 0.05. For full details of the power analysis and the parameters used see S1 Results.
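As an illustration of the reasoning behind such a power calculation, the sketch below uses a normal approximation to a one sample, one tailed test. It is not the exact noncentral t calculation that pwr.t.test performs, so it will slightly overestimate power at small sample sizes. The function names and the effect size of 0.7 are assumptions chosen for illustration; the parameters actually used are those reported in S1 Results.

```python
from math import ceil, sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def approx_power_one_sample(effect_size, n, alpha=0.05):
    """Approximate power of a one sample, one tailed test.

    effect_size is mean / standard deviation of the log2 fold-change.
    Normal approximation only (hypothetical helper, not pwr.t.test).
    """
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha)
    return STANDARD_NORMAL.cdf(abs(effect_size) * sqrt(n) - z_crit)


def approx_sample_size(effect_size, power=0.90, alpha=0.05):
    """Smallest n reaching the requested power under the same approximation."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STANDARD_NORMAL.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / abs(effect_size)) ** 2)


# Illustrative effect size only; not taken from S1 Results.
ILLUSTRATIVE_EFFECT_SIZE = 0.7
power_at_22 = approx_power_one_sample(ILLUSTRATIVE_EFFECT_SIZE, 22)
n_for_90_percent = approx_sample_size(ILLUSTRATIVE_EFFECT_SIZE)
```

Under this approximation the required sample size grows with the inverse square of the effect size, which is why an overestimated discovery effect can leave a validation cohort underpowered.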

Serum miRNA extraction
Total RNA was extracted from 200 microlitres of previously frozen sera using the Total RNA Extraction Kit (Norgen cat# 17200) as specified by the protocol. As expected for RNA extracted from biofluids, RNA concentration was below the detectable limit when measured by spectrophotometer (NanoDrop, Thermoscientific) and fluorometer (Qubit, Invitrogen).

MiRNA expression profiling
MiRNA expression profiling was undertaken in a cohort of 22 randomly selected children using the Agilent miRNA microarray 60K (design 031181, based on miRbase version 16.0) with a one colour experimental design. The Agilent miRNA protocol (Version 2.4 September 2011) was used with the exception that, after labelling with the Cy3-pCp molecule, samples were dried and 'Step 2' was omitted. Samples were hybridised for 40 hours to increase specific signal against background [34].
Data analyses were conducted in the statistical language R using Bioconductor packages [35]. MiRNAs detected in less than 50% of samples were filtered out. Where a miRNA was not detected in a sample, its fluorescence value was set to half the value of the minimum detected fluorescence value across all samples for that miRNA. Data were then normalised using variance stabilised normalisation (package vsn version 3.40) [36]. Negative and positive control miRNAs were removed.
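The filtering and imputation rules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name, the dictionary layout and the use of None for "not detected" are all hypothetical.

```python
def filter_and_impute(expression, min_detect_fraction=0.5):
    """Drop rarely detected miRNAs and fill in undetected values.

    expression maps a miRNA name to a list of fluorescence values, one per
    sample, with None where the miRNA was not detected (assumed layout).
    """
    cleaned = {}
    for mirna, values in expression.items():
        detected = [v for v in values if v is not None]
        # Filter out miRNAs detected in less than 50% of samples.
        if not detected or len(detected) < min_detect_fraction * len(values):
            continue
        # Undetected values become half the minimum detected value
        # for that miRNA across all samples.
        fill_value = min(detected) / 2
        cleaned[mirna] = [fill_value if v is None else v for v in values]
    return cleaned


# Made-up values for two hypothetical miRNAs across four samples.
example = {
    "miR-A": [120.0, None, 80.0, 100.0],  # detected in 3 of 4 samples
    "miR-B": [None, None, None, 55.0],    # detected in 1 of 4 samples
}
result = filter_and_impute(example)
```

In this made-up example miR-B is removed and the single gap in miR-A is filled with 40.0, half of its minimum detected value.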
Samples were clustered using Ward's linkage method which is appropriate for expression profiling data [37]. People often have an innate expression profile which is unique to them; because of this, variation between people can be greater than vaccine induced variation within a person. As a result, paired expression profiles tend to cluster together. The effect of pairing on cluster analyses was removed using a batch correction function in R called removeBatchEffect from the package limma, with participant identification number as a batch effect [37]. This function fits a linear model to the data, then removes the component due to the batch effects (in this case participant number).
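To make the idea of removing a participant-level batch effect concrete, the sketch below centres each participant's values for a single miRNA. It assumes a model with an intercept and participant as the only factor, in which case subtracting each participant's deviation from the average participant mean gives the same result as fitting that linear model and removing the participant component. It is an illustration of the principle, not limma's implementation, and the function name is hypothetical.

```python
from statistics import mean


def remove_participant_effect(values, participants):
    """Remove between-participant differences for one miRNA.

    values: expression of one miRNA, one entry per sample.
    participants: participant ID for each sample, in the same order.
    """
    by_participant = {}
    for value, pid in zip(values, participants):
        by_participant.setdefault(pid, []).append(value)
    participant_means = {pid: mean(v) for pid, v in by_participant.items()}
    # Keep the overall level by re-centring on the mean of participant means.
    centre = mean(list(participant_means.values()))
    return [
        value - (participant_means[pid] - centre)
        for value, pid in zip(values, participants)
    ]


# Two hypothetical participants, each with a pre and a post sample.
values = [10.0, 12.0, 20.0, 24.0]
participants = ["p1", "p1", "p2", "p2"]
corrected = remove_participant_effect(values, participants)
```

After correction both participants share the same mean, so only the within-participant (pre versus post) differences remain to drive the clustering.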
Differential miRNA expression was tested using multivariable linear regression using participant ID number (to account for pairing in the data) and vaccine status ("pre-vaccine", "post-vaccine") as factors in the model. Sex, age and vaccine type were not significantly associated with miRNA expression so were not included as variables in the final model. The resulting p-values for each miRNA were adjusted for multiple testing using the Benjamini-Hochberg false discovery rate (FDR).
[...] affected by sampling factors like haemolysis [38,39]. Spike-in controls (synthetic/non-human small RNAs added to samples prior to miRNA extraction) do not allow for inter-individual differences in global miRNA expression [40]. Candidate endogenous controls were therefore selected from the microarray data. Candidate endogenous control miRNAs had to be present in 100% of samples, have a minimal average fold-change in expression pre and post vaccine and have minimal variance in expression between paired pre and post-vaccination samples. To facilitate identification of potential reference miRNAs, miRNAs present in 100% of samples were ranked in terms of their average log fold-change, and in terms of the standard deviation of the fold-changes. These ranks were then multiplied together to give a "composite" rank. MiRNAs were then ranked by their composite rank and miRNAs with the lowest composite rank were selected as potential controls.
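The two numerical steps in this part of the methods, the Benjamini-Hochberg adjustment and the composite ranking of candidate reference miRNAs, can be sketched as below. This is a hedged illustration only: the function names and example values are hypothetical, ranking by the absolute mean log2 fold-change is an assumed reading of "minimal average fold-change", and ties are broken arbitrarily by input order.

```python
from statistics import mean, stdev


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def composite_rank(fold_changes):
    """Order miRNAs from most to least stable.

    fold_changes maps a miRNA name to its paired log2 fold-changes
    (post minus pre), one value per participant.
    """
    names = list(fold_changes)
    abs_mean = {n: abs(mean(fold_changes[n])) for n in names}
    spread = {n: stdev(fold_changes[n]) for n in names}

    def ranks(scores):
        ordered = sorted(names, key=lambda n: scores[n])
        return {n: position for position, n in enumerate(ordered, start=1)}

    mean_rank, sd_rank = ranks(abs_mean), ranks(spread)
    # Composite rank is the product of the two ranks; lowest is most stable.
    composite = {n: mean_rank[n] * sd_rank[n] for n in names}
    return sorted(names, key=lambda n: composite[n])


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
stability_order = composite_rank({
    "miR-X": [0.02, -0.02, 0.01, -0.01],
    "miR-Y": [1.0, 2.0, 0.2, 1.5],
    "miR-Z": [0.3, -0.2, 0.25, -0.15],
})
```

Multiplying ranks rewards miRNAs that score well on both criteria, but with few candidates the products tie easily, so a real analysis would need an explicit tie rule.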
Reanalysis of log2 transformed, unnormalised microarray miRNA expression levels, normalised using the expression level of each candidate reference miRNA, reproduced significant differential expression of the miRNAs of interest, confirming their suitability as candidate endogenous controls. Ideally three endogenous reference genes should be used which have been shown to be stably expressed across conditions [41]. This was our intention but only one out of six assays for our candidate reference miRNAs could be optimised. This optimised assay was for miR-29c.

Quantitative reverse transcription PCR (RT-qPCR)
RT-qPCR was used to confirm microarray results. In the first instance, miRNA specific assays using miRNA Locked Nucleic Acid (LNA) primer sets (Exiqon) were used to qualitatively measure miRNA expression (cat# 339306, see S1 Table for assay IDs). LNA primers were chosen because they have a higher maximum annealing temperature leading to higher specificity (they can discriminate highly homologous miRNAs from the same family) compared with standard primers, and they have been shown to be highly sensitive (very desirable for RNA poor biofluids) [42][43][44][45]. Lyophilised primers were resuspended in 220 μl of nuclease free water. Two microlitres of total RNA was universally reverse transcribed using the Universal cDNA Synthesis Kit II (Exiqon cat# 203301) according to the manufacturer's instructions. A synthetic oligonucleotide, UniSp6 (Exiqon cat# 203301), was spiked in to each reverse transcription reaction to identify any samples which failed or were outliers in the reverse transcription step. cDNA was then subject to qPCR. Each PCR well contained 0.1 μl cDNA, 1 μl primer, 5 μl 2X IQ SYBR Green supermix (Bio-Rad, cat# 170-8882), 0.2 μl ROX reference dye (Thermofisher, cat# 12223012) and 3.7 μl of RNase free water to give a final reaction volume of 10 μl, and was amplified using the following conditions: 95˚C for 10 min, followed by 40 amplification cycles at 95˚C for 10 s and 60˚C for 1 min. A post PCR melt curve was performed on each sample for each assay. Only assays with primer efficiencies between 95-105% were taken forward.
TaqMan Advanced miRNA assays (cat# A25576) were used as second line for several miRNAs where LNA primers failed to detect their targets (see S1 Table for assay IDs). Two microlitres of total RNA was universally reverse transcribed and universally pre-amplified using the TaqMan Advanced miRNA cDNA Synthesis Kit (cat# A28007) according to the manufacturer's instructions. cDNA was diluted 1:10 with 0.1 X TE buffer then subject to qPCR. Each PCR well contained 2.5 μl diluted cDNA, 0.5 μl primer, 5 μl 2X TaqMan Advanced Master Mix (cat# 4444556) and 2 μl RNase-free water to give a final reaction volume of 10 μl, and was amplified using the following conditions: 95˚C for 20 seconds, followed by 40 amplification cycles at 95˚C for 1 second and 60˚C for 20 seconds.
In all cases, qPCR was conducted using the StepOnePlus Real-Time PCR System (Thermofisher, cat# 4376600). Each assay was conducted in triplicate and the arithmetic mean cycle threshold (Ct) value was calculated. Negative controls without cDNA were included on each PCR plate. Assays were only taken forward if their products met the following criteria for acceptable amplification: amplification curves that were within the limits of detection (Cts < 35 and 5 Cts lower than background Ct values in non-template control wells), a single Tm peak in melt curve analysis (in the case of LNA primers) and evidence of amplification (amplification curve seen when normalised fluorescence, Rn, is plotted on a linear scale). Positive control RT-qPCR assays for the spiked in UniSp6 oligonucleotide were run for each sample.
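The numerical parts of these acceptance criteria can be expressed as a simple check, sketched below. The function name, argument names and the convention of passing an empty list when non-template control wells show no amplification are assumptions made for illustration. The third criterion (a visible amplification curve on a linear scale) is a visual judgement and is not encoded here.

```python
from statistics import mean


def assay_is_acceptable(sample_cts, ntc_cts, single_melt_peak=True,
                        max_ct=35.0, min_gap_to_ntc=5.0):
    """Apply the Ct-based acceptance criteria to one assay.

    sample_cts: triplicate Ct values for the sample.
    ntc_cts: Ct values from non-template control wells (empty list if the
             controls did not amplify at all; assumed convention).
    single_melt_peak: result of melt curve inspection (LNA assays only).
    """
    sample_ct = mean(sample_cts)
    if sample_ct >= max_ct:
        return False  # outside the limit of detection
    if ntc_cts and mean(ntc_cts) - sample_ct < min_gap_to_ntc:
        return False  # too close to background amplification
    return single_melt_peak


clear_signal = assay_is_acceptable([28.1, 28.3, 28.2], [38.0, 37.5, 39.0])
near_background = assay_is_acceptable([33.0, 33.2, 33.1], [36.0, 36.5, 37.0])
too_late = assay_is_acceptable([36.0, 36.2, 35.9], [])
```

In these made-up examples only the first assay passes: the second sits within 5 Cts of its non-template controls and the third has a mean Ct above 35.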
The mean Ct for each sample assay was normalised to an endogenous reference miRNA (see below) which was stably expressed between pre- and post-vaccination samples as determined from the microarray data (see Results). Relative expression between paired pre- and post-vaccination samples was calculated using the 2^(-ΔΔCt) method.
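As a worked illustration of the ΔΔCt calculation for one participant, the sketch below normalises the target miRNA to the reference miRNA at each timepoint and then compares post with pre. The function name and the Ct values are made up for the example.

```python
def relative_expression(target_pre, ref_pre, target_post, ref_post):
    """Fold-change post versus pre vaccination by the delta-delta Ct method.

    All arguments are mean Ct values for one participant.
    """
    delta_pre = target_pre - ref_pre      # target normalised to reference, pre
    delta_post = target_post - ref_post   # target normalised to reference, post
    delta_delta = delta_post - delta_pre
    return 2 ** (-delta_delta)


# Hypothetical Ct values: the target needs one more cycle after vaccination
# while the reference is unchanged, so expression has halved.
fold_change = relative_expression(
    target_pre=27.0, ref_pre=25.0, target_post=28.0, ref_post=25.0
)
```

Because one Ct corresponds to a doubling of template, the log2 fold-change is simply -ΔΔCt, which is the quantity tested in the statistical analysis.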
To test for differential expression of a miRNA pre- and post-vaccination using RT-qPCR data, a one sample Student's t-test was used to test the null hypothesis that the log2 fold-change between pre and post vaccine samples was 0 (equivalent to a relative fold-change of 1). In the case of the discovery cohort a two tailed t-test was used; in the case of the validation cohort a one tailed t-test was used, based on the direction of change seen in the discovery cohort. The average fold-change was obtained by raising 2 to the power of the average log2 fold-change.
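The test described above can be sketched with the standard library alone, as below. The Student's t distribution function is obtained here by numerical integration (Simpson's rule), which is an implementation choice made for this illustration; the function names and the example fold-changes are hypothetical and are not study data.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coefficient = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Cumulative distribution by Simpson's rule on [0, |t|] (steps even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def one_sample_t_test(log2_fold_changes, tail="two-sided"):
    """Test H0: mean log2 fold-change is 0. Returns (t, p, mean fold-change)."""
    n = len(log2_fold_changes)
    average = mean(log2_fold_changes)
    t = average / (stdev(log2_fold_changes) / sqrt(n))
    df = n - 1
    if tail == "two-sided":
        p = 2 * (1 - t_cdf(abs(t), df))
    elif tail == "less":  # one tailed test for downregulation
        p = t_cdf(t, df)
    else:
        raise ValueError("tail must be 'two-sided' or 'less'")
    return t, p, 2 ** average


# Made-up paired log2 fold-changes (post minus pre) for six participants.
example = [-0.5, -1.0, -0.2, -0.8, 0.1, -0.6]
t_stat, p_two_sided, mean_fold_change = one_sample_t_test(example)
_, p_one_sided, _ = one_sample_t_test(example, tail="less")
```

When the observed change is in the hypothesised direction the one tailed p-value is half the two tailed value, which is why the direction must be fixed in advance from the discovery cohort.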

Participants
MiRNA profiling was conducted on samples from a random selection of 22 participants. This comprised the discovery cohort. The validation cohort comprised a second random selection of 22 participants who were stratified by type of vaccine received. Demographics are listed in Table 1. A further breakdown of each cohort by gender and vaccine type can be found in supplementary materials (S2-S5 Tables).

Microarray analysis showed that vaccination changed global miRNA expression
In total 189 miRNAs passed QC and were included for analysis. Principal component analysis (PCA) showed evidence of clustering of pre- and post-vaccination samples with respect to principal components 1 and 3, indicating that vaccination may affect miRNA expression profile (Fig 2).

Selection of candidate reference miRNAs
Six miRNAs were selected as candidate reference miRNAs on the basis that they were ubiquitously expressed in all samples and were stably expressed before and after vaccination (Fig 5).

RT-qPCR failed to detect the majority of differentially expressed and candidate endogenous reference miRNAs
Cross platform validation using RT-qPCR was undertaken to confirm the findings of the microarray data. MiRNAs miR-575, miR-4270, miR-483-5p, miR-3679-5p, miR-1207-5p and miR-1202 were selected for initial validation as they had the strongest and most significant fold-changes. None of these miRNAs could be reliably detected by RT-qPCR. The assays for some of these miRNAs had not been wet-lab validated by the manufacturer. Differentially expressed miRNAs which had pre-validated primers were therefore pursued instead: miR-638, hsv1-miR-H17, miR-30b-5p, hcmv-miR-UL70-3p, hsv2-miR-H25, miR-142-3p and miR-671-5p. Of these miRNAs, only miR-30b-5p and miR-142-3p met the criteria for acceptable amplification in serum. In a final attempt to detect the remaining differentially expressed miRNAs, TaqMan Advanced miRNA assays were used to assay for miR-575, miR-483-5p, miR-3679-5p and miR-638. Of these only miR-483-5p was detectable. miR-29c and miR-197-3p were the only candidate reference miRNAs that were detectable using RT-qPCR, but only miR-29c fully met the criteria for an acceptable assay because Ct values for miR-197-3p were within 4 Cts of non-template controls and primer efficiency was <95%. There was no evidence that the type of vaccine received correlated with fold-change in miR-142-3p expression in the discovery cohort. The effect size (mean log2 fold-change divided by the standard deviation of the log2 fold-change) in the split virion group was -0.73, which is very similar to the effect size in the whole virion group (-0.70). Raw expression (i.e. non-normalised expression) of miR-29c was more stable than raw expression of miR-142-3p and miR-30b, supporting the use of miR-29c as an endogenous control for this cohort (see supplementary materials, S4 Table).

Downregulation of miR-142-3p was not replicated in an independent validation cohort
Although miR-30b did not show significant differential expression at a significance value of < 0.05 in the discovery cohort (when measured by RT-qPCR), the result was of borderline significance (p = 0.05) therefore it was also assessed in the independent validation cohort.

Lack of replication of differential expression of miR-142-3p in an independent validation cohort is not due to selection of miR-29c as an endogenous reference miRNA
Replicating significant downregulation of miR-142-3p in the validation cohort is contingent upon the miRNAs being truly differentially expressed and miR-29c being a good endogenous reference in that cohort. A housekeeping miRNA must be consistently non-differentially expressed for it to perform well as a normaliser. If expression of miR-29c had been more variable between timepoints in the validation cohort (compared with the discovery cohort) then it would have introduced extra technical noise, reducing power for detecting differential expression in the validation cohort. If miR-29c had been downregulated after vaccination in the validation cohort then this would have masked downregulation of miR-142-3p and miR-30b. Fig 8 shows that neither of these scenarios was likely: miR-29c expression was not more variable in the validation cohort compared with the discovery cohort, and non-normalised log2 fold-changes do not suggest that miR-29c was downregulated post vaccination. Non-normalised log2 fold-changes in miR-142-3p expression were generally positive. After normalisation with miR-29c, pre versus post vaccine miR-142-3p log2 fold-changes were generally less positive and less variable. This suggests that lack of differential expression was not due to poor performance of miR-29c as a normaliser in the validation cohort.

Discussion
This was the first study to use global profiling techniques to investigate miRNA expression in serum post vaccination. Agilent array analysis identified differential expression of 19 miRNAs after correction for multiple testing. This finding supports work by others showing vaccination alters serum miRNA expression [19,46]. For example, Xiong et al found that increased serum miRNA-155 at 4 to 6 weeks post hepatitis B vaccination was associated with non-response to the vaccine (defined as anti-HBsAg antibody levels below 10 mIU/ml) [46]. De Candia et al found that there were elevated levels of miR-150 in adults and children one month after flu vaccination and that this increase correlated with haemagglutinin antibody titres [19]. In-vitro work supported these findings, with CD4+ and B lymphocytes secreting miR-150 into cell culture medium upon mitogenic activation.
Surprisingly, four viral miRNAs expressed by human cytomegalovirus, herpes simplex 1 virus, herpes simplex 2 virus, and Kaposi's sarcoma virus were significantly upregulated after vaccination. Multiple studies have identified circulating viral miRNAs in asymptomatic patients and found them to be differentially expressed between clinical states [47][48]. It is tempting to speculate that activation of the immune system by pathogens or vaccine antigens could alter the ability of the immune system to control latent infections, leading to viral reactivation.
Fig 8. (A) [...] This plot suggests that miR-29c did not perform better in the pilot cohort compared with the validation cohort as the interquartile range was smaller in the validation cohort. The median expression of miR-29c was positive, which, if anything, will potentially bias hsa-miR-30b and hsa-miR-142-3p towards downregulation, making validation more likely. (B) Mean log2 fold-change pre versus post vaccination in the validation cohort. Box plots delineate range, median and interquartile range. A log2 fold-change of 0 equates to no change pre versus post vaccination (i.e. a fold-change of 1). Had miR-29c been downregulated after vaccination in the validation cohort (and therefore been an unsuitable normaliser) then this would have artificially elevated log2 fold-changes for miR-142-3p, but there is no evidence for this. The graph shows miR-29c was more stably expressed pre versus post vaccination than miR-30b and miR-142-3p, supporting its use as a normaliser. Normalised fold-changes for miR-30b and miR-142-3p were lower and less variable compared with unnormalised fold-changes; this is in keeping with the removal of technical noise through use of miR-29c as normaliser. https://doi.org/10.1371/journal.pone.0221143.g008
Unfortunately, we were unable to optimise the RT-qPCR assays to validate upregulation of these four viral miRNAs, and for reasons further discussed below, these viral miRNAs may have been falsely detected by the microarray. Although it is theoretically possible that contamination is the cause of viral miRNAs being detected on the array, we believe that this is highly unlikely in practice because: 1) miRNAs from herpes simplex I, herpes simplex II and Kaposi's sarcoma virus were detected in 100% of samples, and miRNAs from Epstein-Barr virus and human cytomegalovirus were detected in 90% of samples. Such ubiquitous contamination of samples with RNA from all these viruses is unlikely as we do not work with these viruses in our lab; 2) contamination with viruses being the cause of differential expression of viral miRNAs would require systematic addition of miRNAs from each virus to either pre or post vaccination samples (for up- and downregulated viral miRNAs respectively), which we believe unlikely, given one would expect contamination to happen in a random or universal manner; 3) if contamination with viruses was the cause of differential expression of viral miRNAs, one would expect that all the miRNAs of that virus (detected on the array) would be upregulated/downregulated, yet this is not the case as only specific miRNAs are differentially expressed for each virus; 4) we were unable to detect viral miRNAs by RT-qPCR.
Only 3 (miR-30b, miR-142-3p, and miR-483-5p) out of 19 of the differentially expressed miRNAs, and two candidate endogenous control miRNAs (miR-29c and miR-197), could be detected by RT-qPCR. Some assays showed no detectable amplification (miR-575, miR-1202) or were at the limit of detection (hcmv-miR-UL70-3p, miR-638, miR-1207); the remaining assays had significant amounts of background above which miRNAs could not be detected (see supplementary S5 Table). Difficulties in validating results across multiple platforms are not unique to this study. A study by Mestdagh et al showed average concordance between any two platforms (microarray, next generation sequencing, RT-qPCR arrays) was 79.2% in terms of detecting the presence or absence of a miRNA and only 54.6% in terms of detecting differential expression [49]. Absence of detection of several miRNAs in the present study by RT-qPCR could be because those miRNAs were never expressed in the first place or because of PCR limitations, e.g. primers not being able to amplify the desired product, or the primers/PCR set up being too insensitive to detect the miRNAs because of their low abundance in serum (a previous paper notes detection of miR-575 in placentae by RT-qPCR, for example [50]).
Rather than PCR being the issue, the microarrays may be the issue. Several lines of circumstantial evidence suggest that the microarrays were falsely detecting some of the differentially expressed miRNAs: a) miRNAs which are unlikely to be present in children's serum, e.g. herpes simplex 2 miRNAs, were detected in all serum samples (S2 Fig), b) only 5 of the differentially expressed miRNAs are labelled as "high confidence of existence" in miRbase, c) many of the differentially expressed miRNAs are absent in the FANTOM5 sequencing database, a human miRNA expression atlas which contains the miRNA expression profile of 121 different cell types, and d) two different primer technologies failed to detect 3 out of 6 differentially expressed miRNAs and candidate reference genes (S6 Table) [5,51,52]. The fluorescence signals of the probes for the miRNAs that could not be detected by RT-qPCR were too high to suggest that they represented background fluorescence (S3 Fig). It is more likely that their signal arose through cross-hybridisation with other nucleic acids. Given miRNAs are around 22 nucleotides, median probe length on the Agilent miRNA microarray used in this study (16-mer, IQR: 14-mer, 18-mer) is relatively short compared with mRNA microarrays and thus could contribute to lack of specificity. The hairpin-based nature of the probes on the miRNA microarray used in this study does attempt to reduce issues of non-specific binding but may not be sufficient, especially for biofluid samples where miRNA content is close to the limit of detection. Only one microarray platform was used in this study, thus the generalisation of our finding to other miRNA array-based technologies is debatable; nevertheless we would encourage future oligonucleotide hybridisation-based microarray studies, regardless of the microarray technology, to interpret their results with caution in the absence of a second validation method.
RT-qPCR could confirm differential expression in one out of the three miRNAs that could be detected. This suggests that microarrays can provide some reliable data. Despite this, the significant downregulation of miR-142-3p could not be replicated in an independent validation cohort. A challenge for validating differential expression in the independent cohort is that it is dependent on miR-29c being an effective reference gene in the validation cohort, but as Fig 8 shows, the expression of miR-29c in the validation cohort does not appear to be the reason for the lack of validation.
Lack of validation does not appear to be due to inadequate study power because: 1) the validation cohort was over 90% powered to detect significant downregulation of miR-142-3p at the same fold-change as seen in the discovery cohort and 2) average fold-change was greater than 1 for miR-142-3p post vaccination, which is the opposite direction to the discovery cohort where fold-change was less than 1.
Differences in the findings of the discovery and validation cohorts are unlikely to be due to differences in the demographics of the two cohorts as there was no obvious interaction between miR-30b and miR-142-3p expression and age, sex or vaccine administered, and the population characteristics of the two cohorts were similar (S3 and S4 Tables). It is possible that, despite trying to control the type 1 error rate by using an FDR adjusted p-value, chance alone identified differential expression in the discovery cohort. Another possibility is the "winner's curse" phenomenon which has been described in genetic association studies, and can lead to an overestimated effect size amongst differentially expressed genes, meaning that to achieve enough power, a subsequent validation cohort must be larger than the size which is calculated by a standard power calculation [53]. This study reiterates the importance of confirming results in a second independent cohort.
Finally, we note that investigating miRNA expression 3 weeks post 2nd vaccination may only identify circulating miRNA changes associated with memory responses. We were limited to this timepoint, however, as this was the only post vaccination timepoint when serum was collected in the trial. An earlier timepoint may have captured changes in the expression of miRNAs related to the early immune response, and this would be interesting to look at in future studies.

Conclusion
In conclusion, this is the first study to profile serum miRNA expression post vaccination. This study contributes to the miRNAome expression profiling data in healthy children which may be useful to others trying to identify appropriate endogenous reference miRNAs. The microarray data support the concept that circulating miRNA expression is affected by vaccination but corroboration of these results by RT-qPCR was not shown for many miRNAs. When RT-qPCR could be optimised, the microarray findings were corroborated for 1 out of 3 miRNAs using RNA samples from the same cohort but these findings could not be replicated in an independent cohort. This study therefore underlines the need for a rigorous approach to the analysis and interpretation of probe-based miRNA microarrays that, as highlighted by this study, should include cross platform validation and replication in a validation cohort. Based on our study we conclude that findings from studies in which this has not been done should be interpreted with caution. Alternative methods such as next generation sequencing or RT-qPCR arrays may provide a more robust way of investigating whether vaccination affects miRNA expression.
MiRNAs are important mediators of immunity, thus investigating associations between circulating miRNAs is a worthwhile pursuit which could bring biological insights and identify new surrogates of protection, but as this study shows, great care should be taken when interpreting omics biomarker discovery to ensure robust, reproducible conclusions.

Supporting information
S1 Table. Details of tested assays. Company assay IDs are displayed in columns 2 and 3. Assays that met criteria for acceptable amplification of target product are highlighted in bold. (DOCX)
S2 Table. MiRNAs which were differentially expressed after vaccination, and evidence of their existence. Of the 15 differentially expressed Homo sapiens miRNAs, only 4 are convincingly expressed in the FANTOM5 database. Ten out of 12 of the differentially expressed miRNAs could not be detected by LNA primers when tested. Three of the differentially expressed miRNAs were assayed for using two primer technologies, of which only one could be detected.
[...] There is some overlap between the pre- and post-vaccine sample clusters; nevertheless, there is a tendency for pre-vaccine samples to cluster to lower left, and post-vaccine samples to cluster to upper right. This suggests that some but not all variation in global microRNA expression is accounted for by vaccine status. The plot shows that even without adjustment for pairing in the data, pre and post vaccine samples cluster somewhat separately.
[...] The plot shows that the majority of miRNAs that were differentially expressed/selected as candidate endogenous reference miRNAs were relatively well expressed compared with the lower limit of detection. (TIF)
S1 Results. Sample size estimate for validation cohort based on estimates derived from the discovery cohort. (DOCX)
S1 Dataset. RT-qPCR data for discovery and validation cohorts. (XLSX)