LPEseq: Local-Pooled-Error Test for RNA Sequencing Experiments with a Small Number of Replicates

RNA-Sequencing (RNA-Seq) provides valuable information for characterizing the molecular nature of the cells, in particular, identification of differentially expressed transcripts on a genome-wide scale. Unfortunately, cost and limited specimen availability often lead to studies with small sample sizes, and hypothesis testing on differential expression between classes with a small number of samples is generally limited. The problem is especially challenging when only one sample per each class exists. In this case, only a few methods among many that have been developed are applicable for identifying differentially expressed transcripts. Thus, the aim of this study was to develop a method able to accurately test differential expression with a limited number of samples, in particular non-replicated samples. We propose a local-pooled-error method for RNA-Seq data (LPEseq) to account for non-replicated samples in the analysis of differential expression. Our LPEseq method extends the existing LPE method, which was proposed for microarray data, to allow examination of non-replicated RNA-Seq experiments. We demonstrated the validity of the LPEseq method using both real and simulated datasets. By comparing the results obtained using the LPEseq method with those obtained from other methods, we found that the LPEseq method outperformed the others for non-replicated datasets, and showed a similar performance with replicated samples; LPEseq consistently showed high true discovery rate while not increasing the rate of false positives regardless of the number of samples. Our proposed LPEseq method can be effectively used to conduct differential expression analysis as a preliminary design step or for investigation of a rare specimen, for which a limited number of samples is available.


Introduction
High-throughput sequencing of cDNA derived from an RNA sample, known as RNA-Seq, has recently been developed and applied to various studies depending on the scientific interests such as detecting fusion genes, transcribed single nucleotide polymorphisms (SNPs), and differential expression (hereafter, DE) [1,2]. In particular, profiling gene expression and testing DE between classes have been the primary process of many biological studies.
The main purpose of DE analysis is to identify transcripts that changed significantly in abundance across experimental conditions. This goal has been achieved using different statistical methods for data from array-based technologies. Compared to microarray, however, RNA-Seq has different characteristics such as high dynamic range and low background expression level. In order to address those properties, various methods have been proposed using Poisson and negative binomial distributions [3][4][5].
The software edgeR assumes that the mean and variance are related with a single proportionality constant that is the same throughout the experiments; thus, only one parameter needs to be estimated for each transcript [4]. Instead, DESeq decomposes the variance into two terms, a shot-noise term and a raw-variance term [5]. By assuming that the raw variance of each transcript is a function of the expectation value of a transcript's concentration and condition, DESeq extends the model proposed by edgeR. NBPSeq uses an over-parameterized version of the negative binomial called "NBP distribution", which incorporates a non-constant dispersion parameter directly within a parametric family [6]. All these three methods are based on similar principles, i.e., explicit modeling of the counts using a negative binomial distribution.
Comprehensive reviews on pre-existing methods for DE analysis with both real and simulation datasets has been recently published. By examining 11 methods including edgeR [4], DESeq [5], and NBPSeq [6], it was reported that all methods performed well with large sample sizes, while most of them showed weakness in DE analysis with small sample sizes [7]. A similar result was obtained in Seyednasrollah et al's paper; unlike the study with large sample size, the choice of the method becomes critical when the number of samples is small [8]. With small sample size (two or three samples) per each class, the review papers reported that the methods based on negative binomial modelling, such as edgeR and DESeq, displayed relatively better performance [7][8][9].
Despite the decreasing sequencing costs, RNA-Seq experiments remain expensive to allow extensive biological replications. Moreover, limited specimen availability often leads to studies with a small number of (sometimes none of) replicates. Unfortunately, many of the existing methods have been largely unsuccessful in DE analysis with a small number of samples, and few researches have correctly addressed the problem arisen from non-replicated samples per each class. Therefore, there remains the need for an effective method that could be applicable to DE analysis with small samples or with one sample per each class.
Here, we proposed a method, named local-pooled-error with RNA-Sequencing data (LPEseq), which proposes to analyze DE detection of RNA-Seq experiments with a small number of or non-replicated samples in each class. Based on the LPE method [10], we extended the protocol to RNA-Seq data with a non-replicated set by introducing additional processes. Then, we compared the results obtained from the proposed method with those from several other techniques using both real and simulated datasets. Because it was reported that the performance of edgeR, DESeq, and NBPSeq with a small number of replicates is slightly better than that of other methods [7], these methods were chosen to compare the results with those obtained using our proposed method for replicated data analysis. Since DESeq was updated to DESeq2 recently, DESeq2 was also included in comparison [11]. However, for non-replicated data sets, only edgeR and DESeq are applicable and, thus, the comparison was made among edgeR, DESeq, and our proposed method for non-replicated data analysis. We found that LPEseq generally shows similar performance to other methods in the presence of replicates but performs better for non-replicated data sets.

Materials and Methods
A brief review of the LPE method Jain et al. developed the original LPE method, which pools the error in each local intensity bin and shrinks each error variance estimate toward the mean of other probes (or genes) with similar intensities, for microarray experiments in which gene expression intensity is continuous [10]. The method first evaluates the baseline error distribution for each of the compared experimental conditions, say class X and Y, respectively. For duplicated arrays (subscript with 1 or 2) in each class, for instance in class X, the mean ðA ¼ x 1 þx 2 2 Þ and difference (M = ±|x 1 −x 2 |) values between arrays are evaluated first. Then the variance of M on predetermined quantiles of A is evaluated and a cubic smoothing spline is fit to the variance estimates on the quantiles. The baseline error distribution for class Y is derived using the same way. The test statistic of the LPE method is calculated as follows: where Med(.) denotes median and,ŝ n X and n Y are the number of replicates in the two array samples being compared;ŝ 2 X andŝ 2 Y are the estimates of variance of the genes in class X and Y, respectively; and τ is a scaling factor [10].
The LPE method assumes that genes with similar observed intensities have similar expression variances. Based on this assumption, it estimates the gene-specific variance from a calibration curve derived from pooling the variance estimates of replicated expression differences of genes within similar expression intensities. Since the method is based on calculating the error variance from replicated experiments in the same class, it cannot be directly applied to experiments with no biological replicates in one or both classes. Furthermore, the method is suited for the analysis of continuous probe intensities on microarray platforms, unlike the read counts measured on sequencing technology.

The proposed method: LPEseq
In this study, we improved the LPE method by focusing on two refinement aspects: applicability both to count measurement in RNA-Seq and to experiments with no replicates in each class. We describe here how we addressed these two aspects into the model. Simply, for the former case, the count variable is treated as a continuous variable by performing normalization and log-transformation. For the latter case, we treat two non-replicated samples as if they were replicates and remove the outliers so to reduce the impact on the LPE estimation. For RNA-Seq experiments with replicates, the proposed method conducts the similar procedure, i.e., estimating the variance of M from replicates in each class X and Y, as was described in the previous section.
To be more specific, we focused our attention on the problem of inferring DE between two different classes. Let x 0 ij and y 0 ij represent the number of read counts mapped to a specific transcript (or gene) i in the j th sample (or replicate) from the experimental condition or class X and Y, respectively. Since x 0 ij and y 0 ij are influenced by the sequencing depth, these values are not directly comparable. Instead, the relative abundance of the transcripts across the samples can be normalized. Each transcript's read count was divided by the total number of read counts for that experiment and was log-transformed, i.e., Without replicates, only a single measurement in each class is available; thus, the subscript j can be dropped without ambiguity. Let x i and y i denote the normalized log-transformed count values of the transcript i under two different classes X and Y, respectively. The baseline error distribution was obtained by regarding each sample in different classes as replicates. Under this additional assumption, the M and A values were evaluated using two samples (each from different classes)., i.e., M = ±|y−x| and A ¼ xþy 2 . Then s 2 M ðkÞ , the variance of M of the genes on the predetermined k th bin of A, was calculated in the same manner as for replicate analysis. In this case, however, since the variance of M is not drawn from the same class, differentially abundant transcripts could act as outliers and adversely affect proper evaluation of the LPE per each class.
Here, we give an example to understand the problem. Suppose we have a dataset consists of n transcripts duplicated in two different groups, say X and Y. Three transcripts among n transcripts are differentially expressed as shown in Fig 1B and 1C (DE transcripts are colored in red in the figure). With replicate information, the LPEs are drawn from each class separately as described the previous section (Fig 1B), whereas those with non-replicated experiments have to be evaluated using two single observations from two different classes (Fig 1C). When evaluated with non-replicated data, the LPEs of the three DE transcripts (colored in red in the Fig 1B and  1C) show a clear distinction compared to those with replicates and, thus, the fitted variance curve might be biased (red dashed curve in Fig 1C) due to these apparent outliers. By denoting an outlier in each quantile of a given data, we can remove outlying observations and recalibrate an LPE curve less affected by outliers (orange solid curve in Fig 1C).
For removing outliers, both formal (or tests of discordancy) tests and informal (outlier labeling) methods can be used. Formal tests based on Z-score , where sd denotes the standard deviation of data] usually assume a well-behaving distribution, and test if the target extreme observation is an outlier of that distribution. It is obvious that these tests largely depend on the type of observations and distribution assumed [12]. On the other hand, informal methods usually generate an interval or criterion for outlier detection instead of hypothesis testing. Therefore, any observations beyond this criterion could be considered as an outlier. LPEseq provides various strategies for detecting outliers and the criterion of fold difference between classes was used as the default one.
We summarize the detailed procedure of the LPE analysis without replicates as follows: 1. Calculate the mean intensity of transcript i in two conditions [i.e., (x i + y i )/2].
2. Calculate quantiles (percentiles by default) with the mean intensities of the whole transcripts evaluated at step I, and define "intensity bins" using adjacent quantile values. Thus the width of the bin depends on these quantile values and each bins have different widths but the same number of transcripts. Note that 100 bins are generated by default and the number of bins can be adjusted manually.
3. Place all the transcripts in the bin to which their mean intensities belong.
4. Label the transcript as an outlier if the difference between classes is larger than the threshold (Default value is 1.2, please see the S1 File and S1 Fig).
5. Remove the outliers labeled in step IV and, then, evaluate the variance of M for each bin, where i (k) , n (k) andŝ 2 M ðkÞ denote the genes, the total number of genes, and the variance of M on the k th bin, respectively.
6. Generate the LPE curve by fitting a cubic smoothing spline to the variance along with the bins. This makes the LPE as a function of transcript abundance level.
7. Use the LPE function drawn in step VI to estimate the transcript-specific variance by plugging in the value of each transcript.
Once the variance in each experimental condition is derived, testing a hypothesis on differential expression is similarly conducted as for the analysis with replicates. Note that it is difficult to find the optimal number of bins which might affect the result of the DE test. However, the number of DEGs does not vary much with varying number of bins (S2 Fig). In addition to equal frequency interval binning (using percentiles), we also tried equal spaced interval binning. Although the results did not differ much, the equal spaced interval seemed to be very sensitive to the size of intervals (data not shown).

Real RNA-Seq data sets
Four real datasets, which had been preprocessed and distributed by Recount, were examined [13]. We summarized the characteristics of each dataset below.
1. Sultan's dataset (two replicates in each condition): Sultan et al. [14] performed RNA-Seq experiments in human embryonic kidney and B cell line. With each sample, two replicates were generated, and we analyzed the DE between the two conditions. The dataset provides a case in which a small number of replicates are available.
2. Hammer's dataset (single replicate in each condition): Hammer et al. [15] studied rats with chronic neuropathic pain induced by spinal nerve ligation (SNL) in serial experiments using RNA-Seq. The RNA of rats with SNL for 2 weeks and 2 months was sequenced and compared with that of controls. We performed DE analysis between the SNL and control groups for 2 weeks. The dataset provides a case in which only single replicates are available in each condition.
3. MAQC dataset (7 replicates in each condition): Bullard et al. [16] considered two types of biological samples: Ambion's human brain reference RNA and Stratagene's human universal reference RNA, referred to as Brain and UHR, respectively. Since the dataset has seven replicates for each condition, the dataset provides a situation in which a large number of replicates are available. Also, the dataset was used to measure how much the result with 7 replicates is reproduced by the analysis with different numbers of replicates.

Simulation study
Parameter estimation from a real dataset. The purpose of the simulation study was to investigate the ability of DE detection under varying effects of four different factors: effect size (counts difference), DE portion (number), dispersion, and number of replicates in each class. To generate the simulation datasets, we adopted the same assumption that has been made in many other studies [4,5,7,18,19], i.e., the abundance of transcript i, denoted by X i , follows negative binomial distribution, NB(μ i ,ϕ i ). Here, μ i and ϕ i are the mean and dispersion parameters, respectively.
We estimated (μ i ,ϕ i ) from the Montgomery et al.'s real data set, one of the largest sample datasets including 60 unrelated normal Caucasian individuals [17]. The total number of transcripts of this data is 52580 and the number of sample, denoted as N, is 60. The log-likelihood function for 60 independent observations for each transcript i from a negative binomial distribution, given the counts , a similar form in the supplementary material of the Soneson et al's paper [7]. The maximum likelihood estimate (MLE) of μ i was first obtained for each transcript across 60 samples; then, ϕ i was estimated by numerically maximizing the log-likelihood function using the simulated annealing algorithm [20]. Simulated dataset generation. Based on these estimates, we generated simulation datasets as follows. First, we generated 20,000 transcripts in total. Among them 70% were generated from the zero-inflated Poisson distribution with mean 1.25 and zero probability of 0.9. Then, μ i for the remaining 30% were randomly selected from the values estimated from the real data. Then, for each transcript i, we generated transcript counts, x ij and y ij from NB(μ i ,ϕ i = ϕ) in each class. For DE transcripts, we randomly selected κ% of the total transcripts and added the effect size δ in one of the two classes. In our simulation scenario, we varied the value of κ, δ, ϕ, and the number of replicates per each class as stated below.
• Case I. Different effect size: (500, 1000, and 5000): We varied the effect size ranging from 500 to 5000 in counts, which roughly corresponded to the range of mean and maximum count difference between the two classes in the real dataset with total read counts similar to those of the simulated data (data not shown).  All the cases were repeated 100 times.

The R package of LPEseq
We implemented the method as a package for the R environment and named it "LPEseq." It is freely downloadable from our website (http://bibs.snu.ac.kr/software/LPEseq) or Bioconductor. LPEseq uses any kind of count table format as the input file. The package was designed to perform the analysis with only a single command, allowing researchers who are unfamiliar with the R language to use it easily. All the analyses in this work were performed with the package LPEseq. In addition, the whole analysis code used in this work and the LPEseq manual can be found in our website.

Comparison with other methods
We compared LPEseq with the pre-existing methods edgeR [4,21], DESeq [5], DESeq2 [11], and NBPSeq [6] with respect to true positive rate (TPR) and FDR. In case of the studies with one sample per each class, DESeq2 and NBPseq were excluded for comparison because they are not applicable. The Benjamini-Hochberg procedure was used to adjust multiple testing problem [22]. Transcripts were reported as DE at an adjusted p-value threshold of 0.05. We compared TPR and FDR defined as the number of true DE transcripts detected divided by the total number of true DE transcripts and the number of false DE transcripts detected divided by the number of DE transcripts detected, respectively. In case of replicated data analysis, we ran the programs using the setting provided in the supplementary material of a review article [7] and the reproducible code provided in the DESeq2 [11]. For the non-replicated data analysis, we used the options recommended by manuals of edgeR and DESeq.

Gene-set analysis
We used DAVID bioinformatics database [23,24] for gene set analysis. We uploaded a list of DEGs and performed analysis with gene sets of Gene Ontology terms, KEGG pathways, and Chromosomes. Significantly enriched gene sets were listed based on FDR < 10%.

Analysis of the simulation data
We investigated the performance of LPEseq regarding the true DE detection ahead of false discovery under 72 combinatorial cases of four different parameters, δ, ϕ, κ, and m (i.e., effect size, dispersion, DE proportion, number of replicates, respectively). The primary purpose of developing LPEseq was to apply it to data with non-replicated samples in each class. In such a case, among the methods under consideration, edgeR and DESeq are applicable. Thus, we compared the results obtained from LPEseq, edgeR and DESeq. The overall performances of the three methods with one replicate in each class are reported in Table 1. In the majority of cases, DESeq produced an error message of "parametric dispersion fit error", suggesting the use of "local" fit option for the dispersion estimation by the package. Using the "local" fit option solved the problem but resulted in a loss of power where none of true DE transcript was detected, thus generating NAs for FDR except a few cases. However, the performance of edgeR and LPEseq was dramatic. In all cases, LPEseq showed less than 5% FDR and more than 92% TPR. Even though edgeR performed superior than LPEseq in aspect of finding true DE transcripts, showing more than 97% TPR, FDR was not well-controlled. The failure of edgeR to correctly find the DE transcripts might lead to suspicious interpretation of the outcomes. By comparing the results in Table 1, it is obvious that a significant improvement was obtained by applying LPEseq to datasets with non-replicated samples.
In this study, we sought to establish a method applicable to a small number of replicates, not just one in each class. Table 2 summarizes the results of five different methods with three replicates in each class. As can be seen, when replicated samples were used, no method is clearly superior under all conditions and all methods gave satisfactory results. It can be observed from the table that the performance of LPEseq, i.e.,<5% FDR and TPR quantitatively similar to that of other methods in all cases, is consistent with that of other methods. However, it should be noted that, in the NBPSeq method, in some cases the FDR was >5%. Note that we have also investigated the situations of different numbers (2, 5, and 10) of replicates and with a larger number (4,000) of DEGs. Similar results as described above were observed with these simulation settings (data not shown).
Here, we considered testing between two classes in RNA-Seq experiments assuming that transcripts (or genes) are independent from each other, which might be unlikely in many real datasets. We also have investigated a situation including correlated transcripts with simulated datasets and the results did not differ much in DE analysis (S3 Fig). Since the methods repeat a

Analysis with real data
We used LPEseq to analyze four different RNA-Seq data sets: non-replicates (Hammer's dataset), small replicates (Sultan's dataset), large replicates (MAQC dataset), and large normal sample with unequal group sizes (Montgomery's dataset). The primary purpose of the analysis was to discover DE transcripts and to compare the results obtained using competing methods. First, we compared the number of DE transcripts found by each method (Fig 2). For the Hammer's dataset, which has only one sample in each class, LPEseq, edgeR and DESeq were applicable. Hammer et al. reported that about 2,000 transcripts were DE and, among them, validated 755 transcripts using quantitative polymerase chain reaction (qPCR) [15]. Interestingly,  (Fig 2A).
As shown in Fig 2A, 357 genes were additionally called as DEGs by LPEseq. To identify their roles, we performed gene set analysis (GSA). We used gene ontology (GO) terms and KEGG pathways as gene sets and found the terms such as "defence and inflammatory responses", "sensory perception of pain" and neuroactive ligand-receptor interaction" enriched in higher ranks (< 10% FDR). These terms are strongly related to chronic neuropathic pain induced by SNL, which was the interest of Hammer et al's paper [15]. GSA with a list of genes only found by edgeR, however, failed to provide relevant information (see also S2 File for a whole list of DEGs and GSA results).
The Sultan's dataset consists of two different cell types, a kidney and a B-cell, and it would be expected that for such a comparison a large number of genes are differentially expressed. As can been seen in Fig 2B, more DEGs compared to Hammer's dataset were identified by all methods and many DEGs found by each method were overlapped each other (Fig 2B). The number of DE detected by LPEseq was 2,177, which was the minimum number of DE, while over 3,000 DE were detected by other methods except DESeq2. We noted that the almost DE transcripts found by LPEseq and DESeq2 overlapped with those identified by other methods. However, the other methods included non-overlapped DE transcripts, whose enriched gene functions have no explicit relevance to the comparison of the study (Fig 2B).
To observe characteristics of transcripts uniquely found by each method, we plotted the group mean differences in raw counts using DE transcripts found in each dataset (density plots in Fig 2). As shown in density plot in Fig 2, strong peaks were observed at zero in all competing methods, indicating that there were no considerable mean differences between classes. In fact, these DE transcripts included zero raw counts in both classes, demonstrating that many of these non-overlapping transcripts found were artifacts. For LPEseq, however, the mean difference between classes exhibited a bimodal peak distribution, indicating less likely the DE transcripts found by our proposed method being artifacts.
We also investigated MAQC dataset (comparing brain versus universal reference cell with 7 replicates per each condition) and the similar observations were made. Except edgeR, most methods identified uniquely found by each method and those uniquely identified genes with very small mean difference between conditions showed no enriched term by GSA. Only genes identified by LPEseq were enriched in transcriptional regulation function (See S4 Fig), which is highly activated in brain [25]. By studying real data, it is difficult to be conclusive that LPEseq performs better than other competing methods, but it limits the possibility of producing false calls and the result is robust to the number of replicates in each class (See also S5 and S6 Figs).
To make inferences based upon large as well as small samples through the LPEseq, we analyzed Montgomery's dataset [17], which consists of 33 normal females and 27 normal males. We first compared the LPE curves evaluated with total samples and two randomly selected samples using the Kolmogorov-Smirnov (KS) test. More specifically, we estimated the variance curve conditioned on the expression levels of the genes using the total samples and we randomly selected one sample from each group and estimate the variance using LPEseq. Then we compared the variance curves by performing a KS test. By repeating 10 times, we obtained 10 p-values from a two-sample KS test and none of them rejected the null hypothesis (S6 Fig), showing the variance estimates from LPEseq method are concordant with the 'true' variance estimated from the total samples. We also conducted DE analysis with sex as the group indicator and interpreted the functions of the DEGs using GSA. Since the comparison was made between males and females, it might be reasonable to expect that the genes in sex-chromosomes (X and Y) are more likely to be differentially expressed. As can be seen in the Table 3, genes in Y chromosome were enriched in the list of DEGs identified by each methods, except NBPSeq. Also most of the genes in sex chromosomes were top ranked in the list sorted by increasing p-value (S1 Table). Taken all together, the results show that the LPEseq robustly performs regardless of the sample size and the performance with large sample sizes is comparable to other competing methods.

Conclusions
We proposed a method for DE analysis with a small number of replicates, especially when a single replicate in each class is available. By extending the LPE method, the proposed method is applicable to the RNA-Seq experiments either with or without replicates per each class. Even though the proposed LPEseq followed the idea of the original LPE method, it is different from that in two aspects: (i) estimating local-pooled variance from different classes assuming them as replicates and (ii) removing outliers derived from the replicates assumption between classes. These two differences make DE analyses with non-replicated datasets feasible. By adding an auxiliary step, our proposed method performed more robustly compared to existing methods regardless of the number of replicates in each class, even when only one replicate in each class was available.
In general, the performance of DE test for RNA-Seq data can be influenced by sequencing depth and the number of replications. By increasing the sequencing depth, it is expected that the accuracy of expression counts will increase. With higher accuracy, we expect that LPEseq could better estimate the variances, resulting in improved performance. However, it was reported that the number of sample replicates is a more significant factor in accurate identification of DE genes in comparison to the sequencing depth [9,26]. A further detailed investigation of the effect of sequencing depth on the performance of DE test is desirable.
We considered the reproducibility score which represents how much the result made with replicated data is reproduced by the analysis with non-replicated or a smaller number of replicated data in our model development. Using this score, the threshold value for detecting outliers was suggested. By comparing the reproducibility of five methods, LPEseq showed the largest number of DEGs and the highest overlaps between the analyses with subset of samples and total samples, indicating the robust reproducibility regardless of the number of samples (S4 and S5 Figs).
It is worth noting that, even though LPEseq takes the reproducibility into account to the statistical testing without replicates, the scope of any conclusions drawn from it may be limited and need to be interpreted with extra cautions. For example, the outliers flagged in LPEseq may not really be outliers. They can be treated as outliers due to the mean difference between two groups, and this might affect the DE result. We have investigated this effect with MAQC dataset as follows. For MAQC data, there are 7 replicates in each condition. We analyzed all possible combinations of choosing 2 replicates from 7 in the same condition. We performed DE tests as if the two replicates had been obtained from different conditions. The average number of DE found by LPEseq was 5.095 (SD was 5.991) from 52580 transcripts. This analysis shows that the LPEseq method controls the false discovery rate when there are no outliers. Through the analysis with non-replicated data, we believe that the proposed method can be useful to obtain results that are comparable with those obtained from data containing replicates.
Note that the log-transformed count data such as negative binomial data would be an approximation to normal distributed data [27]. Although our LPEseq method treats the logtransformed count data as continuous ones, we believe that the LPE method taking into account the discrete random variables from the NB distribution would have advantage. This extension will be our next research topic.
The proposed idea can be easily extended to more than two comparison problems. For example, more complicated models such as analysis of variance (ANOVA), the "local-poolederror" and replicates assumption between classes can be applied to the ANOVA problems, with the exceptions that the residual mean sum of squares and residual degrees of freedom are evaluated using observations without outliers in a local bin. The method can also be applicable to test differential expression in the context of high-throughput experiments such as measuring protein expression or DNA methylation.

Author Contributions
Conceived and designed the experiments: TP JG.