speaq 2.0: A complete workflow for high-throughput 1D NMR spectra processing and quantification

doi:10.1371/journal.pcbi.1006018

Table 1.

An overview of open source NMR data processing solutions.

Fig 1.

Possible workflows of speaq 2.0.

The newly presented methods can be used standalone (full black arrows) or together with the CluPA alignment algorithm and BW quantification method that were made available in the first speaq implementation (v1.0 to 1.2.3) [15] (dashed arrows). It is still possible to perform an analysis based on raw spectra alone, as in the classic speaq (v1.0 to 1.2.3) analysis. With the new methods, raw data is converted to peaks, and every peak is summarized by its ppm location, width, intensity and SNR. These peaks are subsequently grouped and optionally peak filled (peaks missed in some samples are specifically searched for). The resulting data is converted to a feature matrix that contains an intensity for each peak and sample combination. This matrix can then be used in statistical analysis with built-in or external methods.
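As a rough illustration of the last step, the sketch below turns a list of already grouped peaks into a sample-by-peak-group intensity matrix. It is not the speaq API (speaq is an R package). The function name `build_feature_matrix`, the tuple layout, and the example values are assumptions made for this sketch, and filling absent peaks with 0.0 is a simplification that does not reproduce the peak filling step.

```python
# Hypothetical grouped peaks as (sample, group ppm, intensity) tuples.
# All values are invented for illustration only.
grouped_peaks = [
    ("s1", 1.33, 10.0),
    ("s2", 1.33, 12.5),
    ("s1", 3.05, 4.0),
    ("s3", 3.05, 6.0),
]


def build_feature_matrix(peaks, fill=0.0):
    """Return (samples, groups, matrix): one row per sample, one column per peak group."""
    samples = sorted({sample for sample, _, _ in peaks})
    groups = sorted({group for _, group, _ in peaks})
    # Map each (sample, group) pair to its intensity for fast lookup.
    lookup = {(sample, group): value for sample, group, value in peaks}
    # Peaks that were not detected in a sample receive the fill value.
    matrix = [[lookup.get((s, g), fill) for g in groups] for s in samples]
    return samples, groups, matrix


samples, groups, matrix = build_feature_matrix(grouped_peaks)
for sample, row in zip(samples, matrix):
    print(sample, row)
```

Each row can then be passed to any statistical routine that expects a samples-by-features table.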

Fig 2.

Grouping algorithm pseudocode.

Fig 3.

Visualization of Bonferroni corrected p-values.

Numerous features have a corrected p-value below the significance threshold of 0.05, indicating a significant difference between red and white wine. An example of a significant feature (indicated with the red diamond) is shown on the right with its raw spectra (top), the data after peak detection (middle) and the data after grouping (bottom).
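For reference, the Bonferroni correction multiplies each raw p-value by the number of tests and caps the result at 1. The sketch below shows this with invented p-values. The function name `bonferroni` is an assumption for this example and not part of speaq.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Invented raw p-values, one per feature.
raw = [0.001, 0.02, 0.04, 0.5]
adjusted = bonferroni(raw)

# Indices of features that remain below the 0.05 threshold after correction.
significant = [i for i, p in enumerate(adjusted) if p < 0.05]
print(adjusted, significant)
```

With many features the correction becomes strict, so only features with very small raw p-values stay below the threshold.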

Fig 4.

Raw spectral alignment methods and peak based grouping methods perform equally.

When the peak shifts between samples (caused, for example, by pH differences) are smaller than the distance between adjacent peaks, all methods perform as expected. The raw spectra based methods (CluPA from speaq v1.0 to 1.2.3 and icoshift) mitigate the differences in peak shifts, and the peak based method groups the peaks accordingly.

Fig 5.

Performance comparison workflow.

The default way of processing 1D NMR spectra is illustrated on the left. The case vs. control spectra are aligned and are then binned to produce features which can be used in statistical analysis. Note that the spectral alignment step can be skipped as the binning approach can correct for small shifts. This default processing approach is compared to our method shown on the right. The aim of both methods is to point the user to the peaks/intervals that discriminate between the two groups.
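To make the binning step concrete, the sketch below sums intensities into fixed-width ppm bins and shows that a small peak shift between two samples lands in the same bin. The function name `bin_spectrum`, the bin width of 0.25 ppm, and the data points are assumptions chosen for illustration. They are not taken from the paper.

```python
def bin_spectrum(ppm, intensity, width):
    """Sum intensities into fixed-width ppm bins, keyed by integer bin index."""
    binned = {}
    for x, y in zip(ppm, intensity):
        index = int(x // width)  # bin that contains this ppm value
        binned[index] = binned.get(index, 0.0) + y
    return binned


# Two invented samples: the first peak is shifted by 0.02 ppm in sample B.
sample_a = bin_spectrum([1.10, 2.60], [5.0, 3.0], width=0.25)
sample_b = bin_spectrum([1.12, 2.60], [5.0, 3.0], width=0.25)
print(sample_a == sample_b)
```

A shift that crosses a bin boundary would instead split the same peak over two features, which is one reason bin width matters.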

Fig 6.

Performance comparison with ROC and P-R curves on a simulated dataset.

Binning raw unaligned spectra results in the worst performance. The two alignment tools (CluPA and icoshift) show an increase in performance compared to no alignment but are still hampered by the binning step. The new speaq 2.0 workflow has the highest performance on the ROC and P-R curve.

Fig 7.

PCA analysis of onion mice data.

The onion mice data matrix is Pareto scaled and centered. The PCA results show no clear trend that follows the onion intake percentage, which matches the results of Winning et al. [37].
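Pareto scaling centers each feature on its mean and divides by the square root of its standard deviation. The sketch below applies this column by column to a small invented matrix. The function name `pareto_scale`, the use of the sample standard deviation, and the handling of constant columns are assumptions of this example.

```python
import math
import statistics


def pareto_scale(matrix):
    """Center each column and divide it by the square root of its standard deviation."""
    scaled_columns = []
    for column in zip(*matrix):
        mean = statistics.mean(column)
        sd = statistics.stdev(column)  # sample standard deviation (n - 1)
        # Assumption: constant columns are only centered, to avoid dividing by zero.
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - mean) / divisor for value in column])
    # Transpose back to one row per sample.
    return [list(row) for row in zip(*scaled_columns)]


# Invented data: three samples (rows) by two features (columns).
data = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
scaled = pareto_scale(data)
print(scaled)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of large peaks while keeping more of the original intensity structure.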

Fig 8.

Differential analysis results of onion intake in mice data.

(Left) the Bonferroni corrected p-values for the features resulting from the differential analysis and (right) one of the features with a significant p-value (indicated with the blue diamond on the left image): (top) raw spectra, (middle) data after peak detection and (bottom) data after grouping.

Fig 9.

Correlation analysis of significant peaks.

The significant peaks, which are indicated by their peakIndex value, are clustered based on their Pearson correlation. The group of four peaks corresponds to the 3-hydroxyphenylacetic acid biomarker, and peak no. 19723 corresponds to the dimethyl sulfone biomarker. Both biomarkers are also identified in the original analysis paper [37], but with only one peak for the first biomarker.

Fig 10.

Workflow for comparing the results of MetaboAnalyst with speaq 2.0.

Table 2.

Comparison between MetaboAnalyst and speaq 2.0 for the onion intake in mice dataset.
