Figure 1.
Illustration of the Benjamini-Hochberg procedure.
In this example, the number of hypotheses is 10 and the false discovery proportion is 0.2. The largest index of a hypothesis that lies below the line is 6. Therefore, the first six hypotheses are rejected and the corresponding candidates are taken as the predicted peaks.
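The selection rule illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal sketch of the standard Benjamini-Hochberg step-up rule, not the authors' implementation; the function name `benjamini_hochberg` and the ten p-values are hypothetical and were chosen only so that the outcome matches the caption (10 hypotheses, level 0.2, six rejections). They are not the values plotted in the figure.

```python
def benjamini_hochberg(p_values, q):
    """Return the indices (in the original order) of the rejected hypotheses.

    p_values: list of p-values, one per hypothesis (candidate peak).
    q: target false discovery level, e.g. 0.2.
    """
    m = len(p_values)
    # Sort hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q,
    # i.e. the last sorted p-value lying on or below the B-H line.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return order[:k]


# Hypothetical p-values (illustrative only, not taken from the figure).
example_p = [0.001, 0.01, 0.03, 0.05, 0.09, 0.11, 0.30, 0.45, 0.60, 0.90]
rejected = benjamini_hochberg(example_p, q=0.2)
print(len(rejected))  # number of hypotheses rejected at level 0.2
```

With these illustrative values the B-H line takes the heights 0.02, 0.04, ..., 0.20, and the sixth sorted p-value (0.11) is the last one that does not exceed it (0.12), so six hypotheses are rejected.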
Table 1.
Comparison of the missing peak rate of the fixed number-based method and the Benjamini-Hochberg (B-H) algorithm with
on the 32 spectra of the eight proteins in the benchmark dataset as picked by WaVPeak.
Figure 2.
Original volume curves and the corresponding p-value curves.
(a) and (d): sorted volume curve (a) and the corresponding p-value curve (d) of peaks predicted by WaVPeak on the 2D 15N-HSQC spectrum of the protein ATC1776; (b) and (e): sorted volume curve (b) and the corresponding p-value curve (e) of peaks predicted by WaVPeak on the 3D HNCO spectrum of the protein VRAR; (c) and (f): sorted volume curve (c) and the corresponding p-value curve (f) of peaks predicted by WaVPeak on the 3D CBCA(CO)NH spectrum of the protein COILIN. In all panels, true peaks are shown in black and false ones are shown in cyan. In (d), (e) and (f), the decision boundaries of the fixed number-based method and the B-H procedure are shown in black and magenta, respectively.
Figure 3.
Original intensity curves and the corresponding p-value curves.
(a) and (d): sorted intensity curve (a) and the corresponding p-value curve (d) of peaks predicted by PICKY on the 2D 15N-HSQC spectrum of the protein TM1112; (b) and (e): sorted intensity curve (b) and the corresponding p-value curve (e) of peaks predicted by PICKY on the 3D HNCO spectrum of the protein COILIN; (c) and (f): sorted intensity curve (c) and the corresponding p-value curve (f) of peaks predicted by PICKY on the 3D CBCA(CO)NH spectrum of the protein RP3384. In all panels, true peaks are shown in black and false ones are shown in cyan. In (d), (e) and (f), the decision boundaries of the fixed number-based method and the B-H procedure are shown in black and magenta, respectively.
Table 2.
Comparison of the missing peak rate of the fixed number-based method and the Benjamini-Hochberg (B-H) algorithm with
on the 32 spectra of the eight proteins in the benchmark set picked by PICKY.
Table 3.
Comparison of the performance of different peak picking methods.
Figure 4.
Precision-recall curves for different peak picking methods and sensitivity analysis of B-H WaVPeak.
(a)–(e): precision-recall curves for different methods on 15N-HSQC, HNCO, HNCA, CBCA(CO)NH and HNCACB, respectively. The solid black curves are for the B-H consensus method; the dashed black curves are for the 1.5 consensus method; the solid cyan curves are for B-H WaVPeak; the dashed cyan curves are for the original WaVPeak; the solid magenta curves are for B-H PICKY; and the dashed magenta curves are for the original PICKY. The legends give the relative area under the curve (AUC), defined as the area under the curve divided by the total area over the region where recall is at least 0.7. (f): sensitivity analysis for different numbers of peaks. The precision and recall values of B-H WaVPeak are shown for several different numbers of top peaks used to calculate the p-values.
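The relative AUC reported in the legends of Figure 4 can be illustrated as follows. This is a minimal sketch under one reading of the caption, namely that the area under the precision-recall curve is restricted to recall values of at least 0.7 and divided by the area of that region (width 0.3, height 1). The function name `relative_auc`, the trapezoidal integration and the choice to drop points below the cutoff without interpolating at the boundary are all assumptions, not details taken from the paper.

```python
def relative_auc(recall, precision, min_recall=0.7):
    """Area under the PR curve for recall >= min_recall, as a fraction of
    the total area of that recall range, (1 - min_recall) * 1.

    Assumption: points with recall below min_recall are dropped and the
    curve is not interpolated at the cutoff.
    """
    # Keep only the points in the region of interest, ordered by recall.
    points = sorted((r, p) for r, p in zip(recall, precision) if r >= min_recall)
    if len(points) < 2:
        return 0.0
    # Trapezoidal rule over consecutive points.
    area = sum(
        (r1 - r0) * (p0 + p1) / 2.0
        for (r0, p0), (r1, p1) in zip(points, points[1:])
    )
    return area / (1.0 - min_recall)


# Hypothetical curve (illustrative only): constant precision of 0.5.
value = relative_auc([0.5, 0.7, 0.85, 1.0], [0.9, 0.5, 0.5, 0.5])
print(round(value, 3))
```

Under this reading, a method whose precision stays at 1 for every recall from 0.7 to 1 would score a relative AUC of 1.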