Extending the information content of the MALDI analysis of biological fluids via multi-million shot analysis

Introduction: Reliable measurements of the protein content of biological fluids like serum or plasma can provide valuable input for the development of personalized medicine tests. Standard MALDI analysis typically only shows high abundance proteins, which limits its utility for test development. It also exhibits reproducibility issues with respect to quantitative measurements. In this paper we show how the sensitivity of MALDI profiling of intact proteins in unfractionated human serum can be substantially increased by exposing a sample to many more laser shots than are commonly used. Analytical reproducibility is also improved.
Methods: To assess what is theoretically achievable we utilized spectra from the same samples obtained over many years and combined them to generate MALDI spectral averages of up to 100,000,000 shots for a single sample, and up to 8,000,000 shots for a set of 40 different serum samples. Spectral attributes, such as number of peaks and spectral noise of such averaged spectra, were investigated together with analytical reproducibility as a function of the number of shots. We confirmed that results were similar on MALDI instruments from different manufacturers.
Results: We observed an expected decrease of noise, roughly proportional to the square root of the number of shots, over the whole investigated range of the number of shots (5 orders of magnitude), resulting in an increase in the number of reliably detected peaks. The reproducibility of the amplitude of these peaks, measured by CV and concordance analysis, also improves with very similar dependence on shot number, reaching median CVs below 2% for shot numbers > 4 million. Measures of analytical information content and association with biological processes increase with increasing number of shots.
Conclusions: We demonstrate that substantially increasing the number of laser shots in a MALDI-TOF analysis leads to more informative and reliable data on the protein content of unfractionated serum. This approach has already been used in the development of clinical tests in oncology.


Introduction
Plasma and serum proteomic profiling are valuable tools to assess the disease state of an organism [1][2][3], relating the relative abundance of circulating proteins to clinical data for diagnosis, prognosis, and treatment selection. We present a method for enhancing the sensitivity, reproducibility, and information content of measurements of the circulating proteome based on Matrix-Assisted Laser Desorption Ionization (MALDI) Time of Flight (TOF) mass spectrometry.
While there are many approaches attempting multiplexed measurements of protein abundance, for example, multiplexed immunoassays [4][5][6][7][8] and aptamer-based methods [9][10][11][12][13], most of these methodologies are targeted at a pre-defined set of known proteins assumed to be relevant for a particular disease state. In addition, circulating proteins are often post-translationally modified. Common modifications such as truncations, methylations, phosphorylations, splice isoforms, intrinsic oxidations etc., are not easily differentiable in classic antibody-based approaches [14][15][16]. These modifications can be important for the phenotypic state of disease [17], and disease specific effects may be missed when studies rely on measurements at the level of protein families. For example, in Wu et al. [18] different modifications of serum amyloid A (SAA) were shown to be associated with gastric cancer when compared to gastritis and healthy patients. Differences in relative amounts of truncated forms of SAA have been observed in acute vs chronic inflammation [19] as well as in type 2 diabetes mellitus patients compared to non-diabetics [20].
In contrast to many other methods, mass spectrometry based proteomic profiling requires neither prior knowledge of disease mechanism nor a list of protein targets, and is capable of quantifying the relative abundance of hundreds of proteins simultaneously, including truncated and modified forms. A combination of mass spectral features (peaks) representing many different proteins/peptides can provide a robust way to discriminate between two clinical groups where individual features do not [21,22]. Successful application of multivariate data analysis and modern machine learning methods to mass spectrometry based proteomic data depends on the ability to simultaneously measure a large number of features in the mass spectra [23][24][25][26][27][28][29].
The use of proteome profiling of unfractionated serum with MALDI-TOF mass spectrometry provides several practical advantages. The required sample volume is very small (a few microliters of serum or plasma), enabling large scale experiments on archival sample sets where often only small volumes are available. Samples can be shipped either frozen or dried on paper cards, enabling the analysis of archival samples and providing an easy transport mechanism for potential clinical application. Data acquisition and analysis are high throughput. The same MALDI-TOF platform can be used for discovery, development and validation of tests, as well as for running the tests in the clinical setting.
The plasma and serum proteome is extremely complex, and its quantitative analysis presents unique challenges, mainly related to the wide range of protein concentrations, which can span more than 10 orders of magnitude [30][31][32]. The peak content of standard MALDI spectra of unfractionated serum is believed to be limited to about 150 peaks, associated with proteins (at masses above approximately 5 to 6 kDa) and peptides (at lower masses), including protein fragments and truncated forms, originating from highly abundant proteins [2]. An estimate of the range of protein abundances observable in standard MALDI-TOF experiments is about two to three orders of magnitude [33]. Quantitation of less abundant proteins is presumed difficult due to the limited dynamic range of MALDI-TOF [34], and is exacerbated by matrix-related chemical noise [35] and ion suppression effects [36][37][38][39][40][41]. Analytical reproducibility in MALDI protein profiling also remains a significant challenge [34,42].
In this work we study serum proteome profiling in the m/z range from 3 to 30 kDa using linear mode MALDI-TOF instruments. As we do not perform protein digestion, the proteins outside this mass range (i.e. heavier than 30 kDa) can only be observed via their naturally occurring fragments and truncated forms. Regarding the feasibility of proteome profiling using other types of mass spectrometers, linear MALDI-TOF remains a mainstream option. The m/z that we are studying are too high for a reflectron MALDI-TOF. Another promising possibility is Fourier transform ion cyclotron resonance mass spectrometry (MALDI-FTICR MS). These instruments demonstrate extremely high resolution, which would be very beneficial for profiling purposes. Historically, MALDI-FTICR instruments could only be used for relatively low m/z, such as up to 2500 Da [61] or up to 4000 Da [62]. However, relatively recently, using the state of the art 15-Tesla MALDI-FTICR instrument, the m/z range has been extended to 6500 Da [63], then to about 15 kDa [64,65], and eventually to about 20 kDa [66]. It remains to be seen whether MALDI-FTICR becomes more widely used for proteome profiling. In this work we limit ourselves to improving the sensitivity, dynamic range and reproducibility of serum proteome profiling with MALDI-TOF MS, which remains highly relevant for discovery and validation of new biomarkers, as well as for clinical applications in personalized medicine where throughput is an important consideration [34]. One of our primary goals is to be able to acquire MALDI-TOF mass spectra that would provide a good starting point for further analysis with modern machine learning methods [23][24][25][26][27][28][29].
The problem of expanding the information content of MALDI-TOF proteomic profiling with respect to the accessible abundance range, e.g., number of detectable peaks, while retaining accuracy and reproducibility of quantitation, can be viewed as a problem of improving the signal-to-noise ratio (SNR) of peaks. This calls for reduction of noise in MALDI-TOF spectra, which can be achieved by averaging spectra from a very large number of laser shots.
Traditionally, MALDI-TOF applications using serum or plasma use around 2000 laser shots. Averaging tens of thousands of laser shots to improve signal-to-noise ratios has been done for MALDI-MS-MS fragmentation spectra [67][68][69][70][71]. Averaging 10 spectra, 500 laser shots each, to improve the accuracy of mass measurements of peptides, using reflectron MALDI-TOF MS, has been done in [72]. Summation of 20000 laser shots (reflectron MALDI-TOF, m/z range from 1000 to 5000 Da) was used in [73,74] to quantify N-glycans in human serum. We applied the spectrum averaging approach to linear MALDI-TOF and found that the method can be extended to use much higher numbers of laser shots, up to $10^8$ shots.
In this paper, we describe the deep MALDI approach which enables acquisition of MALDI-TOF spectra with many more laser shots than conventionally used, by acquiring a large number of spectra from within and across sample spots and averaging them together. We show that this leads to reduction of noise and of the CVs of feature intensities, resulting in an increase of peak content, SNR, dynamic range, and quantitative reproducibility of MALDI-TOF spectra. These effects can also be observed in appropriate measures of spectral information content, and in association of spectral features with biological processes, computed using set enrichment techniques [75]. We present data from two different MALDI-TOF instruments: Bruker Ultraflextreme and SimulTOF100.

Samples and sample preparation
The spectra used for this study were acquired over multiple years as a part of the standard quality control process at Biodesix. Spectra of unfractionated human serum samples were acquired on MALDI-TOF instruments in linear mode. Peaks in the spectra reflect peptides and proteins originally present in the sample. For each batch of experimental samples, four separate preparations of a reference control sample were spotted: two at the beginning, and two at the end of each MALDI sample plate, resulting, in total, in acquisition of 248,350 raw spectra of reference samples. We used two reference samples: one with Ultraflextreme (we denote this sample by RS1 in the remainder of the paper) and another with SimulTOF100 (denoted by RS2). Each reference sample was created by pooling equal volumes of serum obtained from five healthy individuals, purchased from ProMedDx LLC (Norton, MA, USA).
To evaluate the performance of the proposed acquisition methods on a data set obtained from a diverse set of samples, we utilized spectral acquisitions from our mass spectrometer qualification procedure. This procedure uses a sample set consisting of 40 serum samples purchased from Oncology Metrics (Fort Worth, TX, USA), which were derived from the blood of colorectal cancer and lung cancer patients. This set is called the machine qualification set (MQS) in the remainder of the paper.
To evaluate the biological implications of the presented approach we used a set of samples with sufficient volume to obtain protein expression measurements for a panel of 1305 known proteins using the SOMAscan assay (SomaLogic, Boulder, CO). 100 serum samples were purchased from the commercial biobanks Conversant Bio (Huntsville, AL) and Oncology Metrics (Fort Worth, TX). Samples were collected under ethics-approved protocols according to the requirements of Conversant Bio and Oncology Metrics. This set is called the biological reference set (BR) in the remainder of the paper.
All samples were approved for use in this study. Sample preparation reagents acetonitrile (Burdick and Jackson), HPLC grade water (JT Baker), trifluoroacetic acid (EMD), and centrifugal filters were purchased from VWR International. Sinapinic acid was purchased from Sigma (St Louis, MO, USA) or Proteochem (Loves Park, IL, USA) and used without further purification. Serum cards and punches were purchased from Therapak (Claremont, CA, USA) and Acuderm (Ft Lauderdale, FL, USA), respectively, and Protein Calibration Standard I was purchased from Bruker Daltonics (Billerica, MA, USA).

Instruments and instrument qualification
Two MALDI TOF mass spectrometers from different manufacturers were used for serum analysis in this study: Ultraflextreme (Bruker Daltonics, Bremen, Germany) and SimulTOF100 (SimulTOF Systems, Marlborough, MA, USA).
In order to obtain comparable spectra on different instruments and over extended periods of time, we have established a procedure to evaluate instrument performance. This is necessary as instrument performance will inevitably vary with normal wear and tear, repairs, and cleaning. Briefly, spectra are acquired from the machine qualification set and the reference control sample, and processed following a standardized sample preparation protocol. (Details on these procedures are provided in the S1 Appendix: Sample preparation and spectral acquisition). Feature values (integrated peak intensities) from spectra of each qualification and reference sample are compared to baseline acquisitions or "gold standard" spectra. Instrument parameters are tuned or adjusted until settings produce feature values concordant with the gold standard baseline acquisition.

Spectral processing
Generation of averages. The raw data generated by the instrument is stored in the form of raw spectra, containing the sum of 800 laser shots each. In our experience, up to about 100 raw spectra can be acquired from each sample spot. Almost all spots allow acquisition of at least 50000 shots, before the sample is exhausted. To obtain average spectra for higher number of shots, we acquire raw spectra from multiple spots. This produces a pool of raw spectra which we align and use to obtain final average spectra. To generate averages without losing resolution, the raw spectra need to be aligned. A set of internal calibration points were selected that were detected in the majority of raw spectra using a SNR threshold of 3 for peak detection, and used to generate aligned spectra for averaging. Raw spectra that could not be properly aligned were excluded from further analysis. Average spectra were created by randomly selecting, without replacement, a fixed number of aligned raw spectra to achieve a predefined shot number. For example, to generate an 800,000 shot average, 1,000 raw spectra were included from the total pool of raw spectra acquired from multiple sample spots.
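The random selection step described above can be illustrated with a minimal Python sketch. It assumes the raw spectra are already aligned and stored as equal-length lists of intensities; the function name `average_spectrum` and the toy pool are hypothetical and are not taken from the acquisition software.

```python
import random

SHOTS_PER_RAW_SPECTRUM = 800  # each raw spectrum is the sum of 800 laser shots

def average_spectrum(raw_pool, target_shots, rng):
    """Average randomly chosen raw spectra (no replacement) to reach target_shots."""
    n_raw, remainder = divmod(target_shots, SHOTS_PER_RAW_SPECTRUM)
    if remainder:
        raise ValueError("target_shots must be a multiple of 800")
    if n_raw > len(raw_pool):
        raise ValueError("not enough aligned raw spectra in the pool")
    chosen = rng.sample(raw_pool, n_raw)  # sampling without replacement
    n_points = len(chosen[0])
    return [sum(s[k] for s in chosen) / n_raw for k in range(n_points)]

# Toy pool: 2,000 already-aligned "raw spectra" of 5 points each (hypothetical data).
rng = random.Random(0)
pool = [[10.0 + rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(2000)]
avg = average_spectrum(pool, 800_000, rng)  # draws 1,000 raw spectra
```

With 800 shots per raw spectrum, the 800,000 shot request draws exactly 1,000 raw spectra, matching the example in the text.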
Spectral processing of averages. Preprocessing techniques were employed to allow comparison of averaged spectra, including background estimation and subtraction, alignment, and normalization.
Background was estimated using the convex hull method [76][77][78], and subtracted. Averaged spectra were re-aligned using peaks common to all spectra. Normalization was performed to adjust for overall intensity differences. We normalized spectra using the integrated intensity of background subtracted spectra over the union of three mass ranges: [6100, 7500], [8500, 10700], and [13300, 16400]. (All values in Da).
Each feature (typically containing a single peak) was defined by its left and right m/z boundary. Feature values are computed as the integrated intensity between the boundaries (sum of intensities of the mass spectral signal) for each feature and spectrum independently. Feature boundaries were designed to allow for variations in peak width and slight shifts in alignment. In this study, we predominantly focus on a set of features that are observable across all samples and acquisitions. This set contains 298 features listed in the S1 Appendix, unless otherwise stated.
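A minimal sketch of the normalization and feature integration steps is given below for illustration. It assumes a background-subtracted spectrum stored as parallel lists of m/z values and intensities, and it uses a plain sum of data points as the integral; the function names and the toy spectrum are hypothetical.

```python
# Normalization windows quoted in the text (all values in Da).
NORMALIZATION_WINDOWS_DA = [(6100, 7500), (8500, 10700), (13300, 16400)]

def in_any_window(mz, windows):
    return any(lo <= mz <= hi for lo, hi in windows)

def normalize(mz, intensity, windows=NORMALIZATION_WINDOWS_DA):
    """Scale a background-subtracted spectrum by its summed intensity in the windows."""
    total = sum(y for x, y in zip(mz, intensity) if in_any_window(x, windows))
    if total <= 0:
        raise ValueError("no positive intensity inside the normalization windows")
    return [y / total for y in intensity]

def feature_value(mz, intensity, left, right):
    """Integrated intensity (plain sum of data points) between the feature boundaries."""
    return sum(y for x, y in zip(mz, intensity) if left <= x <= right)

# Toy spectrum on a 100 Da grid with a flat intensity of 2.0 (hypothetical data).
mz = [3000 + 100 * k for k in range(271)]  # 3000 .. 30000 Da
intensity = [2.0] * len(mz)
normalized = normalize(mz, intensity)
value = feature_value(mz, normalized, 8500, 9000)
```

After normalization the summed intensity over the union of the three windows equals one, so feature values from different spectra are on a common scale.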
Noise estimation. Noise in our mass spectra is defined as fluctuations around a mean value with a wavelength (much) smaller than the peak-width. For large numbers of laser shots the spectra become quite smooth, and we needed to use extra care to estimate these fluctuations. First, we isolate high-frequency noise, by computing the smoothed spectrum, using Savitzky-Golay smoothing [79] (window length = 29, polynomial order = 8), and subtracting the smoothed spectrum from the original spectrum. Then, to estimate noise at a given m/z, we consider all intensity values from data points within an m/z window of relative width 0.08 centered around this m/z value. For example, to estimate noise at 12 kDa, the m/z window is from 11520 to 12480 Da. We estimate the standard deviation of noise as the difference between the 50-th and the 25-th percentiles of this data, divided by 0.6745. This provides an estimate of the noise strength that (1) is robust to possible outliers in the data, and (2) in the special case of the normal distribution $N(\mu, \sigma^2)$ reproduces its standard deviation $\sigma$. Indeed, for the normal distribution $N(\mu, \sigma^2)$ the difference between the 50-th percentile $z_{0.5}$ and the 25-th percentile $z_{0.25}$ is
$$z_{0.5} - z_{0.25} = \sqrt{2}\,\sigma\,\operatorname{erf}^{-1}(0.5),$$
where $\operatorname{erf}^{-1}(x)$ is the inverse error function, $\operatorname{erf}^{-1}(0.5) \approx 0.4769362762$ [80]. Thus
$$\sigma = \frac{z_{0.5} - z_{0.25}}{\sqrt{2}\,\operatorname{erf}^{-1}(0.5)} \approx \frac{z_{0.5} - z_{0.25}}{0.6745}.$$
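The percentile-based estimator can be sketched in a few lines of Python. The sketch assumes that the high-frequency residual (original minus smoothed spectrum) is already available, so the Savitzky-Golay step is not reproduced here. The function names are hypothetical, and the "inclusive" quantile convention is an assumption, since the text does not specify one.

```python
import random
import statistics

def robust_sigma(values):
    """(median - 25th percentile) / 0.6745; reproduces sigma for Gaussian data."""
    q25, q50, _ = statistics.quantiles(values, n=4, method="inclusive")
    return (q50 - q25) / 0.6745

def noise_at(mz, residual, center, relative_width=0.08):
    """Noise estimate from residual points within +/- relative_width/2 of center."""
    half = 0.5 * relative_width * center
    window = [r for x, r in zip(mz, residual) if center - half <= x <= center + half]
    return robust_sigma(window)

# Toy residual: pure Gaussian noise with sigma = 2 on a 0.1 Da grid (hypothetical data).
rng = random.Random(1)
mz = [11000 + 0.1 * k for k in range(20001)]  # 11000 .. 13000 Da
residual = [rng.gauss(0.0, 2.0) for _ in mz]
sigma_hat = noise_at(mz, residual, 12000.0)  # window from 11520 to 12480 Da
```

For this Gaussian toy residual the estimate should land close to the true value of 2, which is the property the derivation above establishes.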

Analytical information measure
We have developed a measure of the information content of a feature, designed to characterize its ability to differentiate between different samples. With this goal in mind, we consider the ratio of variability between samples (biological variability) to variability in repeated measurements of the same sample (technical variability). If this ratio is low (close to one), the measurement cannot distinguish between samples, and thus we cannot expect to be able to extract from it any clinically useful information. Consider repeated mass spectrometric measurements ("runs") of a set of samples. We define the information content, $S_j$, of a single feature, $j$, as follows. Here we use Matlab-inspired notation: f(:,j,:) is the collection of (number of samples) × (number of runs) values of feature j for (all samples, all runs), and f(i,j,:) is a collection of (number of runs) values of feature j for all runs of sample i. The total information content for a mass spectrum is then just the sum of $S_j$ over all features.
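The variability ratio underlying this measure can be illustrated with a short Python sketch. This is not the definition of $S_j$, which is not reproduced here. It only computes one plausible form of the ratio: the standard deviation over f(:,j,:) divided by the median over samples of the standard deviation over f(i,j,:). Both the use of the standard deviation and the use of the median are assumptions made for illustration.

```python
import statistics

def variability_ratio(f, j):
    """Overall spread of feature j divided by its typical within-sample spread.

    f[i][j][k] is the value of feature j in run k of sample i.
    The choice of pstdev and median is an illustrative assumption.
    """
    all_values = [v for sample in f for v in sample[j]]  # f(:, j, :)
    total_sd = statistics.pstdev(all_values)
    within_sd = statistics.median(
        statistics.pstdev(sample[j]) for sample in f  # f(i, j, :)
    )
    return total_sd / within_sd

# Toy data: 3 samples, 2 features, 4 runs each (hypothetical values).
runs = (-0.1, 0.1, -0.1, 0.1)
f = [
    [[level + d for d in runs], [2.0 + d for d in runs]]
    for level in (1.0, 5.0, 9.0)
]
informative = variability_ratio(f, 0)    # feature 0 differs strongly between samples
uninformative = variability_ratio(f, 1)  # feature 1 is identical across samples
```

In the toy data the second feature has a ratio of one, the case in which the text says a measurement cannot distinguish between samples.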

Association of peaks with biological processes
The strength of association of mass spectral features with biological processes was estimated by applying the commonly used bioinformatics tool, gene set enrichment analysis (GSEA) [75], to protein expression. The set enrichment approach determines the association of a measured quantity (in this case a mass spectral feature value) with a particular biological process by looking for a consistent pattern of associations with the quantity in question across a set of proteins (or genes) known to be related to that biological process. Hence, to be able to associate individual mass spectral features with biological processes, it is necessary to have matched protein expression data and mass spectral data for a reference sample set. Relative protein abundance measurements for a panel of 1305 proteins were obtained for the BR set using the aptamer-based 1.3k SOMAscan assay (SomaLogic, Boulder, CO). Mass spectral data from the same samples were also collected as described in "Materials and methods", and mass spectral feature values determined for each sample for a predefined set of 298 features (See "Spectral processing of averages").
Protein sets for various biological processes of interest were defined as follows. The Gene Ontology database, GO, (Gene Ontology Consortium) [81,82] was queried using AmiGO [83,84] and EMBL-EBI QuickGO [85] web applications to perform ontology searches and create lists of gene products associated with biological processes of interest. Many processes are interrelated; for example, activation of the complement system and acute phase response are important parts of innate immunity, and some elements of these lists inevitably overlap. This redundancy reflects the common aspects of related pathways. Typically, we selected relationships to the annotated terms that included "is a", "part of", "occurs in", and "regulates"; however, when this choice seemed too broad, we used narrower relationships. Evidence was filtered to allow for all types of manually reviewed annotations, but to exclude "electronic" annotations (not manually reviewed; evidence code "IEA" [86]). The intersection of the set of proteins found to be associated with a GO biological function of interest and the proteins measured in the SOMAscan panel yielded the protein set for this particular biological function. A table of the biological functions considered and their associated protein sets is provided in S1 Appendix.
The protein set enrichment analysis (PSEA) approach [87] first determined the univariate correlation between the values of a mass spectral feature and each of the 1305 proteins measured by the SOMAscan panel within the BR set. These univariate associations were assessed using the Spearman correlation coefficient. From these correlations, an enrichment score was generated, which assessed the relative consistency of the univariate correlations for the proteins contained in the protein set for the biological process in question compared with that for proteins measured but not contained in the relevant protein set. The enrichment score was defined as in [88] as this approach provides increased power for the identification of associations compared with the standard GSEA method [75]. P-values of association between each mass spectral feature and the biological processes were obtained by comparing the enrichment score with the null distribution generated by random permutation of the feature values across the sample set. This approach followed the standard GSEA method described in [75]. False discovery rates for this multiple testing problem were estimated using the method of Benjamini-Hochberg [89].
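The final multiple-testing step can be sketched as follows. This is a generic implementation of the Benjamini-Hochberg adjustment, not the code used in the study; the p-values are made up, and combining the two cut-offs quoted in the Results (p < 0.01 and an FDR of 5%) with a logical "and" is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four mass spectral features and one biological process.
p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(p)
# Assumed selection rule: p below 0.01 and adjusted value no larger than 0.05.
selected = [i for i, (pi, qi) in enumerate(zip(p, q)) if pi < 0.01 and qi <= 0.05]
```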

Results
The numbers of raw spectra (800 laser shots each) available for averaging and further analysis are summarized in Table 1. As described in "Materials and methods", we randomly selected fixed numbers of these raw spectra to generate averages for fixed numbers of laser shots up to 100 million (for RS2 on the SimulTOF100 instrument).
The dependence of the averaged spectra on the number of shots for the RS2 acquisition acquired on the SimulTOF100 is shown in Fig 1A. While there are no distinguishable peaks in the 8000 shot spectrum (in the selected mass range), small peaks emerge from the noise as the number of shots is increased; the peaks become better defined and differentiable, and the noise decreases. The last point is better illustrated by comparing averaged spectra including different numbers of shots and zooming into the y-axis as shown in Fig 1B. As the number of shots increases from 400 thousand to 8 million shots, the noise is greatly reduced, enabling the detection of small peaks (e.g. at 8320 and 8380 Da) and the differentiation of close peaks (e.g. around 8140 Da). It is necessary to zoom into the intensity axis to see the small peaks due to the large range of protein abundances in human serum [30][31][32] visible in the deep MALDI averages.
Assuming that abundance and peak intensity are proportional, and neglecting possible ion suppression [36][37][38][39][40][41], our estimate of the observable dynamic range in our acquisition is about 4 orders of magnitude, as measured by the ratio of the largest observable peak to the smallest observable peak. (For comparison, at 8000 laser shots the observable dynamic range is about 2 orders of magnitude). This shows that it is possible to directly measure low abundance proteins in the presence of high abundance proteins with MALDI-TOF without fractionation, as long as the respective peaks are well resolved in m/z.

Dependence of SNR and number of observable peaks on shot number
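To further investigate the characteristics of the spectra as a function of number of shots, we analyzed how noise varies with increasing number of laser shots. According to the law of large numbers and assuming ideal experimental conditions, the noise should decrease as the square root of the number of laser shots. Indeed, consider the average spectrum $\bar{y}(x)$ obtained by averaging $n$ spectra $y_i(x)$:
$$\bar{y}(x) = \frac{1}{n}\sum_{i=1}^{n} y_i(x),$$
where $x$ = m/z, and $i = 1 \ldots n$ is the index of the spectrum. Individual spectra $y_i(x)$ contain signal $s(x)$ and noise $r_i(x)$, so that $y_i(x) = s(x) + r_i(x)$. If the $r_i$ are independent, with zero mean and standard deviation $\sigma$, then the noise in the average spectrum,
$$\bar{r}(x) = \frac{1}{n}\sum_{i=1}^{n} r_i(x),$$
has standard deviation $\sigma/\sqrt{n}$. This does not require the distribution of $r_i$ to be Gaussian; however, due to the central limit theorem, for large $n$ we expect the distribution of $\bar{r}$ to be approximately Gaussian.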
In Fig 2A, we show the estimated noise (see "Materials and methods") as a function of the number of shots for RS1 acquired on the Bruker Ultraflextreme and RS2 on the SimulTOF100. For acquisitions on either instrument, the noise decreases over the whole accessible range of numbers of shots with the expected inverse square root behavior, indicating that increasing the number of laser shots and using the described averaging procedure efficiently reduces the amount of noise present in the average spectra, independent of the instrument.
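The inverse square root behavior is easy to reproduce in a small simulation, sketched below in Python. The sketch uses uniformly distributed (non-Gaussian) noise with no signal, which is an idealization chosen for illustration; the function name and all numbers are hypothetical and unrelated to the measured spectra.

```python
import random
import statistics

def noise_of_average(n_spectra, n_points, rng):
    """Std of the noise left after averaging n_spectra spectra of uniform noise."""
    averaged = [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_spectra)) / n_spectra
        for _ in range(n_points)
    ]
    return statistics.pstdev(averaged)

rng = random.Random(2)
single = noise_of_average(1, 4000, rng)
averaged_100 = noise_of_average(100, 4000, rng)
ratio = single / averaged_100  # expected to be close to sqrt(100) = 10
```

Averaging 100 independent spectra reduces the noise by roughly a factor of ten, even though the individual noise values are not Gaussian.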
We are interested in measuring as many peaks as possible with reasonable SNR cutoffs. In Fig 2B, we show the increase in the number of observable peaks as a function of laser shots for four different SNR cut-offs for RS2 acquired on the SimulTOF100. As expected, the number of observable peaks increases with increasing number of shots, but it then reaches a plateau at about 800 peaks. Because the noise continues to decrease (see Fig 2A), this plateau is at first glance surprising. We believe that the limit on the number of observable peaks is related to the finite resolution of the instrument, and that we are observing the effect of "peak crowding". The masses of observable proteins are not uniformly distributed across the m/z axis and there are regions where there are more peaks that are too close together to be resolved by the instrument. Hence, we would not be able to distinguish peaks in these areas even if we had optimal sensitivity, and in our high-number-of-shots approach, the number of peaks as a function of shot number is primarily limited by the resolution of the instrument. This explanation is illustrated in Fig 2C which shows the density of peaks and compares this density with the estimated inverse peak width. Over the whole m/z range from 3 to 30 kDa the density of peaks appears to be proportional to the inverse of the peak width. This supports the idea that the number of observable peaks is limited by instrument resolution, rather than by its sensitivity. Of course, the underlying distribution of peaks depends on how many proteins are actually present in a sample in a given m/z interval. One would need to repeat these experiments using instruments with higher resolution to answer this question more definitively.
Artificially reducing resolution by smoothing the spectra using a moving average with a window width of 41 points (in S1 Fig we compare such a smoothed spectrum with the original), we observe that the plateau in Fig 2B is reduced from around 798 peaks to 442 peaks, indicating that the number of observable peaks in MALDI serum spectra is limited more by resolution effects than sensitivity, if one utilizes many laser shots. Note that this effect is a manifestation of a high complexity of the sample (serum contains thousands of proteins or protein isoforms), whereas samples with lower complexity, e.g. spiked proteins in water, are not expected to be affected by peak crowding.
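The effect of reduced resolution on peak counts can be illustrated with a toy Python example. The sketch assumes a noise-free synthetic spectrum with two neighboring Gaussian peaks and counts local maxima above a fixed intensity threshold, which is a simple stand-in for the SNR-based peak detection used in the study. All names and numbers except the 41-point window are hypothetical.

```python
import math

def moving_average(y, window=41):
    """Centered moving average; the window is truncated at the spectrum edges."""
    half = window // 2
    out = []
    for i in range(len(y)):
        segment = y[max(0, i - half): i + half + 1]
        out.append(sum(segment) / len(segment))
    return out

def count_peaks(y, threshold):
    """Count local maxima that rise above a fixed intensity threshold."""
    return sum(
        1
        for i in range(1, len(y) - 1)
        if y[i] > threshold and y[i] > y[i - 1] and y[i] >= y[i + 1]
    )

def gaussian(i, center, width):
    return math.exp(-0.5 * ((i - center) / width) ** 2)

# Two peaks 20 points apart, each with a width (sigma) of 5 points.
spectrum = [gaussian(i, 190, 5) + gaussian(i, 210, 5) for i in range(401)]
before = count_peaks(spectrum, threshold=0.05)
after = count_peaks(moving_average(spectrum), threshold=0.05)
```

The two peaks are resolved in the original toy spectrum and merge into one after smoothing, which is the crowding mechanism proposed in the text.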

Analytical reproducibility of peak intensity as a function of laser shots
For clinical applications it is important to have good analytical performance of the measurement process. Having demonstrated that substantially increasing the number of shots leads to a reduction in noise and an increase in the number of observable peaks, we now show how reproducibility improves with increasing number of shots. We needed to perform this experiment using a diverse set of samples to ensure that we were not confounded by peculiarities of a single sample. To demonstrate the improvement of analytical reproducibility with increasing shot number, we created two sets of averages for the 40 samples in MQS ranging from 2400 shots to 8 million shots (limited by the available number of raw spectra). We examined the reproducibility of the 298 feature values by comparing the two sets in concordance analyses.
We use linear regression analysis as a measure of concordance (perfect concordance would result in a slope of 1). In Fig 3A, we report the results of the concordance analysis showing the median of (1-Pearson's R) from fits to a straight line in the feature concordance as a function of the number of laser shots. We see that as the number of shots increases, the median Pearson's R becomes closer to one indicating that reproducibility as measured by feature concordance improves.
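The concordance summary can be sketched in Python as follows. The sketch assumes that each of the two sets is stored as a list with one entry per feature, each entry holding that feature's values across the samples; the helper names and the toy values are hypothetical.

```python
import math
import statistics

def pearson_and_slope(x, y):
    """Pearson's R and least-squares slope of y against x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx

def median_one_minus_r(set_a, set_b):
    """set_a[j], set_b[j]: values of feature j across samples in the two sets."""
    return statistics.median(
        1.0 - pearson_and_slope(a, b)[0] for a, b in zip(set_a, set_b)
    )

# Toy data: 3 features measured on 4 samples in two replicate sets (hypothetical).
set_a = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0], [5.0, 6.0, 8.0, 7.0]]
set_b = [[1.1, 1.9, 3.1, 3.9], [2.1, 0.9, 4.1, 2.9], [5.1, 5.9, 8.1, 6.9]]
score = median_one_minus_r(set_a, set_b)
```

A score near zero corresponds to a median Pearson's R near one, the direction in which Fig 3A moves as the number of shots increases.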
To obtain an additional measure for the analytical reproducibility as a function of number of shots, we also estimated the CVs of the 298 features using 20 replicate averages at different numbers of shots for the RS2. In Fig 3B we show the CV distribution of the features as a function of the number of shots. As the number of shots increases, both the range and the median of the distribution decrease systematically (see also Table 2), indicating that the reproducibility of mass spectral features improves with the number of laser shots. The median CV decreases as a power law in the number of shots with exponent −0.5, i.e. as the inverse square root of the number of shots, as expected.
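A power-law exponent of this kind can be checked with a log-log straight-line fit, sketched below in Python. The CV values in the example are synthetic and constructed to follow an exact inverse square root law; they are not the values of Table 2, and the function names are hypothetical.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)

def power_law_exponent(shots, cvs):
    """Least-squares slope of log(CV) against log(number of shots)."""
    lx = [math.log(n) for n in shots]
    ly = [math.log(c) for c in cvs]
    mx, my = statistics.fmean(lx), statistics.fmean(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic median CVs that follow CV = c / sqrt(shots) exactly (hypothetical values).
shots = [8_000, 80_000, 800_000, 8_000_000]
median_cvs = [2.0 / math.sqrt(n) for n in shots]
exponent = power_law_exponent(shots, median_cvs)
```

For measured data the fitted slope would be compared with the ideal value of −0.5.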

Information content of mass spectra as a function of laser shots
One primary application of MALDI proteome profiling is the development of tests based on the measurements of the abundance of circulating proteins, without requiring prior selection of specific target proteins. Successful development of such tests depends on the richness of the information content of the underlying data. Here we attempt to assess the dependence of the information content of spectra on the number of laser shots in spectral acquisition, both from an analytical and a biological perspective.
The observed reduction of the CVs of features with increasing number of laser shots reflects the decrease of noise-related random errors in the measurements of feature values. As can be seen in Fig 4, this is accompanied by an increase of the information content of spectra. Hence increasing the number of laser shots allows for more reliable differentiation of serum samples.
Fig 2, panel C (caption). Density of detected peaks at a SNR cut-off of 10 obtained by counting the number of peaks in equal-width m/z bins (the m/z range from 3200 to 30000 Da was divided into 60 bins), for the 100 million shot spectrum as a function of m/z. The dotted blue line is a smoothed version of this density, and the red line is proportional to the inverse peak width estimated from several most prominent peaks in the spectrum. https://doi.org/10.1371/journal.pone.0226012.g002

Association of peaks with biological processes
The question arises whether the increase in analytical information content with increasing number of laser shots described above leads to an increased ability to detect biologically important phenomena. We address this question in the framework of set enrichment analysis, which estimates the association of individual features (peaks) with a set of biological processes (see "Materials and methods"). We analyzed how this association depends on the number of shots, using spectra of up to 400K shots from the BR set. To decide which peaks are associated with a biological process, we applied a p-value cut-off of p<0.01 and a false discovery rate (FDR) cut-off of 5%. The number of peaks meeting these criteria for different numbers of laser shots and for all investigated biological processes is shown in Table 3. For some processes, e.g. acute inflammatory response, many features are associated even at low shot numbers (with fluctuations within the FDR), while for other processes, e.g. innate immune response and immune tolerance, the number of associated features increases with shot number, indicating that increasing the shot number allows for a deeper view of biology in serum profiling.

As this study is devoted to the role of the number of laser shots in MALDI-MS profiling of unfractionated serum, and, in particular, to improvements that can be achieved by increasing the number of shots, we have adopted an approach to the analysis of the association of MALDI peaks with biological processes that does not require the assignment of peaks to specific proteins and their fragments. The set enrichment analysis approach [75,87] makes this possible. Data on the assignment of some MALDI peaks to specific proteins do exist in the literature [1][2][3][33], but most of the peaks that we observe remain unassigned. Assigning them is a separate important problem that is outside the scope of this study and can be addressed by methods such as tandem mass spectrometry.
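The two selection criteria described above (a p-value cut-off and an FDR cut-off) can be illustrated with a minimal sketch. This section does not state how the FDR is controlled, so the Benjamini-Hochberg procedure used here is an assumption, and the function names `benjamini_hochberg` and `associated_peaks` are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a p-value passes the
    Benjamini-Hochberg step-up procedure at the given FDR level.
    (Assumed FDR method; the text does not specify one.)"""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            k_max = rank
    # All hypotheses ranked at or below k_max are accepted as discoveries.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            passed[idx] = True
    return passed


def associated_peaks(p_values, p_cutoff=0.01, fdr=0.05):
    """Indices of peaks meeting both the p-value and the FDR criterion."""
    passed = benjamini_hochberg(p_values, fdr)
    return [i for i, p in enumerate(p_values) if p < p_cutoff and passed[i]]


if __name__ == "__main__":
    # Illustrative p-values for six peaks and one biological process.
    example = [0.001, 0.008, 0.02, 0.04, 0.5, 0.9]
    print(associated_peaks(example))
```

In this illustration the third peak (p = 0.02) passes the FDR step but is excluded by the stricter p<0.01 cut-off, so the two criteria are not redundant.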

Discussion
We have presented a method for improving the sensitivity of MALDI-TOF mass spectrometry by increasing the signal-to-noise ratio of the measurements, leading to an increase in the number of measurable circulating proteins from human serum samples. The same approach can be applied without modification to plasma. We observed that high-frequency noise in the spectra decreased approximately as the inverse square root of the number of shots, all the way up to 10^8 laser shots. This increased the observable abundance range to about 4 orders of magnitude (compared to about 2 orders of magnitude for 8000 laser shots) and the number of clearly observable and quantitatively useful peaks in MALDI-TOF mass spectra of unfractionated serum to about 800.
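The inverse-square-root behaviour can be illustrated with a small simulation. This is a sketch under idealized assumptions (independent Gaussian noise of equal variance in every shot), not the processing used in this study; the helper names `averaged_noise_sd` and `snr_gain` are hypothetical.

```python
import math
import random
import statistics


def averaged_noise_sd(n_shots, n_channels=500, sigma=1.0, seed=0):
    """Standard deviation of pure noise after averaging n_shots shots.

    Each channel of each shot is modelled as independent Gaussian noise
    (an idealization; real spectra also contain signal and baseline)."""
    rng = random.Random(seed)
    averaged = []
    for _ in range(n_channels):
        total = sum(rng.gauss(0.0, sigma) for _ in range(n_shots))
        averaged.append(total / n_shots)
    return statistics.pstdev(averaged)


def snr_gain(n_shots, n_ref):
    """Expected SNR gain relative to n_ref shots under 1/sqrt(N) noise scaling."""
    return math.sqrt(n_shots / n_ref)


if __name__ == "__main__":
    # Quadrupling the number of shots should roughly halve the noise.
    print(averaged_noise_sd(100) / averaged_noise_sd(400))
    # Expected gain between 8000 shots and 10^8 shots.
    print(snr_gain(1e8, 8000), math.log10(snr_gain(1e8, 8000)))
```

Under this scaling, going from 8000 to 10^8 shots corresponds to an SNR gain of about 112, i.e. roughly two orders of magnitude, which is consistent with the observable abundance range growing from about 2 to about 4 orders of magnitude.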
The extremely high number of laser shots (on the order of millions) presented here is not practical in high-throughput operations, and for routine applications one needs to select a number of shots that is practically achievable while retaining the advantages of using many laser shots. We have decided to use averages generated from 400,000 laser shots for the routine generation of tests to be used in the clinical setting. On the SimulTOF100 mass spectrometer this requires spotting the sample onto eight separate MALDI spots (prepared as described in the section "Materials and methods") and consumes about 3 μl of serum. Reserving 32 spots for four reference samples, this results in a batch size of 43 samples using a 384 well plate. With current qualified instrument settings it takes about 30 hours to run a batch of 47 samples including the 4 reference samples. For these 400,000 shot spectra the median CV is 2.31%, the number of observed peaks at a SNR cut-off of 15 is 677 for the reference sample, and the mean number of observed peaks for samples in the machine qualification set at the same SNR cut-off is 646.

Achieving the presented decrease in noise, the resulting increase in SNR and in the number of observable peaks, and the corresponding improvement in reproducibility requires much care, especially with regard to spectral pre-processing. The alignment of the spectra to be averaged is of principal importance: even slight inaccuracies in alignment can lead to peak broadening in the averaged spectra, and this loss of resolution limits the number of observable peaks, in addition to the limit imposed by the instrument resolution.
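The effect of imperfect alignment on averaged peaks can be sketched numerically. The example below assumes an idealized Gaussian peak shape and Gaussian-distributed residual alignment errors, neither of which is taken from this study; the function names are hypothetical. Under these assumptions the averaged peak has a width of approximately sqrt(sigma^2 + jitter_sd^2), so alignment errors comparable to the peak width broaden it by roughly 40%.

```python
import math
import random


def gaussian(x, center, sigma):
    """Unit-height Gaussian peak profile (idealized peak shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def averaged_peak_fwhm(sigma, jitter_sd, n_spectra=500, seed=1):
    """Full width at half maximum of a peak averaged over n_spectra
    spectra whose peak positions carry random alignment errors."""
    rng = random.Random(seed)
    step = sigma / 20.0
    half_span = 8.0 * math.hypot(sigma, jitter_sd)
    n = int(half_span / step)
    grid = [i * step for i in range(-n, n + 1)]
    # Residual alignment error of each spectrum (zero if jitter_sd is zero).
    shifts = [rng.gauss(0.0, jitter_sd) for _ in range(n_spectra)]
    avg = [sum(gaussian(x, s, sigma) for s in shifts) / n_spectra for x in grid]
    # Width of the region where the averaged profile exceeds half its maximum.
    half_max = max(avg) / 2.0
    above = [x for x, y in zip(grid, avg) if y >= half_max]
    return above[-1] - above[0]


if __name__ == "__main__":
    sharp = averaged_peak_fwhm(sigma=1.0, jitter_sd=0.0)
    broad = averaged_peak_fwhm(sigma=1.0, jitter_sd=1.0)
    print(sharp, broad, broad / sharp)
```

Because the broadening adds in quadrature, alignment errors much smaller than the intrinsic peak width have little effect, while errors of comparable size visibly reduce the resolution of the averaged spectrum.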
While the results presented here open the theoretical possibility of probing much deeper into the proteome than previously considered possible, they represent an idealized setting. As we randomly sampled raw spectra from acquisitions over many years, batch effects could be ignored. In clinical practice, we do not have a large reservoir of spectra available for individual samples spanning many batches. Instead, a sample is prepared and its spectra are collected within a single batch. To compensate for batch effects, we spot the reference sample at the beginning and at the end of each MALDI plate, and apply additional batch correction processing to map to previously acquired batches serving as baselines for clinical tests. We have established rigorous instrument qualification procedures to minimize batch effects and ensure test reproducibility, based on running a plate containing the MQS set of samples and confirming concordance with the "gold standard" MQS acquisition.
In general, the peaks we observe should be related to proteins of the classical plasma proteome as described in [30], but there could be other proteins visible in certain sample sets that are not usually described in the literature. We could increase the mass range of our measurements beyond the 3-30 kDa range to further extend the number of detected proteins with this method. However, in the high mass range the resolution of the MALDI-TOF instruments we used in this study becomes very poor. In the m/z range 30-70 kDa we have observed only a handful of very broad peaks. Thus we decided to limit the m/z range in this study to 30 kDa. In the low mass region, highly variable metabolomic decay products may confound our ability to reliably detect peptides.

Conclusions
The results demonstrate that increasing the number of laser shots increases the number of measurable peaks in human serum samples without requiring fractionation steps. This holds true over a large dynamic range and appears to be limited by instrument resolution rather than sensitivity.
Increasing the number of laser shots leads to an increase of information content, both from an analytical and biological perspective. Assuming that subtle questions related to drug efficacy and toxicity, especially in oncology in the era of immunotherapy and early detection, require the detection and measurement of complicated regulatory processes, it is possible that the presented approach can lead to more reliable test discoveries, especially in the context of multivariate tests using modern machine learning methods.
This approach has been successfully applied in multiple efforts to develop prognostic and predictive tests in the area of oncology. Of particular relevance are the validated results obtained for a pre-treatment test identifying patients with metastatic cancer who are resistant to checkpoint inhibition [90][91][92][93][94]. Immune oncology should be a fertile ground for multivariate methods investigating the circulating proteome, given the interplay between tumor biology and the host immune system.
In summary, we have presented a method that significantly increases the useful information that can be mined from mass spectrometry-based profiling of serum samples. The method extends the observable dynamic range in a single workflow on MALDI-TOF platforms and could lead to the development of many more clinically useful and validated tests.