Quantitation and Identification of Intact Major Milk Proteins for High-Throughput LC-ESI-Q-TOF MS Analyses

Cow’s milk is an important source of proteins in human nutrition. On average, cow’s milk contains 3.5% protein. The most abundant proteins in bovine milk are caseins and some of the whey proteins, namely beta-lactoglobulin, alpha-lactalbumin, and serum albumin. A number of allelic variants and post-translationally modified forms of these proteins have been identified. Their occurrence varies with breed, individuality, stage of lactation, and health and nutritional status of the animal. It is therefore essential to have reliable methods for the detection and quantitation of these proteins. Traditionally, major milk proteins are quantified using liquid chromatography (LC) with ultraviolet (UV) detection. However, as these protein variants co-elute to some degree, another dimension of separation is beneficial to accurately measure their amounts. Mass spectrometry (MS) offers such a tool. In this study, we tested several RP-HPLC and MS parameters to optimise the analysis of intact bovine proteins from milk. From our tests, we developed an optimum method that includes a 20-28-40% phase B gradient with 0.02% TFA in both mobile phases, at a 0.2 mL/min flow rate, using 75°C for the C8 column temperature, and scanning every 3 sec over a 600–3000 m/z window. The optimisations were performed using commercially purchased external standards, for which ionisation efficiency, linearity of calibration, LOD, LOQ, sensitivity, selectivity, precision, reproducibility, and mass accuracy were demonstrated. From the MS analysis, we can use extracted ion chromatograms (EICs) of specific ion series of known proteins and integrate peaks within defined retention time (RT) windows for quantitation purposes. This optimum quantitative method was successfully applied to two bulk milk samples from different breeds, Holstein-Friesian and Jersey, to assess differences in protein variant levels.


Introduction
Bovine milk has been consumed by humans for as long as 8000 years in some regions of the globe. Human consumption of cow milk is now world-wide and crosses all age groups, but it is particularly prevalent during childhood, in part as a result of promotional marketing, especially in Asia where drinking milk is not part of the culture [1]. Because of bovine milk's nutritional and economic value, dairy cattle breeds have been efficiently selected and successfully bred for increased milk production for centuries [2]. Through the combined effects of breeding, improved nutrition and husbandry practices, milk production of the modern dairy cow now far exceeds offspring requirements [3]. This milk excess is then offered on commercial markets for human nutrition as fresh pasteurised liquid milk, or further processed into yogurt, butter, cream, cheese, cream cheese, ice cream, and powdered milk, to name a few of the mainstream dairy products. Breed is recognised as one of the main factors affecting milk composition and properties. Cattle breeds of the species Bos taurus produce 85% of all milk commercially sold [2]; examples of these main breeds include Holstein and Jersey. A nation-wide study of the 2009 United States dairy herd, comprising 90.1% Holstein and 5.3% Jersey cows, revealed that Holstein and Jersey cows produced on average 29.1 and 20.9 kg of milk per day, respectively, with average protein contents of 3.1 and 3.7% [4]. In a different study, it was reported that although Jersey milk had greater gross value than Holstein milk due to its higher protein content, the total volume of milk produced by Holstein cows offset this difference [5].
On average, cow's milk contains about 3.5% protein; however, this level can vary with breed, individuality, stage of lactation, and health and nutritional status of the animal. The functional properties of milk proteins have been reviewed [6]. Caseins represent about 80% of total bovine milk proteins and whey proteins about 18% [2]. There are five different types of caseins: alpha-S1-casein (aS1CN), alpha-S2-casein (aS2CN), beta-casein (bCN), kappa-casein (kCN), and gamma-casein (gCN), the latter being breakdown products cleaved from bCN by the major milk proteolytic enzyme plasmin [3]. The aS1-, aS2-, b-, and k-caseins are on average found in cow's milk at proportions of 38, 10, 35, and 12%, respectively. Caseins are of relatively small molecular weight (20-25 kDa). The four most abundant whey proteins are beta-lactoglobulin (bLG), alpha-lactalbumin (aLA), bovine serum albumin (BSA), and immunoglobulins (Igs), which represent approximately 60, 20, 10, and 10% of total whey proteins, respectively. BSA is a leakage protein from blood which bears no biological or technological significance in milk [2]. These major milk proteins are encoded by highly polymorphic genes for which non-synonymous and synonymous mutations have been reported, giving rise to 53 naturally occurring protein variants. The list, features and sequence information of all variants of the aS1CN, aS2CN, bCN, kCN, aLA and bLG proteins have been summarised [7] and further updated [8-10]. Currently, 9 aS1CN variants, 4 aS2CN variants, 13 bCN variants, 13 kCN variants, 3 aLA variants and 11 bLG variants have been described. These genetic variations mainly result in amino acid (AA) exchanges or deletions within the coding sequences, thereby impacting the function of the encoded protein.
Mutations within the noncoding sequences have been shown to affect protein expression and, in turn, milk composition, which bears consequences on subsequent manufacturing steps, for example cheese making. The study of milk protein variants can be applied to breed characterisation, diversity, and phylogeny. Furthermore, because milk proteins are involved in various aspects of human diet, characterising the occurrence of alleles associated with a reduced content of different caseins might be exploited for the production of hypoallergenic milk [8]. Besides allelic variations, major milk proteins are heavily post-translationally modified, with varying levels of phosphorylation of serine or threonine and/or glycosylation of threonine residues, proteolysis by the indigenous milk enzymes, and oxidation of cysteine to disulfide bonds [9]. The number of phosphate groups (P) attached to caseins is variable, from 1P to 3P on kCNs, 4P to 5P on bCNs, 8P to 9P on aS1CN, and 10P to 13P on aS2CN [7,10,11]. Through these phosphorylation sites, caseins bind to the hydrated calcium phosphate entities present in the casein micelles, thus stabilising their structure [9]. About half of the kCNs are glycosylated with short oligosaccharide chains at one or several threonine sites, and most of the kCNs are phosphorylated at Ser149 [9]; casein micelle size has been correlated with the presence of glycosylation on kCN [12].
The fractionation and isolation of intact milk proteins for their subsequent analysis depend on the intrinsic physicochemical properties of the individual proteins. Owing to the aggregating nature of proteins, a denaturing reaction is required prior to separation. Chaotropes and reducing reagents are commonly employed; for instance, the denaturant guanidine hydrochloride in combination with the reductant dithiothreitol (DTT) has often been used [13-17]. Alternatively, urea combined with mercaptoethanol has also been frequently employed [18-22]. Among the diverse chromatographic and electrophoretic fractionation strategies that exist (reviewed in [10,23]), liquid chromatography (LC) remains the most commonly employed for analytical purposes, in particular reversed-phase high performance liquid chromatography (RP-HPLC), which separates compounds based on their hydrophobicity. The stationary phase of RP-HPLC separation columns is nonpolar and typically made of silanized silica with C4, C8 or C18 groups coupled to the silanol groups [23]. For instance, C18 columns [13,22,24], C8 columns [15,21], and C4 columns [14,16-18,25] have all been employed for milk protein analysis. More recently, a C4 HPLC column was compared to a monolithic capillary HPLC column, with the latter displaying a greater resolving power [19]. Bobe et al. [13] introduced a standard protocol for intact milk protein separation by gradient elution at low pH with 0.1% trifluoroacetic acid (TFA) added to the mobile phases, thus avoiding aggregation and non-specific interactions of milk proteins and improving both protein solubilisation and chromatographic resolution. This 0.1% TFA concentration has since often been employed to study intact milk proteins [14-19,22,24,25].
At low concentrations, TFA helps recover larger proteins by enhancing their solubilisation; at high concentrations (≥0.1%), however, TFA is known to suppress ionisation of analytes in the electrospray ionisation (ESI) source of the mass spectrometer. Therefore, in the afore-mentioned studies, the proteins were only detected and quantified online by measuring ultraviolet (UV) absorbance at 210-220 nm, and not using a mass spectrometer. If the chromatographic separation is compatible with MS, then the analysis of proteins using a mass spectrometer adds another orthogonal separation dimension to the LC, further separating proteins by their mass, which not only improves the selectivity of the analysis but also gives access to protein identities. Details of the published masses of bovine milk proteins obtained using MS can be found in the Supplementary information (S1 File).
The aim of the present study was not to optimise the preparation of milk samples for intact protein analysis, as this has been well established [13-17,24,25]. Rather, this work aims at optimising HPLC separation and MS analysis to identify and quantify cow milk proteins in a high-throughput manner. Fig 1 outlines the experimental design of the study. We first optimised HPLC and MS settings using milk protein external and internal standards by assessing the linearity of calibration, matrix effect, sensitivity, reproducibility, selectivity, precision and mass accuracy. We also compared UV chromatograms and Base Peak Chromatograms (BPCs) to Extracted Ion Chromatograms (EICs). We then applied our optimum parameters to bulk milk samples from two bovine breeds, Holstein-Friesian and Jersey, to validate the quantitative method.

Skim milk sample preparation
Milk sampling and skimming have been described [26]. The pasture-fed Holstein-Friesian and Jersey cows (Gippsland region, Victoria, Australia) were cared for in accordance with the Australian Code of Practice for the Care and Use of Animals for Scientific Purposes (www.nhmrc.gov.au). The experiment received animal ethics approval from the Agricultural Research and Extension Animal Ethics Committee of the Department of Economic Development, Jobs, Transport and Resources (Victoria, Australia). No particular steps were needed to ameliorate pain and suffering of the animals because cows were not subjected to any pain-inducing procedures. Cows were exposed to the same type of handling, management and milk sampling that occurs on Australian commercial dairy farms. Proportional samplers (DeLaval International, Tumba, Sweden) were used to collect a sample of milk from each cow at each milking. Cows were milked twice daily, at 6:00 and 15:00, and milk was bulked into containers. A 50 mL aliquot of bulk milk was collected separately from the Jersey herd and from the Holstein-Friesian herd on 6 November 2014 and stored on ice at the respective dairy farms and during transport. A total of 440 Holstein-Friesian cows contributed to the vat on that date and cows averaged 139 days in milk. A total of 215 Jersey cows contributed to the vat on that date and cows averaged 140 days in milk. Three 2.0 mL milk samples were aliquoted from each bulk sample and stored at -80°C until use. Milk protein extracts were prepared following the method of [13] with modifications. A 0.5 mL volume of cold skim milk was transferred into a 1.5 mL tube and 0.5 mL of Solution A (0.1 M Bis-Tris, 6 M Gdn-HCl, 5.37 mM sodium citrate tribasic dihydrate, and 20 mM DTT) was added. The mixture was vortexed for 1 min and left to incubate at room temperature for 50 min. A 0.02 mL volume of 50% acetic acid (1% acetic acid final concentration, pH 5.8) was then added to the milk/Solution A mixture. 
The tube was vortexed for 1 min and left to incubate at room temperature for 10 min. A 0.1 mL aliquot of the milk protein extract was transferred into a 100 μL glass insert placed into a 2 mL glass vial for immediate analysis.

Bovine external standard preparation and internal standard
In order to optimise HPLC separation, bovine protein standards were purchased from Sigma. The protein standards include: α-casein (aCN) from bovine milk (C6780-250MG, 70% pure), β-casein (bCN) from bovine milk (C6905-250MG, 98% pure), κ-casein (kCN) from bovine milk (C0406-250MG, 70% pure), α-lactalbumin (aLA) from bovine milk (L5385-25MG, 85% pure), β-lactoglobulin (bLG) from bovine milk (L3908-250MG, contains lactoglobulins A and B, 90% pure), and albumin from bovine serum (BSA, A7906-10G, 98% pure). These lyophilised protein standards were fully solubilised at a 10 mg/mL concentration in 50% solution 1/50% MilliQ H2O. Standards were dissolved by vortexing for 1 min and sonication for 5 min followed by another 1 min vortexing. Solubilised standards were left for 50 min at room temperature. A volume of 50% acetic acid sufficient to reach a 1% acetic acid final concentration was added to the standards. Care was taken not to lower the pH below 4.6 as it would precipitate caseins; under our conditions pH was 5.5. Standards were vortexed for 1 min and left to incubate at room temperature for 10 min. A 0.1 mL aliquot of the solubilised standard was transferred into a 100 μL glass insert placed into a 2 mL glass vial for immediate analysis.
Myoglobin (Myo) from horse skeletal muscle was purchased from Sigma (M0630-250MG, 95-100% pure, essentially salt-free) and spiked as an internal standard (IS). A 10 mg/mL myoglobin solution was prepared as described above. A 98 μL milk protein extract was spiked with 2 μL myoglobin solution (0.2 mg/mL myoglobin final concentration).
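The final concentrations quoted in the preparation steps follow from simple dilution arithmetic. The short Python sketch below is only an illustration (the function name is hypothetical and not part of the published method); it reproduces the 0.2 mg/mL myoglobin spike and the approximately 1% acetic acid concentration from the volumes given in the text.

```python
def final_concentration(stock_conc, stock_volume, other_volume):
    """Concentration of a stock after it is mixed with another volume.

    stock_conc may be in any unit (mg/mL, %); the result has the same unit.
    Both volumes must share one unit (e.g. uL or mL).
    """
    total_volume = stock_volume + other_volume
    return stock_conc * stock_volume / total_volume


# Internal standard: 2 uL of 10 mg/mL myoglobin added to 98 uL of extract.
myoglobin_mg_ml = final_concentration(10.0, 2.0, 98.0)

# Acidification: 0.02 mL of 50% acetic acid added to 1.0 mL of mixture
# (0.5 mL skim milk + 0.5 mL Solution A).
acetic_acid_pct = final_concentration(50.0, 0.02, 1.0)

print(f"myoglobin: {myoglobin_mg_ml:.2f} mg/mL")
print(f"acetic acid: {acetic_acid_pct:.2f} %")
```

The acetic acid value comes out slightly below 1% because the added 0.02 mL also increases the total volume.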

HPLC separation
Prior to analysis by MS, bovine milk proteins and standards were chromatographically separated using the UHPLC 1290 Infinity Binary LC system (Agilent). For method optimisation purpose, a series of parameters were modified as described in Fig 1 and detailed in the Supplementary information (S1 File).
The settings common to all tests are listed hereafter. The injection volume was 3 μL (with needle wash). The diode array detector (DAD) spectrum was acquired from 190 to 400 nm. The pressure limit was set at 600 bar. The total duration of the HPLC separation was 40 min, with the first 2.5 min switched to waste to allow for online desalting and infusion of the internal calibrant (Na formate solution composed of 1M NaOH in 50% isopropanol (IPA)/0.1% formic acid (FA)) into the mass spectrometer.

MS analysis
HPLC and MS parameters were set using microToF 3.4, ESI Compass 1.3 and HyStarPP 3.2SR4 software (Bruker Daltonik GmbH). Following HPLC separation, milk proteins were analysed using a maXis HD UHR-Qq-ToF (60,000 resolution) with an ESI source (Bruker Daltonik GmbH). The MS was calibrated weekly and auto-tuned monthly using the ESI-L low concentration tuning mix (Agilent).
To ensure mass accuracy, a Na-formate solution was infused continuously at 0.1 mL/h and the first 2.5 min of each run were used to re-calibrate masses post-acquisition. Each 40 min run was thus segmented as follows: 2.5 min to waste and the following 37.5 min to source. Capillary voltage was set at 4500 V. The nebulizer was set at 1.5 bar. The dry gas was set at 8 L/min. The dry temperature was set at 190°C. The transfer funnel RF and multipole RF were set at 400 Vpp; no ISCID energy was applied. The quadrupole ion energy was 5 eV, the collision cell energy was 10 eV and the collision RF 1800 Vpp. The ion cooler transfer time was 120 μs, with a prepulse storage of 10 μs, and an RF of 400 Vpp. The ion polarity was positive and scan mode was MS. The rolling average mode was activated and set at 2.
Details of the MS parameters tested and mass spectra deconvolution can be found in the Supplementary information (S1 File). Extracted ion chromatograms (EICs) were produced for each standard using the ion series indicated in Table A in S2 File and a +/-0.1 m/z tolerance. For a given standard and a given dilution, the peak areas of each individual protein variant were summed as a proxy for the standard response. Peak areas were integrated using the retention times (RT) indicated in Table A in S2 File with a 4 min window. The S1 File also explains how the linearity of calibration, sensitivity, LOD, LOQ, working ranges, matrix effect, reproducibility, precision and selectivity of the standards and milk proteins were computed.
Accession numbers, AA sequences and processing information of the milk protein standards were retrieved from the UniProtKB knowledge database (last modified 28 August 2015; http://www.uniprot.org/). AA sequences were then manually modified to account for protein maturation processes, including signal peptide cleavage, post-translational modifications (PTMs) such as phosphorylation and glycosylation, and allelic variations, using information from both UniProtKB and the report from [7]. This investigation is summarised in Table B in S2 File.
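The EIC construction and peak integration described above can be sketched in a few lines of Python. This is a minimal illustration, not the vendor software workflow used in the study: the data layout, function names and toy values are hypothetical, and centring the 4 min window on the expected RT is an assumption, since the text only states the window width.

```python
def extract_eic(scans, target_mzs, tol=0.1):
    """Build an EIC from scans given as (rt_min, [(mz, intensity), ...]).

    For each scan, sums the intensity of every peak lying within +/- tol
    of any target m/z (the ion series of one protein).
    """
    eic = []
    for rt, peaks in scans:
        total = sum(
            intensity
            for mz, intensity in peaks
            if any(abs(mz - target) <= tol for target in target_mzs)
        )
        eic.append((rt, total))
    return eic


def integrate_window(eic, rt_center, width=4.0):
    """Trapezoidal peak area inside an RT window (assumed centred on rt_center)."""
    lo, hi = rt_center - width / 2, rt_center + width / 2
    points = [(rt, i) for rt, i in eic if lo <= rt <= hi]
    area = 0.0
    for (t0, i0), (t1, i1) in zip(points, points[1:]):
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area


# Toy data: three scans around 10.5 min plus one unrelated late scan.
scans = [
    (10.0, [(1000.05, 100.0), (1200.0, 50.0)]),
    (10.5, [(1000.00, 300.0), (1500.0, 20.0)]),
    (11.0, [(999.95, 100.0)]),
    (20.0, [(1000.00, 999.0)]),
]
eic = extract_eic(scans, target_mzs=[1000.0], tol=0.1)
area = integrate_window(eic, rt_center=10.5, width=4.0)
print(eic, area)
```

Restricting integration to the RT window is what excludes the late scan at 20 min, even though it carries signal at the same m/z.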
All relevant data are within the paper and the stable public repository MassIVE. Data at MassIVE are hosted at the following URL with the corresponding accession number. URL: http://massive.ucsd.edu/ProteoSAFe/datasets.jsp Accession Number: MSV000080036.

Optimisation of HPLC separation of bovine protein external standards
3.1.1. Impact of gradient, flow rate, and composition of the mobile phase.
In our study, three gradients were tested in which not only the starting conditions differed (3, 20 or 24% phase B) but also the ramping steps during which protein elution occurred (3-40%, 28-45%, or 28-40% phase B). When the 3-40% B gradient was employed, most proteins eluted during the second half of the separation run (15-32 min, Fig 2). One exception was kCN, which displayed the earliest RT and eluted throughout the run. Also worth noting is that the highest base peak from the kCN standard was 4.5 times more intense under the 3-40% B gradient than when the other two gradients were applied. This gradient usually applies to peptide separation by RP-HPLC [26]. Because whole proteins are much longer than peptides and therefore more hydrophobic, eluting them from the stationary phase requires higher organic solvent concentrations. As more than half the separation time was not exploited, the 3-40% B gradient was deemed unsuitable. The 24-28-45% B gradient was based on the method from Bobe et al. [13] and applied more concentrated solvent conditions both at the start of the run and at the end of the separation step than the 3-40% B gradient. In our hands, this gradient generated HPLC profiles comparable to [13], except for aLA which eluted earlier under our conditions. The protein standards mostly eluted during the first half of the separation run (2.5-20 min) and therefore the second half of the run was not efficiently exploited. Consequently, we did not select this 24-28-45% B gradient. 
The 20-28-40% B gradient was a variation of the 24-28-45% B gradient in which the solvent concentration both at the start of the run and at the end of the separation step was slightly lowered to slow protein elution down. Indeed, overall elution with the 20-28-40% B gradient occurred from 5 to 25 min and peaks were visually more intense and narrower than those under the 24-28-45% B gradient. Based on these results, the 20-28-40% B gradient was selected for our HPLC method.
Applying the 20-28-40% B gradient, three flow rates were evaluated: 0.1, 0.2, and 0.3 mL/min. As expected, the greater the flow rate, the quicker the elution of protein standards (Fig 2). Furthermore, the quickest flow rate compromised peak intensity, whilst the slowest flow rate negatively affected peak shape and narrowness. Therefore, the intermediate flow rate of 0.2 mL/min was selected for our method. Our rationale was to minimise the volume of solvent used to enhance ionisation and also reduce the cost of the analysis, but without compromising the quality of protein separation. Apart from [14] and [20], who applied 0.25 and 0.20 mL/min flow rates respectively, faster flow rates have generally been applied, from 0.4 mL/min [21], 0.5 mL/min [15], 0.8 mL/min [18], and 1.2 mL/min [13] to 3.0 mL/min [22]. For the comparison to be accurate, HPLC column dimensions, and particle and pore sizes must also be taken into account (Table C in S2 File). Column efficiency is often used to compare the performance of different columns. Efficiency ranged from 14 to 25% in the articles cited here, with our C8 column sitting in the middle with an efficiency of 20.8%.

Fig 2. Visual summary of the HPLC optimisation. HPLC separation was optimised by modifying the gradient, flow rate and composition of the mobile phases as well as testing different temperatures and stationary phase chemistries of the separation column. The first column lists the various conditions tested and the following columns display the results for each external standard analysed in this study. Base Peak Chromatograms (BPCs) are displayed from 2.5 min to 32.5 min on the x axis (retention time). The same intensity scale was displayed on the y axis for a given standard and parameter. Yellow arrows on the right hand side point to optimum conditions for each parameter tested. aCN, alpha casein; bCN, beta casein; kCN, kappa casein; aLA, alpha lactalbumin; bLG, beta lactoglobulin; BSA, bovine serum albumin. doi:10.1371/journal.pone.0163471.g002
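One common way to take column dimensions into account when comparing flow rates is to scale the volumetric flow with the square of the column internal diameter, which keeps the linear velocity of the mobile phase constant. The sketch below is offered only as an illustration of that rule of thumb; it is not the efficiency calculation used in this study, the function name is hypothetical, and the 2.1 mm target diameter is an assumed value, not one taken from the text.

```python
def scale_flow_rate(flow_ml_min, id_from_mm, id_to_mm):
    """Flow rate giving the same linear velocity on a column of another ID.

    Cross-sectional area scales with the square of the internal diameter,
    so the volumetric flow is scaled by (id_to / id_from) ** 2.
    """
    return flow_ml_min * (id_to_mm / id_from_mm) ** 2


# 0.5 mL/min on a 4.6 mm ID column (as reported for [15]) expressed as the
# equivalent flow on a hypothetical 2.1 mm ID column.
equivalent = scale_flow_rate(0.5, id_from_mm=4.6, id_to_mm=2.1)
print(f"{equivalent:.3f} mL/min")
```

Under this assumption, a nominally faster published flow rate on a wide-bore column can correspond to a linear velocity similar to, or lower than, 0.2 mL/min on a narrow-bore column.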
Applying the 20-28-40% B gradient and a 0.2 mL/min flow rate, we tested the addition of TFA to our mobile phases A (H2O/0.1% FA) and B (ACN/0.1% FA). Three concentrations were employed: 0, 0.02, and 0.1% TFA. Signal intensity was systematically the lowest with 0.1% TFA for all standards, symptomatic of in-source ion suppression; moreover, elution was delayed by several minutes (Fig 2). When 0.02% TFA was added to the mobile phases, the intensity of bCN and bLG was affected, with the intensity of the other standards remaining unchanged. Consistently, peak shape and narrowness were greatly improved with 0.02% TFA compared to no TFA at all, and RTs were not affected. When no TFA was added to the mobile phases, proteins eluted during the first half of the separation run. Based on these observations, it was decided to include 0.02% TFA in our mobile phases for all subsequent LC-MS runs. Traditionally, intact milk proteins have been detected by chromatography where high concentrations of TFA (0.1%) in both mobile phases A and B have been used (Table C in S2 File; [14-19,22]). TFA, a strong pairing agent that mitigates cation exchanges during HPLC separation, improves the chromatographic separation of proteins by increasing the solubility of eluted proteins in ACN [27]. High concentrations of TFA are not recommended when MS analyses are to be performed, as this strong acid severely suppresses analyte ionisation in the ESI source. Aware of this phenomenon, the authors of [21] lowered the TFA concentration in the mobile phases to 0.01%, thus ensuring successful identification of aS1CN and bCN variants by MS.

Effect of column temperature and chemistry.
Applying the 20-28-40% B gradient, a 0.2 mL/min flow rate, and 0.02% TFA in the mobile phases, we then turned our attention to the separation column by first testing three distinct oven temperatures: 45, 60 and 75°C. As expected, the higher the temperature, the quicker the elution of protein standards, particularly when 75°C was applied (Fig 2). Both peak intensities and shapes were superior at 75°C relative to 45 or 60°C. Therefore, 75°C was selected as our optimum temperature. The temperature of the RP-HPLC column plays an important role in the separation of intact proteins as it affects both protein conformation and mass transfer kinetics; high temperatures maintain proteins in their denatured states [28]. The column we used offered a broad range of temperatures, being stable at up to 90°C. Previous publications did not apply such high temperatures (Table C in S2 File); column temperature ranged from ambient [13,18], 35°C [16,17], 40°C [14,22], and 45°C [15] to 50°C [21].
To determine separation efficiency based on column chemistry, two different stationary phases from the same supplier were evaluated: a C18 column, usually applied to peptide separation, and a C8 column, more commonly used for intact protein separation. Not only were these columns packed with distinct stationary phases, but they also had different particle sizes, and thus different column efficiencies (N = 20.8% for the C8 column and N = 44.1% for the C18 column) and different resolutions. Their dimensions and pore sizes were the same, thus displaying equivalent interstitial or dwell volumes (286 μL). With the temperature set at 75°C, the columns produced vastly different separation profiles (Fig 2). This was expected considering that the C18 column displays more than twice the resolving power of the C8 column. Using a C18 column changed the BPCs of the standards to such an extent, in particular for bCN, kCN, and bLG, that we could no longer compare them to published results [13-18,21,24]. Moreover, deconvoluted masses of these additional peaks obtained using the DISSECT, Maximum Entropy and SNAP algorithms did not correspond to known proteins (data not shown). Identifying such proteins would require top-down sequencing experiments, which is beyond the scope of this paper. Consequently, we selected the C8 chemistry. When a C18 chemistry was employed with UV detection to separate major intact milk proteins (Table C in S2 File), the aS1CN-9P phosphorylated form and the bCN A1 and A2 variants could not be resolved [13,22]. A C8 chemistry (Zorbax 300SB-C8 RP, 3.5 μm particle, 300 Å pores, 150 × 4.6 mm, Agilent Technologies) with UV detection was successfully employed to resolve all major casein variants [15], in an elution order very similar to that described here, with the exception of aLA which eluted between bCN A2 and bLG B. 
The same Aeris Widepore XP-C8 chemistry employed here was also used in [21], albeit with smaller column dimensions (2 x 100 mm) and identical particle size (3.6 μm), followed by MS analysis; aLA eluted between aS1CN-9P and bCN A1, similarly to our chromatograms.

Optimisation of MS analysis using bovine protein external and internal standards
3.2.1. Impact of the mass scanning rate and window.
Two scanning rates were tested, 0.7 Hz (one scan every 1.5 seconds) and 0.3 Hz (one scan every 3.0 seconds). Peak intensities doubled when using the 0.3 Hz scanning rate, which was half as fast as the 0.7 Hz rate (Fig A in S2 File). Another anticipated consequence was that the number of data points recorded along the chromatogram was halved at the 0.3 Hz scanning rate relative to the 0.7 Hz rate. As some standards do not ionise efficiently (e.g. aCN and BSA, Fig 3), thereby considerably diminishing peak intensity, the setting that favoured intensity over data point density was selected (i.e. 0.3 Hz) for our method. This method allowed a minimum of 20 data points to be collected across each peak (Fig 3), which was sufficient for quantitation.
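The trade-off between scan interval and data point density can be made concrete with a small calculation. The sketch below is illustrative only: the function name is hypothetical and the 60 s peak width is an assumed value chosen so that the slower scan setting yields the 20 points mentioned in the text; actual peak widths are not stated here.

```python
def points_across_peak(peak_width_s, scan_interval_s):
    """Number of complete scans that fit within a chromatographic peak."""
    return int(peak_width_s // scan_interval_s)


peak_width_s = 60.0  # hypothetical base peak width, in seconds

for interval_s in (1.5, 3.0):  # the two scan intervals tested
    n = points_across_peak(peak_width_s, interval_s)
    print(f"one scan every {interval_s} s -> {n} points across the peak")
```

Doubling the scan interval halves the number of points across a peak of fixed width, which is acceptable for quantitation as long as enough points remain to define the peak shape.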
Applying a 0.3 Hz scanning rate, mass scanning range of the MS was evaluated by scanning either from 600 to 6000 m/z or from 600 to 3000 m/z. In order to visualise the richness of the spectral signal along the whole mass range, for each standard the BPC was averaged from 5 to 25 min to produce an averaged mass spectrum. Examination of the 3000-6000 m/z range showed very little spectral signal with our MS ion transfer settings (Fig A in S2 File), therefore 600-3000 m/z scan range was selected for our method.

Mass resolution and accuracy, and identification of PTMs.
Isotopic patterns were obtained for all proteins of interest (Fig B in S2 File), except BSA, whose high MW prevented isotope resolution and for which the average mass was therefore retrieved. Deconvolution using the Maximum Entropy algorithm resulted in monoisotopic masses (except for BSA) with resolution ranging from 33522 (bCN I-5P) to 48843 (aS2CN A-13P), and FWHM between 0.7145 (bCN I-5P) and 0.4114 (aLA B+G) (Table 1). Based on these results, we can confidently conclude that the Q-TOF instrument employed in this study generated highly resolved mass spectra.
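Resolution and mass accuracy figures of this kind are conventionally derived from the peak mass and its full width at half maximum (FWHM). The sketch below assumes the usual definitions (resolution as mass divided by FWHM, mass error in parts per million); whether Table 1 was computed exactly this way is not stated in the text. The 23951.5 Da value is back-calculated from the reported resolution and FWHM of bCN I-5P purely for illustration, and the second example uses hypothetical masses.

```python
def resolving_power(mass, fwhm):
    """Resolution defined as peak mass divided by its FWHM (same units)."""
    return mass / fwhm


def mass_error_ppm(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Back-calculated mass (assumption) with the FWHM reported for bCN I-5P.
print(round(resolving_power(23951.5, 0.7145)))

# A 1 Da difference on a hypothetical 24 kDa protein, expressed in ppm.
print(round(mass_error_ppm(24001.0, 24000.0), 1))
```

The second example shows why a sub-Dalton mass difference on a protein of this size still corresponds to tens of ppm.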
Theoretical masses were obtained using manually curated sequences (Table B in S2 File) and, when compared to the observed masses, the mass difference was always less than one Dalton, bar aS2CN A-14P (Table 1). By cross-checking protein deconvoluted masses with public data sources (UniProtKB; [7-10]) we were able to reliably identify the milk protein variants and some of their [...] (aS2CN A-14P). An ESI-Q-TOF MS platform was also employed in [18,20]. A simple TOF instrument was used to identify cow's milk proteins based on deconvoluted mass information [21]. Using the Microtof QII high resolution mass spectrometer with a 20,000 resolving power and 2 ppm mass accuracy (Bruker Daltonics GmbH), monoisotopic masses of milk caseins were obtained, and variants A1, A2, B and C of bCN were quantified using specific ions, such as the 22+ charge state, to produce EICs [20].

Fig 3. Standards were prepared at the same concentration, run independently and overlaid to illustrate that ionisation efficiency varied from one protein to the other. All external standards purchased from Sigma (aCN, bCN, kCN, aLA, bLG, BSA, and myoglobin) were dissolved in 50% Solution A to a 10 mg/mL concentration. The coloured arrows in between the UV traces and the EICs represent the elution windows of the bovine protein standards.
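Quantifying an intact protein from specific ions, such as a 22+ charge state, relies on the series of multiply protonated ions that ESI produces for a given mass. The sketch below computes such a series under the standard assumption of proton adducts only; the 24000 Da mass is a hypothetical round number, not a value from Table 1, and the function names are illustrative.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_for_charge(neutral_mass, z):
    """m/z of the [M + zH]z+ ion of a protein with the given neutral mass."""
    return (neutral_mass + z * PROTON_MASS) / z


def charge_states_in_window(neutral_mass, mz_min=600.0, mz_max=3000.0):
    """Charge states whose m/z falls inside the acquisition window."""
    return [
        z for z in range(1, 200)
        if mz_min <= mz_for_charge(neutral_mass, z) <= mz_max
    ]


mass = 24000.0  # hypothetical protein mass in Da
states = charge_states_in_window(mass)
print(states[0], states[-1])
print(round(mz_for_charge(mass, 22), 3))
```

For a protein of this size, several dozen charge states fall inside a 600-3000 m/z window, which is why a subset of intense charge states can serve as the ion series for an EIC.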

Determination of protein ionisation efficiency.
UV traces of external standards made up at the same concentration (e.g. 10 mg/mL) and obtained during RP-HPLC separation were comparable across standards, with bLG and aS2CN displaying lower intensities (Fig 3, top panel). However, EICs of external and internal standards made up at the same concentration did not produce similar intensity and peak shape patterns (Fig 3, bottom panel). This demonstrates that ionisation and transmission efficiencies vary from one protein to the other. Ionisation efficiency is the effectiveness of producing gas-phase ions from analyte molecules in solution within the ESI source, and transmission efficiency is the ability to transfer the charged species from the atmospheric pressure of the ESI source to the low-pressure region of the mass analyser [29]. The efficiency with which analytes are ionised varies with ion mobility, which differs among ion species [30]. Based on obtained peak areas from their corresponding EICs, the proteins ranked as follows: The bovine protein most abundant in milk, aS1CN, displayed the least efficient ionisation under our ESI conditions. This illustrates one artefact of MS: ion chromatograms do not necessarily reflect the abundance of a given protein in a sample but rather how ionisable this component is. This is why calibration curves of external standards at increasing concentrations are essential to quantify known proteins using MS. The horizontal arrows in Fig 3 visually illustrate that, due to extensive overlap of milk protein variants, the UV trace alone is not suitable to reliably integrate their individual peak areas for quantitative purposes. By further discriminating intact proteins according to their m/z, MS offers an additional orthogonal separation level to HPLC, the two complementing each other to individualise major bovine protein variants.

Method validation
3.3.1. Calibration, matrix effect and sensitivity.
Using our optimum MS scanning rate and mass window, calibration curves were produced in duplicate along a 0.1-10.0 mg/mL concentration range for each external standard. Fig C in S2 File further exemplifies how ionisation efficiency varied from one standard to the other. Overall, linear curves were obtained, and responses were highly positively correlated with increasing concentrations of analytes (R² ranging from 0.97 for BSA to 0.99 for aCN). LODs ranged from 0.46 mg/mL (aCN) to 2.10 mg/mL (BSA) and LOQs ranged from 1.50 mg/mL (aCN) to 7.01 mg/mL (BSA) (Table 2). Based on these results, the working ranges (0.9 to 10 mg/mL) covered most of the concentration range tested for the bovine protein standards.
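A calibration curve and the limits derived from it can be sketched with an ordinary least-squares fit. The example below is a generic illustration, not the computation described in S1 File: it assumes the widely used convention LOD = 3.3 σ/slope and LOQ = 10 σ/slope, with σ taken as the residual standard deviation of the fit, and the concentration/response pairs are made-up values.

```python
def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x.

    Returns slope, intercept, R squared and residual standard deviation.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    residual_sd = (ss_res / (n - 2)) ** 0.5
    return slope, intercept, r_squared, residual_sd


def lod_loq(slope, sigma, k_lod=3.3, k_loq=10.0):
    """Detection and quantitation limits from slope and noise estimate."""
    return k_lod * sigma / slope, k_loq * sigma / slope


# Hypothetical calibration points: concentration (mg/mL) vs summed peak area.
conc = [0.1, 0.5, 1.0, 5.0, 10.0]
resp = [1.3, 1.9, 3.1, 10.9, 21.1]

slope, intercept, r2, sd = linear_fit(conc, resp)
lod, loq = lod_loq(slope, sd)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.4f}")
print(f"LOD={lod:.3f} mg/mL LOQ={loq:.3f} mg/mL")
```

Other multipliers (for example 3 instead of 3.3) and other noise estimates (such as the standard deviation of blanks) are also in use, so values computed this way are comparable only when the same convention is applied throughout.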
The effect of the matrix was tested by spiking the internal standard protein myoglobin at increasing concentrations (0.1-10 mg/mL) into three different matrices: (1) 50% Solution A, which is used to prepare the milk samples for LC-MS analysis and served as our control; (2) a protein sample prepared from Jersey skim milk; and (3) a protein sample prepared from Holstein skim milk. Trend lines in Fig D in S2 File demonstrated the linearity of the myoglobin response along the concentration range, irrespective of the matrix used, with high reproducibility. High reproducibility was further confirmed numerically in Table 3, with a coefficient of variation (CV) well below 10% for both RTs and responses, irrespective of the matrix used.
The matrix effect was more pronounced for the Jersey sample than for the Holstein sample, particularly at low concentrations (Table 3), averaging 11.5% and 6.3% respectively. Overall, the matrix suppressed myoglobin ion intensity.
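The text does not state how the matrix effect was computed. A common convention, assumed here, expresses it as the percentage loss of the IS response in a milk matrix relative to the control (50% Solution A). The function name and the peak areas below are illustrative only.

```python
def matrix_effect_percent(resp_matrix, resp_control):
    """Return the signal change in a matrix relative to the control, in percent.

    Positive values indicate ion suppression, negative values ion enhancement.
    """
    if resp_control <= 0:
        raise ValueError("control response must be positive")
    return (1 - resp_matrix / resp_control) * 100


# Hypothetical myoglobin peak areas at a single spiking level.
control = 1.00e6   # IS in 50% Solution A
holstein = 0.93e6  # IS spiked into a Holstein milk protein sample
jersey = 0.85e6    # IS spiked into a Jersey milk protein sample

print(round(matrix_effect_percent(holstein, control), 1))
print(round(matrix_effect_percent(jersey, control), 1))
```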
Sensitivity was assessed using the obtained signal-to-noise ratio (S/N). Using triplicate blanks to assess the noise, our results showed a very high S/N, with a minimum of 934, well above the standard LOQ threshold of 10 (Table 3).
Table 4 reports the slope, SE, intercept, R², LOD, LOQ and working range of the myoglobin calibration curve within each matrix, either over the entire concentration range (0-10 mg/mL) or over a range limited to 0-1 mg/mL. Statistics were improved over the lower concentration range (Table 4 and inset in Fig D in S2 File) because the linear trend was then a better fit. We chose to spike myoglobin at a 0.2 mg/mL concentration into milk samples because of its low CV, linearity, and LOQ (Table 4). At this concentration, ion suppression was 7% in the Holstein sample and 15% in the Jersey sample (Table 3).

Reproducibility, selectivity, and precision.
External standards were run in triplicate, with and without spiked myoglobin. The EICs of the proteins of interest overlaid very well across all six replicates, irrespective of whether the IS was present (Fig E in S2 File), thus demonstrating good reproducibility. By using the ion series indicated in Table A in S2 File to produce an EIC for each protein of interest, and limiting peak integration to the RT window in which the standard is expected (shaded area in Fig F in S2 File), we can selectively detect and quantify milk protein standards.
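The selective quantitation step described above (build an EIC from a protein's ion series, then integrate it only within the expected RT window) can be sketched as follows. This is an illustration under stated assumptions, not the vendor software used in the study: the scan data structure, the m/z tolerance, the function names and the numbers are all hypothetical.

```python
def extract_eic(scans, target_mzs, tol=0.5):
    """Sum, per scan, the intensities of peaks within +/- tol of any target m/z."""
    rts, intensities = [], []
    for rt, peaks in scans:
        total = sum(
            inten for mz, inten in peaks
            if any(abs(mz - target) <= tol for target in target_mzs)
        )
        rts.append(rt)
        intensities.append(total)
    return rts, intensities


def integrate_window(rts, intensities, rt_start, rt_end):
    """Trapezoidal peak area of the EIC points lying inside [rt_start, rt_end]."""
    pts = [(t, i) for t, i in zip(rts, intensities) if rt_start <= t <= rt_end]
    return sum((t2 - t1) * (i1 + i2) / 2 for (t1, i1), (t2, i2) in zip(pts, pts[1:]))


# Hypothetical scans: (RT in min, [(m/z, intensity), ...]).
scans = [
    (10.0, [(1000.0, 5.0), (1200.0, 1.0)]),
    (10.5, [(1000.2, 15.0), (1500.0, 3.0)]),
    (11.0, [(999.9, 5.0)]),
    (12.0, [(1000.0, 50.0)]),  # same ion, but outside the expected RT window
]

rts, eic = extract_eic(scans, target_mzs=[1000.0])
area = integrate_window(rts, eic, rt_start=10.0, rt_end=11.0)
print(eic, area)
```

Restricting the integration to the RT window is what excludes the last scan here, even though it contains an ion at the target m/z.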
Excellent reproducibility levels are numerically confirmed in Table 5 with all CVs being below 6%. The response CV of external standards solubilised in 50% Solution A was overall smaller when the IS was not spiked into the external standards. Indeed, in the presence of IS, CV varied from 0.1 to 2.7% with an average of 2.3% (+/-1.4%), whereas in the absence of IS, CV varied from 0.3 to 5.3%, with an average of 1.3% (+/-0.7%). As expected, normalising the protein response using an IS helped make the data more reproducible.
External standards were run in duplicate. Based on the averaged results, the slope, standard error (SE), intercept, Pearson correlation coefficient (R²) values, limits of detection (LOD) and quantitation (LOQ), and working range were computed. LOD for each standard was obtained using the following formula: 3*(standard error/slope). LOQ for each standard was obtained using the following formula: 10*(standard error/slope). The working range was the interval between the LOQ and the upper concentration of the analyte in the samples tested in this study (10 mg/mL) for which linearity was demonstrated. doi:10.1371/journal.pone.0163471.t002
Table 3. Averaged RTs and response of myoglobin internal standard prepared in Solution A or spiked in milk matrices across 2 technical replicates.
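The reproducibility figures quoted in this section are coefficients of variation across technical replicates, optionally after normalising each protein response by the IS response of the same run. A minimal sketch of both steps is given below; the function names and the replicate values are hypothetical, and the normalisation by simple division is an assumption since the text does not give the exact formula.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return stdev(values) / mean(values) * 100


def normalise_to_is(responses, is_responses):
    """Divide each protein response by the IS response measured in the same run."""
    return [r / i for r, i in zip(responses, is_responses)]


# Hypothetical triplicate peak areas for one protein and for the spiked IS.
protein = [9.0e5, 1.0e6, 1.1e6]
internal_std = [0.9e5, 1.0e5, 1.1e5]

print(round(cv_percent(protein), 2))
print(round(cv_percent(normalise_to_is(protein, internal_std)), 2))
```

In this constructed example the run-to-run drift is shared by the protein and the IS, so normalisation removes almost all of the variation; real data would show a smaller effect.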

Application to cow's milk samples
The final validation step for our method was to apply the LC and MS parameters that were deemed optimum for the analysis of external standards to milk samples from two distinct cow breeds, Jersey and Holstein-Friesian (shortened to Holstein in tables and figures for ease of reading). Milk samples spiked with myoglobin IS were run in triplicate. Milk proteins eluted from 5 to 25 min, with the RT window from 10.0 to 14.5 min being protein-poor (Fig 4). All the proteins of interest identified using external standards were successfully detected in the milk samples from Jersey and Holstein-Friesian cows, as evidenced by the EICs. UV traces, BPCs and EICs of these proteins were very similar across technical triplicates and therefore overlaid closely. Fig 4 further illustrates that UV traces and BPCs alone are not sufficiently resolved to allow the quantitation of individual milk protein variants. Variants were individualised by extracting the chromatograms of their corresponding ions, and their abundances (i.e. response) in milk samples were inferred by integrating the peak areas of the EICs (Table 6). This strategy is schematised in Fig G in S2 File. When the EICs of one replicate from each breed were overlaid, all protein peaks were found and their intensities varied in a breed-specific manner. For instance, kCN B-1P, bCN B-5P, and bCN A2-5P levels were higher in the Jersey sample than in the Holstein sample. Conversely, the levels of kCN A-1P and bCN A1-5P were higher in the Holstein milk than in the Jersey milk. Averaging mass spectra over the protein elution profile produced ion distributions mostly concentrated around 900-1800 m/z, irrespective of the breed (Fig 4). The proteins that varied the most across breeds, namely kCNs, were chosen to exemplify spectra quality (Fig 4). A well-defined charge envelope is visible for the proteins displaying isotopic resolution.
Spectra were sufficiently resolved to achieve isotopic separation and were therefore deconvoluted into accurate monoisotopic masses for all proteins, with the exception of BSA, for which an average mass was obtained. Reproducibility levels were acceptable as assessed by the CV, which ranged from 1.7 to 13.6% (Table 6). A Student t-test was performed to determine which protein variants differed between the breeds. If we arbitrarily consider a cut-off p-value of 0.01, the expression levels of all variants were significantly different, except for aS2CN B-8P, aS2CN A-10P, bCN I-5P, bLG B and D, BSA, and kCN B-1P+G (Table 6). If we lower this cut-off to 0.001, then only aLA B, aS2CN A-11P, 12P, 13P and 14P, bCN A1-5P and B-5P, kCN A-1P and 2P, and kCN B-1P remained significantly different, with all of them being more abundant in Holstein milk except bCN A1-5P and all the aforementioned kCNs, whose levels were higher in Jersey milk. Based on these results, we can conclude that our optimum method can be successfully applied to quantify major known milk proteins from two distinct cow breeds.
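The breed comparison above relies on a Student t-test applied to triplicate responses. As an illustrative sketch, assuming the classical two-sample test with pooled variance (the text does not state whether equal variances were assumed), the statistic for three replicates per breed can be computed and compared with the two-tailed critical values for 4 degrees of freedom. The function names and the replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical values of Student's t for 4 degrees of freedom (3 + 3 replicates).
T_CRIT_DF4 = {0.01: 4.604, 0.001: 8.610}


def student_t(a, b):
    """Return (t statistic, degrees of freedom) for a pooled-variance two-sample t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


def is_significant(a, b, alpha):
    """Compare |t| with the tabulated critical value; only df = 4 is supported here."""
    t, df = student_t(a, b)
    if df != 4:
        raise ValueError("critical values are tabulated for 3 vs. 3 replicates only")
    return abs(t) > T_CRIT_DF4[alpha]


# Hypothetical triplicate EIC peak areas (arbitrary units) for one variant.
jersey = [20.0, 21.0, 22.0]
holstein = [10.0, 11.0, 12.0]

print(student_t(jersey, holstein))
print(is_significant(jersey, holstein, 0.01), is_significant(jersey, holstein, 0.001))
```

Tightening the cut-off from 0.01 to 0.001 raises the critical value, which is why fewer variants remain significant at the stricter threshold.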
In bovine milk, the phosphorylation of caseins (which have been extensively characterised) plays an important structural role in the stabilisation of calcium phosphate nanoclusters in casein micelles. The total casein fraction in cow milk comprises up to 40% of aS1CN, commonly with 8 or 9 phosphorylated serine residues; aS1CN-9P contains an additional phosphorylated serine residue at position 56 of the preprotein [11]. In our study, both phospho-proteoforms were identified; aS1CN B-8P (23600.2457 Da) was distinguished from aS1CN B-9P (23680.2289 Da) by a 79.9832 Da difference. They slightly co-eluted, their apices being less than two minutes apart. In bulk milk from the Holstein-Friesian herd, aS1CN B-8P was 2.24 times more abundant than aS1CN B-9P; the trend was reversed in bulk milk from the Jersey herd, as aS1CN B-9P was 2.26 times more abundant than aS1CN B-8P. Previous studies using capillary zone electrophoresis reported similar findings; the aS1CN-8P form occurred in a 3-fold excess over the aS1CN-9P form in the Holstein-Friesian breed [31]. A genome-wide association study revealed that aS1CN-8P and aS1CN-9P were not regulated by the same set of genes, and that lower concentrations of aS1CN-8P were genetically associated with the AA genotype of bLG [11]. Indeed, in our study, Jersey bulk milk, which displayed a lower aS1CN B-8P concentration relative to Holstein-Friesian bulk milk, also displayed a greater bLG A concentration.
Kappa-caseins provide a hydrophilic coating of casein micelles, thus preventing micelle association and aggregation and stabilising their structure. Out of the 13 kCN variants reported so far, the A and B variants dominate Jersey and Holstein-Friesian herds in Denmark [32]. Cows with predominant kCN variant B have been consistently associated with desirable coagulation properties in the cheese-making process. Two phosphorylation sites are known in kCNs: serine residues at positions 148 and 170 of the preprotein sequence, the latter being constitutively phosphorylated while the former is only partially phosphorylated [33]. In our study, both phospho-proteoforms were identified; kCN B-1P (18994.5907 Da) was distinguished from kCN B-2P (19075.5445 Da) by a 80.9538 Da difference. This variation in the degree of phosphorylation also altered the elution of the protein, with kCN B-1P eluting before kCN B-2P. A glycosylated form of kCN B-1P was also identified (19650.8391 Da) which, based on the literature, would bear a GalNAc-Gal-(NeuAc) O-linked oligosaccharide group (656.2484 Da). Variant B of kCN would be more extensively glycosylated than variant A [12], as we observed. This kCN B-P+G variant was also identified using 2-DE and MALDI-TOF MS [33], where its phosphorylation at Ser170 and its glycosylation at Thr152 were characterised [34]. The other glycoprotein identified in our study was the aLA B+G variant (14500.9142 Da) with a gain in mass of 324.1 Da, which would correspond to two mannose and/or glucose and/or galactose residues. To our knowledge, this is the first time this glyco-form of aLA has been reported. In fresh cow's milk, a small fraction of aLA molecules are N-glycosylated at Asn71, which is a rare protein N-glycosylation site [35].
Monosaccharide analyses of glyco-forms of aLA have revealed varying amounts of N-acetylglucosamine (GlcNAc), N-acetylgalactosamine (GalNAc), mannose (Man), galactose (Gal), fucose (Fuc), and N-acetylneuraminic acid (NeuAc), the quantities of Gal, Fuc, and NeuAc being relatively low [36]. This would suggest that the glyco-form of aLA identified in our study possibly carries two mannose residues; such findings evidently need to be confirmed by further experiments. The presence of 14 glycosylated forms of aLA, spanning from 15841.1 to 16685.3 Da, has been evidenced in bovine milk [36].
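The proteoform assignments discussed above rest on comparing an observed mass shift with the theoretical mass of a candidate modification. The sketch below illustrates that comparison for three of the shifts reported in the text, using standard monoisotopic residue masses for phosphate (HPO3), hexose, N-acetylhexosamine and N-acetylneuraminic acid. The dictionary, the function names and the choice of candidate compositions are illustrative assumptions, not the authors' workflow.

```python
# Monoisotopic residue masses (Da) of common modifications.
RESIDUE_MASS = {
    "Phospho": 79.96633,  # HPO3
    "Hex": 162.05282,     # hexose, e.g. Man, Glc or Gal
    "HexNAc": 203.07937,  # N-acetylhexosamine, e.g. GalNAc
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}


def composition_mass(composition):
    """Sum the residue masses of a modification composition, e.g. {"Hex": 2}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def mass_error(observed_delta, composition):
    """Observed minus theoretical mass shift, in Da."""
    return observed_delta - composition_mass(composition)


# Mass shifts derived from the masses reported in the text.
shifts = [
    ("aS1CN B-9P vs B-8P", 23680.2289 - 23600.2457, {"Phospho": 1}),
    ("kCN B-1P+G vs B-1P", 19650.8391 - 18994.5907, {"HexNAc": 1, "Hex": 1, "NeuAc": 1}),
    ("aLA B+G vs aLA B", 324.1, {"Hex": 2}),
]

for label, delta, comp in shifts:
    print(
        f"{label}: observed {delta:.4f} Da, "
        f"theoretical {composition_mass(comp):.4f} Da, "
        f"error {mass_error(delta, comp):+.4f} Da"
    )
```

Because mannose, glucose and galactose share the same residue mass, a mass shift alone cannot distinguish between them, which is consistent with the caution expressed in the text.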

Conclusions
In this study, we tested several RP-HPLC and MS parameters to optimise the analysis of intact proteins from bovine milk. The optimum quantitative method was successfully applied to two bulk milk samples from different breeds, Holstein-Friesian and Jersey, to assess differences in protein concentration. For instance, kCN B-1P was significantly higher in Jersey milk relative to Holstein-Friesian milk; the trend was the opposite for kCN A-1P. We are currently applying this method in a high-throughput fashion to numerous samples of milk from both breeds to study the impact of lactation cycle, diet regimes and genetic background. The method could also be incorporated into breeding programs to select cows displaying desirable protein variants and content for specific products, e.g. cheese manufacturing or evaporated powdered milk. Finally, this method is not restricted to raw bovine milk, since we have successfully applied it to UHT cow's milk, full cream goat milk and raw human milk samples (data not shown).
Supporting Information
S1 File. Further background information pertaining to milk proteins, mass spectrometry and PTMs, along with the detailed methods employed in this study and the observation that the elution order of the protein variants does not follow the GRAVY index. (PDF)