Infrared Spectroscopy of Pollen Identifies Plant Species and Genus as Well as Environmental Conditions

Background: Reliable and timely methodologies for the analysis and monitoring of seed plants are needed in order to determine climate-related plant processes. Moreover, knowledge of the impact of environment on plant fitness is predominantly based on studies of female functions, while the contribution of male gametophytes is mostly ignored due to missing data on pollen quality. We explored the use of infrared spectroscopy of pollen for an inexpensive and rapid characterization of plants.

Methodology: The study was based on measurement of pollen samples by two Fourier transform infrared techniques: single reflectance attenuated total reflectance and transmission measurement of sample pellets. The experimental set, with a total of 813 samples, included five pollination seasons and 300 different plant species belonging to all principal spermatophyte clades (conifers, monocotyledons, eudicots, and magnoliids).

Results: The spectroscopy-based methodology enables detection of phylogenetic variations, including the separation of confamiliar and congeneric species. Furthermore, the methodology enables measurement of phenotypic plasticity by the detection of inter-annual variations within populations. The spectral differences related to environment and taxonomy are interpreted biochemically, specifically as variations of pollen lipids, proteins, carbohydrates, and sporopollenins. The study shows large variations in the absolute content of nutrients for congeneric species pollinating in the same environmental conditions. Moreover, a clear correlation between carbohydrate-to-protein ratio and pollination strategy has been detected. An infrared spectral database covering biochemical variation across a range of species, climates and biogeography will significantly improve comprehension of plant-environment interactions, including the impact of global climate change on plant communities.


Estimation of pollen lipid content
The EMSC and baseline-corrected spectra were used to estimate the relative lipid content of pollen belonging to the genera Pinus, Quercus and Iris. The analyses were based on three vibrational bands: the lipid band at ~1740 cm⁻¹, the amide II band at ~1545 cm⁻¹, and the sporopollenin band at 833 cm⁻¹. The average transmission spectra (based on three measurements per species) were scaled to the amide II band (with the sporopollenin band serving as a double-check for scaling), followed by spectral deconvolution (multipeak fitting of Gaussian and Lorentzian curves) and application of the Beer-Lambert law, i.e. the linear correlation between absorbance (area under the curve centred on the lipid band) and quantity (Fig. S1). The relative lipid contents obtained for the three genera are presented in Table 2.
The estimation of the relative content of pollen lipids is based on the following approximations: 1) all congeneric species have pollen grains with identical morphology, 2) the sporopollenin and protein content of pollen is quantitatively invariant across congeneric species, 3) the lipid content of pollen can be estimated by applying the Beer-Lambert law to transmission spectra of the KBr sample pellets. Although the third approximation is a rough one, owing to diffuse reflection and saturated absorption, the similar pollen morphology of congeneric species will result in comparable deviations from the Beer-Lambert law. The maximal values (Table 2) are probably underestimated, given that saturated absorption is higher for samples with elevated lipid content. Different fitting parameters were assessed, and the results typically varied by approximately 10% from the values reported in Table 2.
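
The scaling, multipeak fitting and band-area steps described above can be illustrated with a minimal Python sketch. It is not the authors' fitting procedure: the three-band model (one Gaussian lipid band, two Lorentzian amide bands), the amide I position near 1650 cm⁻¹, the scaling window, the initial guesses and the synthetic spectra are all assumptions made for illustration, whereas Fig. S1 uses three Lorentzian and two Gaussian curves on measured spectra.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amp, center, sigma):
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2)


def lorentzian(x, amp, center, gamma):
    return amp * gamma**2 / ((x - center) ** 2 + gamma**2)


def three_band_model(x, a_lip, c_lip, s_lip, a_am1, c_am1, g_am1,
                     a_am2, c_am2, g_am2):
    # Assumed band shapes: Gaussian lipid band, Lorentzian amide I and II.
    return (gaussian(x, a_lip, c_lip, s_lip)
            + lorentzian(x, a_am1, c_am1, g_am1)
            + lorentzian(x, a_am2, c_am2, g_am2))


def lipid_band_area(wavenumber, absorbance):
    """Scale a spectrum to its amide II maximum and return the fitted
    area of the lipid band (proportional to quantity under Beer-Lambert)."""
    # Scaling window around the amide II band (window limits are assumed).
    window = (wavenumber >= 1530) & (wavenumber <= 1560)
    scaled = absorbance / absorbance[window].max()
    # Initial guesses near the band positions quoted in the text.
    p0 = [0.5, 1745.0, 10.0, 1.0, 1650.0, 20.0, 1.0, 1547.0, 20.0]
    popt, _ = curve_fit(three_band_model, wavenumber, scaled,
                        p0=p0, maxfev=20000)
    amp, _, sigma = popt[:3]
    # Closed-form area of a Gaussian: amp * sigma * sqrt(2 * pi).
    return amp * abs(sigma) * np.sqrt(2.0 * np.pi)


def synthetic_spectrum(wavenumber, lipid_amp):
    # Hypothetical noiseless spectrum; only the lipid amplitude varies.
    return (gaussian(wavenumber, lipid_amp, 1745.0, 12.0)
            + lorentzian(wavenumber, 1.2, 1650.0, 25.0)
            + lorentzian(wavenumber, 0.8, 1547.0, 20.0))


wn = np.linspace(1450.0, 1850.0, 801)
area_low = lipid_band_area(wn, synthetic_spectrum(wn, 0.3))
area_high = lipid_band_area(wn, synthetic_spectrum(wn, 0.6))
print("relative lipid content (high / low):", area_high / area_low)
```

Because both synthetic spectra share the same amide II band, the ratio of fitted lipid band areas reflects only the change in lipid amplitude, which mirrors approximation 2 above (invariant protein content among congeneric species).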

Fig. S1. Estimation of pollen lipid content for Iris pallida, depicting the bands used for normalization of the transmission spectra (the amide II band at 1547 cm⁻¹, and the sporopollenin band at 833 cm⁻¹ that was used as the secondary standard), spectral deconvolution (multipeak fitting of three Lorentzian and two Gaussian curves), and application of the Beer-Lambert law, i.e. the linear correlation between absorbance (blue area under the curve with the peak centre at ~1745 cm⁻¹) and lipid quantity.

Pollen grain wall composition
The pollen grain has a double-layered outer wall consisting of the cellulose-rich intine layer and the highly resistant exine, which is composed predominantly of the complex biopolymer sporopollenin. As a result of their thick cellulose intine wall, pollen grains of the Cupressaceae family have unique and outlying spectral features (Fig. S3A). While Cupressaceae pollen grains have a thick intine and thus an increased relative amount of cellulose, pollen grains of the Pinaceae family have a large hollow projection (saccus) extending from the central body of the grain and composed only of exine, resulting in an increased relative amount of sporopollenin in the grains. For that reason, the vibrational bands associated with sporopollenin dominate the Pinaceae reflectance spectra to a larger extent than the Cupressaceae spectra (Fig. S3D).

Triglycerides content of pollen
The spectral variability caused by specific biochemicals can be extracted from the rest of the data (Fig. S7). In this way, the influence of a specific biochemical component, such as triglycerides, on spectral variability can be evaluated.
The data matrix containing the complete set of transmission spectra was deflated by using either 1) the PC 1q vector (the first principal component obtained by PCA on the Quercus subset, i.e. the set of transmission spectra obtained by measuring pollen samples of the Quercus genus), or 2) the reflectance spectrum of tristearin (Fig. S7D). For the deflation, the data matrix was centred and the vectors were normalized. The PC 1q vector accounts for the maximal spectral variance in congeneric pollen samples of the Quercus genus. Since both deflation procedures resulted in similar reduced data sets, as seen in the PCA plots of these sets (Figs. S7B and C), it is clear that congeneric differentiation of Quercus pollen, as well as of Pinus and Iris pollen, can be based on the relative content of triglycerides. The impact of triglycerides on spectral variance can be estimated more directly from spectral differences of congeneric pollen samples (Fig. S7E).

Fig. S7. PCA plots of the transmission IR spectral data set (300 species, three spectra per species; second derivative and EMSC corrected spectra) with depiction of plant genera: Pinus (blue), Iris (green) and Quercus (red). (A) Original data. The percent variances for the first five PCs are 23.83, 16.13, 11.30, 6.11, and 5.24. (B) Modelling the spectral contribution of triglycerides, approximated by the tristearin spectrum. The percent variances for the first five PCs are 24.87, 14.48, 9.40, 7.17, and 5.91. (C) Modelling the spectral contribution of PC 1q obtained by PCA on the Quercus subset. The percent variances for the first five PCs are 23.19, 15.27, 7.43, 6.22, and 5.71. Note the similarity between panels B and C: it is clear that triglyceride content is responsible for the majority of congeneric spectral variations. (D) Reflectance IR spectrum of tristearin (amorphous phase). (E) Transmission IR spectra of Quercus cerris (Turkey oak) and Quercus robur (pedunculate oak) pollen, and the spectral difference of the two transmission spectra. For better viewing the spectra are offset.
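
The deflation of a centred data matrix by a normalized vector, as described in this section, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the random stand-in for the tristearin spectrum and the synthetic data matrix are hypothetical, and deflation is taken here to mean projecting out the direction of the vector from every spectrum.

```python
import numpy as np


def deflate(X, v):
    """Remove from each (centred) spectrum its component along vector v."""
    Xc = X - X.mean(axis=0)          # centre the data matrix
    v = v / np.linalg.norm(v)        # normalize the deflation vector
    return Xc - np.outer(Xc @ v, v)  # subtract the projection onto v


def first_pc(X):
    """First principal component (loading vector) of the centred data."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


def pca_variance_percent(X, n=5):
    """Percent of total variance explained by the first n PCs."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s**2
    return 100.0 * var[:n] / var.sum()


# Synthetic example: rows are spectra, columns are spectral channels.
rng = np.random.default_rng(0)
reference = rng.normal(size=200)      # stand-in for a tristearin spectrum
scores = rng.normal(size=(60, 1))     # amount of the reference per sample
X = scores * reference + 0.1 * rng.normal(size=(60, 200))

pc1_subset = first_pc(X[:20])         # PC 1 of a subset, analogous to PC 1q

print(pca_variance_percent(X))                       # original data
print(pca_variance_percent(deflate(X, reference)))   # reference removed
print(pca_variance_percent(deflate(X, pc1_subset)))  # subset PC 1 removed
```

If the two deflations give similar variance profiles, as in panels B and C of Fig. S7, the subset's first principal component and the reference spectrum describe largely the same direction of spectral variation.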