Characterization of Lavandula spp. Honey Using Multivariate Techniques

Traditionally, melissopalynological and physicochemical analyses have been the most used to determine the botanical origin of honey. However, when performed individually, these analyses may provide less unambiguous results, making it difficult to discriminate between mono and multifloral honeys. In this context, with the aim of better characterizing this beehive product, a selection of 112 Lavandula spp. monofloral honey samples from several regions were evaluated by association of multivariate statistical techniques with physicochemical, melissopalynological and phenolic compounds analysis. All honey samples fulfilled the quality standards recommended by international legislation, except regarding sucrose content and diastase activity. The content of sucrose and the percentage of Lavandula spp. pollen have a strong positive association. In fact, it was found that higher amounts of sucrose in honey are related with highest percentage of pollen of Lavandula spp.. The samples were very similar for most of the physicochemical parameters, except for proline, flavonoids and phenols (bioactive factors). Concerning the pollen spectrum, the variation of Lavandula spp. pollen percentage in honey had little contribution to the formation of samples groups. The formation of two groups regarding the physicochemical parameters suggests that the presence of other pollen types in small percentages influences the factor termed as “bioactive”, which has been linked to diverse beneficial health effects.


Introduction
Lavender is the popular name for the plants of the genus Lavandula, Lamiaceae family. This genus contains many species, among which several are grown extensively in temperate climates for ornamental purposes, for use as aromatic herbs or for oil extraction. In beekeeping, the Lavandula honey is greatly appreciated by consumers due to its pleasant aroma and flavour. Recently, several research studies have studied the physicochemical and sensory properties of this honey, as well as those related to their bioactive compounds [1,2,3,4,5].
Monchique (n = 10), totalizing 112 Apis mellifera honey samples that were harvested by beekeepers and delivered to the laboratory, where these were kept at 25°C in the dark until analysis.

Samples' characterization
The physicochemical properties of honey samples were performed according to the methods previously described in detail [29,30]. The evaluated parameters were: moisture (%), ash (%), electrical conductivity (mS/cm), hydroxymethylfurfural content (HMF) (mg/kg), free acidity (meq/kg), diastase activity (Schade units/g), reducing sugars (%), apparent sucrose (%), pH and proline (mg/kg). The protein content (mg/kg) was determined according to the method described by Nogueira et al. [31]. The total phenolic content of honey samples was estimated following the Folin-Ciocalteau method [32]. For the total flavonoid determination, a method described by Kim et al. [33] and modified by Al et al. [34] for honey sample was used. For each honey sample we performed three replicates of each parameter (S1 Dataset).
All the samples were subjected to pollen analysis by acetolysis method [35]. The examination of the pollen slides was carried out with a Leitz Diaplan microscope (Leitz Messtechnik GmbH, Wetzlar, Germany) at 400× and 1000× in order to make a sound identification of the pollen types. A minimum of 1000 pollen grains were counted per sample. To recognize the pollen types, it was used the reference collection from the CIMO-Mountain Research Center (Agricultural College of Bragança, Polytechnic Institute of Bragança) and different pollen morphology guides. The following terms were used for pollen frequency classes: predominant pollen (P, more than 45% of pollen grains counted), secondary pollen (S, 16-45%), important minor pollen (IM, 3-15%) and minor pollen (M, 1-3%) [13] (S2 Dataset).

Statistical analyses
Mean, medians, percentiles, and standard errors of the means (SEM) for physicochemical parameters, bioactive compounds (total phenol and total flavonoid) and pollen data were calculated.
The data of the physicochemical variables, phenolic compounds and Lavandula spp. pollen percentage in honey were analysed by multivariate factor analysis. The pollen of Lavandula spp. was included in this analysis because not only was it found in all samples, but its relative frequency in honey was higher than 15%. Persano Oddo and Piro [1] and Gomes et al. [3] considered this percentage sufficient to characterize the honey as monofloral for Lavandula spp.
After we obtained the correlation matrix X'X, we performed a diagnosis of multicollinearity based on the condition number. The adequacy of the data for the multiple factor analysis was performed by using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin measure of sampling adequacy. Regarding the technique of factor extraction, we used the principal component technique. The factors were extracted until we obtained the baseline of 60% of cumulative variance [36]. After we extracted the factor loadings, the factors were established by rotation by the varimax method.
The data obtained in the physicochemical and melissopalynological analyses of the honey were examined with Non-metric Multidimensional Scaling (NMDS), employing Euclidean distance after chord transformation. After we built the dissimilarity matrix with the normalized data, we used the command "metaMDS" to generate random and interactive processes to find the best solution possible. The goodness of fit measured of the NMDS was evaluated according to "stress" and Shepard diagrams. Next, we added the information from a result of clustering for ordering the NMDS. To do that, we calculated the clustering UPGMA of the dissimilarity matrix. For the estimate of the fitting between the dissimilarity matrix and the dendogram generated, we calculated the cophenetic correlation coefficient (CCC). After we identified the clusters, we tested the results of the physicochemical analyses using the t test of comparison of averages (p<0.05). All statistical analyses were performed using "R" statistical and programming environment version 3.0.2 [37].

Results and Discussion
To assess the quality of Portuguese honey samples and to determine their botanical origin, we carried out physicochemical analyses and determined the concentrations of the bioactive compounds. In the analyzed honey samples, all physicochemical parameters fulfilled the general honey quality standards established by the European legislation [38,39], apart from diastase and sucrose content. The diastase activity ranged from 4.55 to 25.21 Schade units (mean value of 12.02 ± 0.36 SEM), while the percentage of apparent sucrose was between 0.87 and 14.23% (mean ± SEM of 4.99 ± 0.31%). For the former parameter, 11% of the samples were not in agreement with the legislated limits; for the later the percentage was higher-24%.
In general, international legislation of honey determines that diastase activity should be not less than 8 Schade units and the sucrose content not more than 5% [38,39]. However, it must be noticed that for some particular honey types like monofloral from Lavandula spp., current norms and regulations allow values of diastase activity between 3 and 8 Schade units (since HMF content is inferior to 15 mg/kg) and contents of sucrose until 15% [38,39]. As such, it can be assumed that the assessed honey samples were in accordance with the legislation, because the HMF values were lower than 15 mg/kg and the relative frequency of Lavandula spp. pollen higher than 15%, which is sufficient to consider the botanical origin of the honey as monofloral of this plant [1,3].
The total phenols, flavonoids and proline concentrations presented great variations between samples, which can be confirmed by the highest estimates in the standard errors of the means. However, in spite of these differences, the content of proline was always bellow 180 mg/kg. This is important since according to Bogdanov [30] concentrations above that threshold could indicate adulteration or suggest premature honeys' harvest. Considering the other parameters, except for the maximum and minimum values, 90% of the samples had less variation (Table 1). In fact, the amount and type of bioactive compounds depends largely upon the floral source/ variety of the honey, seasonal and environmental factors, as well as conditions of processing and storage [40,41]. Factor analysis was performed to describe the original set of physicochemical variables in a smaller number of factors and to interpret through the factor loadings or model parameters the correlations between these and the original variables. The Kaiser-Meyer-Olkin sampling adequacy test was 0.61 and the Bartlett sphericity test was significant (p<0.001), indicating that the correlation matrix X'X is not an identity matrix and that there are significant sample correlations between the physicochemical variables, which is suitable for multivariate factor analysis [36]. The multicollinearity diagnosis indicated low collinearity (condition number < 100), so it was decided to not exclude any database variable.
Factor analysis indicated that 72.00% of the total variation of the physicochemical parameters can be explained by the overall effect of the main five factors. The first five factors have eigenvalues that correspond to 23.00%, 16.00%, 13.00%, 13.00% and 8.00% of the total variance (Table 2). Similar results were already reported by Marchini et al. [21] and Abadio-Finco et al. [22].
The final factor loadings, obtained by varimax rotation method, associated to each factor are presented in Table 3.
Factor 1, named "bioactive", was very strong and positively associated with the proline variable and very strong and negatively associated with the variables total phenols and flavonoids. Factor 2, designated as "minerals", was strong and positively associated with electrical conductivity, acidity and ashes variables. Factor 3, "botanical origin", had a higher positive association with the variables apparent sucrose and pollen of Lavandula spp. Factor 4, denominated "quality" was positively associated with the variables humidity, reducing sugars, pH and diastase. Finally, the factor 5, "other variables" had a strong positive association with HMF variable, and a negative association with the variables acidity and protein ( Table 3).
The factor analysis was not able to reduce the number of original physicochemical variables to produce ordinations of sampling sites in a two-dimensional graphic to classify honey samples. So, it was decided to use the NMDS to express the relation between variables, as well as between the sampling sites and the variables and, then, the cluster analysis.
The corresponding Shepard diagrams (Fig 1) indicate that as adjustment levels are raised, the distances portrayed in ordination space are more linearly related to those on which the calculations are based. In Fig 2 is presented a more similar group of samples, located on the right and with higher values for phenols and flavonoids, which are more correlated to each other. A second group of similar samples located on the left in the ordination presented higher values for the other physicochemical parameters, which are more correlated between them. In Fig 2, the associations between the variables were similar to the factor analysis (Table 3), indicating the similarity between the techniques for the obtained answers.
We extracted two groups using the criterion of the silhouette width from the result of cluster analysis of honey samples in relation to physicochemical parameters (Figs 3A and 4). The cophenetic correlation coefficient was 0.72, which is reasonable as a factor of hierarchical representativity [42].
In Fig 4 it can be observed that the green group on the right differed significantly from the red located on the left regarding the contents of flavonoids (p = 0.002) phenols (p = 0.0001) and proline (p = 0.0001). On the other hand, it were not detected differences amongst groups for the other physicochemical parameters, namely concerning protein content (p = 0.42) and sacarose (p = 0.30) ( Table 4). This highlights the great similarity amongst most of the samples, except for the "bioactive" factor extracted by factor analysis (Table 3). Also, it suggests that such differences on this factor may be due to the presence of other botanical families of pollen apart from Lavandula spp. (non significative, p = 0.273).
The botanical origin is one of the factors that most influences the content of phenolic compounds [43]. Even though we observed significant differences amongst groups regarding the "bioactives" factor, this does not collide with the prior characterization of the samples as  [10] also observed differences on the concentrations of phenols and flavonoids obtained for different samples harvested in different places from the same region and with identical botanical origin. In addition, Meda et al. [44] reported that the content of phenolics and proline differed significantly whether the samples were monofloral or multifloral.
Proline, which may constitute up to 70% of the free amino-acid pool in pollen grains, has long been regarded as playing a pivotal role to pollen vitality and fertility. Also, it has also been reported to be associated with pollination, since appear to have a strong preference for prolineenriched nectars [45]. In this context, the differences observed between groups regarding proline content (p = 0.002) also support the presence of other polinic types apart from Lavandula spp. that, as described above, may influence the concentration of bioactive compounds.   Table 4. Comparison between the groups extracted from the UPGMA cluster analysis of the chord distance matrix for physicochemical parameters of honey.

Variables
Grup 1 (n = 42) Grup 2 (n = 70) t p-value The concentration of bioactive compounds determined in this study for the samples of monofloral Lavandula spp. honey was high. Polyphenols are naturally occurring compounds found largely on plants that are generally involved in defense against ultraviolet radiation, oxidative stress or aggression by pathogens, presenting pleiotropic health beneficial effects and potential therapeutic applications [46].
The amino acid composition of food protein hydrolysates has been reported as a major determinant of their antioxidant properties. Indeed, hydrophobic amino acids, among which proline, leucine, valine and methionine, have been considered as very important in the protection against oxidative stress [47] reason by which their determination is an important complement whenever antioxidant activities of a food product are under assessment [44]. In addition, the assessment of the interaction between peptides and other chemical components also allows a more complete approach of the biological systems and their subsequent application on the diagnosis and treatment of human diseases where free radicals are implicated [48].
In order to determine the nectar source of the honey samples, we performed melissopalyinological analyses. We identified 25 pollen types out of 112 samples that we analysed, being Cistus spp., Echium spp. and Lavandula spp. identified in more than 50% of the samples. Eleven pollen types, Acacia spp., Cardus spp., Chamaespartium spp., Eucalyptus spp., Genista spp., Levatera spp., Medicago spp., Pinus spp., Quercus spp., Thimus spp. and Vicia spp. were detected in less than 10% of honey samples. The pollen of Lavandula spp. was detected in all samples, being its percentage higher than 45% in 23 samples and between 16-45% in the remaining 89 honey samples. The percentage of pollen grains that were not Lavandula spp. was greater than 45% in only one sample. In this honey, the percentage of pollen from Erica sp. was 52.56% (Table 5).
Although the pollen of Lavandula spp. was found in all samples, it had a higher percentage in those at the bottom of the ordination figure. The pollens of Castanea sativa, Erica spp. and Rubus spp. (secondary pollens in 13, 15 and 11 samples, respectively), and of Quercus spp. and Medicago spp. (detected in six and seven samples, respectively) are more closely associated with each other and with the samples located at the top of the ordination figure. Some pollen types, such as Acacia spp., Chamaespartium spp. and Pinus spp. had low frequency in the samples and lower contribution to the ordination (Fig 5).
We extracted two groups using the criterion of the silhouette width, from the result of cluster analysis of honey samples in relation to pollen types (%) found (Fig 3B). The cophenetic correlation coefficient was 0.75, which is reasonable as hierarchy representativeness factor (Sneath and Sokal, 1973). In Fig 6, it was verified a great similarity between the samples in respect to the percentage of pollen types found in honey, resulting in the formation of only two groups. A group of 18 samples, in green in Fig 6, represents the honey samples that had the lowest percentage of Lavandula spp. pollen, compared to the larger group in red. This same group had a greater association with Erica sp., Rubus sp. and Castanea sp., which were secondary pollens in many samples and pollen of Medicago sp. and Quercus sp., which were found in few samples and with a lower percentage in comparison to other pollen types (see Fig 5 and Table 5).

Conclusion
It was observed great similarity between honey samples for most of the physicochemical parameters with the exception of flavonoids, phenols and proline. The percentage of Lavandula spp. pollen grains determined evidenced that all the honey samples under study were monofloral for this specie. The higher percentage of flavonoids, phenols and proline registered in some honey samples may be due to the presence of other secondary pollen species in the samples, particularly Erica spp., Rubus spp. and Castanea spp. Additionally, higher percentages of Lavandula spp. pollen grains were associated with superior sucrose contents. Furthermore, the sucrose content may be a parameter to assist in determining the botanical source of honey.