Seasonality modulates the predictive skills of diatom based salinity transfer functions

The value of diatoms as bioindicators in contemporary and palaeolimnological studies through transfer function development has increased in the last decades. While such models represent a tremendous advance in (palaeo) ecology, they leave behind important sources of uncertainties that are often ignored. In the present study we tackle two of the most important sources of uncertainty in the development of diatom salinity inference models: the effect of secondary variables associated to seasonality and the comparison of conventional cross-validation methods with a validation based on independent datasets. Samples (diatoms and environmental variables) were taken in spring, summer and autumn in the freshwater and brackish ditches of the province of North Holland in 1993. Different locations of the same province were sampled again in 2008–2010 to validate the models. We found that the abundance of the dominant species significantly changed between the seasons, leading to inconsistent estimates of species optima and tolerances. A model covering intra-annual variability (all seasons combined) provides averages of species optima and tolerances, reduces the effect of secondary variables due to the seasonality effects, thus providing the strongest relationship between salinity and diatom species. In addition, the ¨all-season¨ model also reduces the edge effects usually found in all unimodal-based calibration methods. While based on cross-validation all four models seem to perform relatively well, a validation with an independent dataset emphasizes the importance of using models covering intra-annual variability to perform realistic reconstructions.


Introduction
Diatom assemblage changes provide an excellent basis for inferring environmental changes from seasonal to decadal or centennial scales given their sensitivity to a broad variety of habitat parameters. Available transfer functions cover nutrient status of freshwater bodies, temperature, as well as salinity dynamics [1][2][3][4][5]. Diatom-based models to infer salinity or tidal height have recently become an increasing focus in transfer function development as a potential tool in sea level reconstruction efforts [6][7][8]. These models require extremely high predictive precision given the often subtle changes they need to quantify. However, the application of transfer a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 functions to model a single variable (e.g. salinity) is problematic under multiple response triggers [9,10]. The effect of secondary gradients on model performance and importantly on the forthcoming reconstructions, has yet received surprisingly little attention [11]. Quantitative reconstructions are a form of space-for-time substitution [12], and as such require that the covariation between the modelled variable and potentially confounding underlying ecological factors is constant in both time and space. In most situations this is difficult to test, but nonetheless a highly unrealistic assumption. Firstly, this inherent problem of quantitative reconstructions is especially evident in relation to seasonality (referring to biological and chemical changes occurring in continental waters according to the different seasons in temperate climates). In diatom populations, seasonal succession is related to changes in nutrient concentrations, light [13], thermal stratification and predator-prey relationships [14,15]. In addition, diatom communities follow distinct seasonal succession patterns caused by changes in life-history traits [16,17] and nutrient stoichiometry [18]. Despite the high likelihood that a modern training set selected for building a transfer function will be influenced by the seasonality effects, the strength of such dependences is not known, yet. Only a limited number of diatombased studies have used contemporary data to facilitate interpretation of sediment core in terms of intra-annual diatom distribution [19]. The few studies available that have analyzed the effect of seasonality on diatom-based transfer functions generally focused on nutrient variables [19][20][21]. However, the effect of many other environmental variables can be obscured by seasonality, including salinity. The potential effect of seasonality is thereby extremely important, given the intra-annual variability of salinity that needs to be separated from the longterm changes [7].
A second limitation in the development of transfer functions is the lack of independent data sets to test the reliability of the reconstructions [22]. Usually the predictive ability of the transfer functions is only assessed by cross-validation methods. If the observations in the calibration set are not independent, because of autocorrelation or other types of pseudo-replication, performance statistics based on cross-validation will be over-optimistic [23]. Therefore, the ideal way of finding unbiased transfer-function performances is the use of an independent test set [11].
In the present study, we tackle two important uncertainty sources in diatom based transfer function development following the recommendations set outlined in Juggins [11]. We evaluate (1) the effect of seasonality when modeling a single variable, e.g.salinity, regarding the "true" ecological meaning of statistically significant models and (2) compare model performance between conventional cross validation methods and independent validation dataset. Both approaches aim to detect confounding environmental factors on the primary variable of interest.

Study area and datasets
The study was conducted in the province of North Holland, the Netherlands. The samples were collected in modified wetlands that have been reclaimed for agriculture representing a system of shallow brackish and freshwater ditches draining water from low-lying areas. Water tables in the three areas are kept within strict limits and the banks are bordered by reed belts dominated by Phragmites australis. Most of the diatom species and macrophytes found in these modified wetlands are also found in mesotrophic and eutrophic European lakes. Detailed description of the study area is provided in [24].
The dataset for the development of the diatom inference models is referred as¨training setä nd consists of a total of 96 samples from 32 locations sampled in spring (March), summer (June) and autumn (September) 1993. The samples were collected randomly in the province of North Holland spanning a salinity gradient from 200-9000 mg/l chloride. As a result, salinity represents the most important gradient in the dataset. Water samples were collected monthly at the same locations as the diatom samples and measured for: surface water oxygen, pH, conductivity, chloride, sulphates, transparency (Secchi depth), total nitrogen, total phosphorus and chlorophyll-a (results are summarized in Table 1). In this dataset, 400 diatom valves were identified per sample.
The dataset used for testing the performance of the models (henceforth referred to as "validation dataset") consists of 90 samples collected in drainage ditches within the framework of a long-term monitoring program in North-Holland carried out by the local water authority during spring 2008-2010. In this dataset, 200-300 diatom valves were identified per sample, according to the standard for routine monitoring in The Netherlands. Fig 1 shows the sampling locations for both datasets. In the validation dataset, water samples were collected monthly. The determination of chloride, nutrients and other environmental variables followed the standardized national protocols accredited by the Dutch Standards Institute (NEN-EN 13946). We used chloride average values of the spring months (i.e. March through May), assuming that these values reflect local conditions better than single measurements. A summary of the environmental variables measured in the validation data set can be found also in Table 1.
In both datasets, diatoms were sampled from reed stems (Phragmites australis), which is the most abundant and common emergent macrophyte in the drainage ditches. Reed is the recommended substratum for sampling periphyton in the Netherlands [25] in order to avoid differences caused by substratum heterogeneity. Attached diatoms were prepared using H 2 O 2 digestion and mounted on microscope slides with Permount Mounting Medium (Fischer Scientific, Pittsburgh). Taxonomic identification was based on the volumes of Krammer and Hoffmann [26,27] following standard protocols (NEN-EN 14407). Species percentage data were square-root transformed to reduce the weight of dominant species prior statistical analysis.

Data analysis
The effect of seasonality on diatom community structure. We performed two different tests to quantify the effect of seasonality on diatom community structure: 1) permutational multivariate analysis of variance (PERMANOVA) [28] to reflect changes in community composition (species identity and abundances) and 2) analysis of multivariate dispersions [29] that reflects changes in community heterogeneity or beta diversity.
PERMANOVA analyzes the variance of multivariate data explained by a set of explanatory factors on the basis of any dissimilarity measure of choice, thereby allowing for a wide range of empirical data distributions. The null hypothesis tested by PERMANOVA is that, under the assumption of exchangeability of the sample units among the groups:"the centroids of the groups are equivalent for all groups" [30]. The test of multivariate dispersions explicitly examines that ''the average within-group dispersion (measured by the average distance to group centroid) is equivalent among the groups". This test is equivalent to the popular Levene's test in univariate ANOVA but applied to the study of species assemblages by using dissimilarity indices [31].
The statistical significance of multivariate variance components were each tested using 9999 permutations of residuals under a reduced model [28] with an a priori chosen significance level of α = 0.05. All multivariate analyses were done on the basis of a Bray-Curtis dissimilarity matrix calculated from square-root transformed relative abundance data. To visualize multivariate patterns in assemblages across the three seasons, non-metric multidimensional scaling (NMDS) was used as an ordination method.
To determine which individual species contributed most to the differences between the seasons we used the species contribution to similarity method (SIMPER), which measures the percentage contribution of each species to average dissimilarity between two groups [7,32]. One Way Anova was performed to test if environmental variables differed between spring, summer and autumn. These analyses were performed in R with the vegan package [33].
The development of diatom based salinity transfer functions. The weighted optima of each species along the salinity gradient (chloride) were determined by averaging all values for each variable from the sites where the taxon occurred, weighted by its abundance at each site. The taxon's tolerance along each gradient was then calculated as an abundance-weighted standard deviation of the environmental variable [34].
We measured the explanatory strength of salinity as a predictor of diatom assemblage composition by calculating the ratio of the eigenvalue of the first (constrained) CCA axis (λ1) with salinity as a single explanatory variable with the first unconstrained axis (λ2). A value of λ1/λ2 greater than 1 indicates that the variable of interest represents an important ecological gradient in the training set and meets the criterion for a "useful calibration" [11]. To assess the potential confounding effect of other explanatory variables, we performed hierarchical partitioning ordination with the full suite of environmental variables [11]. The analysis were performed in Canoco 5.0. [35].
The WA-PLS regression [36] with "leave one out" cross-validation was used to develop statistical prediction models. This method combines the features of weighted averaging (WA) and partial least squares (PLS) and uses the residual correlation structure in the data to improve the fit between the biological data and environmental data in the training set [37]. The predictive abilities of transfer functions were assessed by examining the relationship between the observed and diatom-inferred values, as well as the observed and leave one outestimated values of the variables of interest in the training set (r 2 apparent and r 2 leave-one-out ), and evaluation of root mean square error of prediction (RMSEP) [38]. We have worked with the smallest number of 'useful' components, these are the ones giving a reduction of 5% or more in the cross-validated RMSE (compared with the RMSE for the 1 component model) [39].
We further evaluated the ability of the model to predict salinity through an independent validation: We used the data from the training set to develop transfer functions and tested their accuracy using the validation dataset. These analyses were performed using C2 version 1.7 software. [40].

Results
A total of 408 species were identified in the training set of which 179 species covered 98% of all observations and therefore only these were used to develop the models. In the validation dataset 253 species were identified of which 145 already covered 98% of the data set and were used for the reconstructions.
The PERMANOVA analysis revealed that diatom community composition is affected by seasonality, in terms of overall differences in species abundances: spring-summer p = 0.001; spring-autumn p < 0.001; summer-autumn p = 0.01 (Table 2 and Fig 2). The species that contributed the most to the differences between the seasons are common species present in all three sampling seasons, as it is revealed by the SIMPER method (Table 3). In contrast, the analysis of multivariate dispersions showed that seasonality does not affect beta diversity or community heterogeneity ( Table 2). In summary, these analyses indicate that the seasonal differences in diatom community structure are due to changes in species abundances and not to species identities. The optima and tolerances of species that contribute the most to community dissimilarity are shown on  [41]. Only two species have Hills N2 values lower than 5 and none of them belong to the all-season model: Mastogloia pumila 3.82; 3.51; 3.66 respectively in the Spring, Summer and Autumn models and Diatoma moniliformis 3.09; 1.66; 3.33 respectively in the Spring, Summer and Autumn models.
The environmental variables NO 3 , NH 4 , TN, TP. Cl and Chl a were log-transformed to achieve normality, in order to reduce skewed distribution of the response variable and hence reduce variance heterogeneity.
If instead of using averages of environmental variables of the summer months we have used whole year averages of environmental variables, none of them was sufficient in explaining diatom composition in any of the individual seasonal datasets. Therefore, environmental variables were calculated per season and the CCA were produced for the corresponding seasonal diatom dataset. Salinity thereby exhibits the strongest explanatory power (λ1/λ2>1) for diatom community composition in the training dataset: λ1/λ2 spring = 1.16; λ1/λ2 summer = 1.29; λ1/ λ2 autumn = 1.25 and λ1/λ2 all = 1.22 (Fig 4). Only two environmental variables showed statistical significant differences between the seasons according to One way ANOVA analysis: tN (F: 3.37; P = 0.038) and pH (F: 6.76; P = 0.002). Hierarchical partitioning revealed that in all seasons sampled, salinity is the variable that explained most of the variation on diatom community composition. From the total explained variation, salinity explained 25.8% in the spring model (unique is 9.5%), 27.8% in the summer model (unique is 9.7%), 25.2% in the autumn model (unique is 7.2%) and 35% in¨all-seasonm odel (unique is 9.7%) (Fig 4). The similarity and high percentage of total explained variation by chloride and sulfates indicates the correlation between these two variables. Also relatively high and expected correlation is found with conductivity (Table 4). Because chloride correlates to sulfates and conductivity in a different degree according to the model, it was not possible to use any rule of thumb to discard collinear variables because otherwise, models cannot be compared. Therefore, we decided to keep all environmental variables measured in all models to see the overall pattern of variance partitioning and be able to compare the three models.
The WA-PLS technique revealed that the most parsimonious models for salinity inference would be the one-component model for spring and autumn and the two-component model for summer and all seasons combined. The relationship between observed and predicted values of salinity were very strong with no apparent outliers (Table 5; Fig 5). The distribution of the most common species along the salinity gradient is shown in Fig 6. The validation of the model using an independent dataset is shown in Fig 7. The models to some extent overestimate salinity values at the low end of the gradients and underestimate the values at the high end, which was evident in the residuals vs. observed salinity plot (S1 Fig). The edge effect is clearly improved by the¨all-season¨model, providing the most accurate reconstruction out of the four models (Fig 7).

Discussion
Our study shows that seasonality effects were responsible for important shifts in species dominance in our training set, as it is evident from the highly significant PERMANOVA test results. Changes in species relative abundance during seasonal succession are usually advanced by the shift from early colonizers such as the adnate species Navicula perminuta [42] or poor competitors (Ctenophora pulchella, Gomphonema olivaceum) to species competing well under a resource limitation in late-successional communities (Cocconeis placentula, Gomphonema parvulum) [43][44][45]. Intra-species competition can be expected to play an important role in the nutrient-limited situation commonly occurring during late summer in freshwater and brackish environments [46]. Our data showed that total nitrogen significantly decreased in summer and autumn compared to spring. This could reflect changes in nutrient stoichiometry which might be driving differences in seasonal species abundances. Besides competition under resource limitation, inter-specific interactions, such as grazing, are also important factors driving seasonal shifts in species abundances [14]. As a result of these changes in species abundances, optima and tolerances of species are different if the samples are collected in spring, summer or autumn, even when salinity did not show statistically significant changes between the seasons (Fig 3). The species collected in summer tend to have higher optima estimation, the ones collected in autumn tend to have lower optima while the ones collected in spring showed both, higher or lower optima than the ones provided by the¨all-season¨model. Further investigation is needed in order to confirm if this     seasonality effect can be generalized for the variable chloride and how the seasonality effect will be regarding the optima estimation of other environmental variables. In our dataset, such differences in the contribution to seasonal community dissimilarity were observed for Cocconeis placentula and Rhocoisphenia abbreviata with optima ranging from 1300-2800 and 900-1600 mg/l mg/l chloride based on autumn and spring models, respectively. The effect of seasonality investigated here emphasizes a crucial aspect negatively affecting the accuracy of transfer functions. That is, that almost in every environmental variable measured, an important fraction of explained variation is shared with other environmental variables [11]. In our study, even when salinity has the highest contribution to the total percentage of explained variation, still around half of it is shared with other environmental variables, such as sulphates, conductivity, secchi depth, total phosphorus and temperature. As it is expected, high correlation between salt ions (chloride and sulphates) and between ions and conductivity were found in all training sets (spring, summer and autumn). Especially prone to be affected by secondary variables is the training data collected in summer and autumn where chloride concentration is significantly related to nutrient variables (total phosphorus), temperature and secchi depth (Table 4). In summer, phosphorus released from the sediment increases due to sulfate induced phosphorus mobilization [47], and this is reflected in the positive correlation between ions and total phosphorus. As a consequence of increased water nutrients, visibility is also affected. Both, the spring dataset and the dataset that combines the three seasons, are less prone to the effect of secondary variables (e.g. nutrients) and showed lower correlation with potentially nuisance variables. Moreover, the¨all-season¨model showed the highest percentage of explained variation by salinity (Fig 4). This model also improves the non-linear distortions at the end of the gradients (edge effects) that are an inherent problem of all unimodalbased calibration methods using weighted averaged estimations [7,36,39,48] (Fig 5). Although the weighted inverse deshrinking regression incorporated in WA-PLS reduces the edge effect, it has its own problems by "pulling" the predicted values towards the mean of the calibration set resulting in an inevitable bias with some over-estimation at low values and some underestimation at high values, as it is evident from S1 Fig [36 ,39,49]. In the¨all-season¨model this bias is reduced because the salinity signal is stronger (the total explained variation by salinity is much higher as compared to the other variables) and also because the number of samples has increased from 32 to 96 (which improved species optima estimation). The dataset size of thë all-season¨model falls into the range of � 100 suggested by   [50] for the greatest improvement of the RMSEP. To verify whether the good performance of the¨all-sea-son¨model resulted from the higher sample number as compared to the other models, we also built the all-season model by randomly selecting one sample out of the three seasons in all sampling locations. The result of the model performance is the same: r 2 leave-one-out = 0.86; RMSEP = 0.19.
When we tested the performance of the models using the completely independent validation dataset, we also found that the¨all-season¨model performed the best. In the reconstructions all models tend to underestimate the values at the high end of the gradient, and the underestimation is in increasing order from the summer, spring and autumn models (due to the decreasing trend in optima estimation). However, the underestimation of the high salinity values is the lowest in the¨all-season¨model. As a result, this model provides the most realistic reconstruction even though some edge effects remain (Fig 7).
The reasons for this offset are several: i. the effect of secondary variables in the training sets. The significant correlation between salinity and potential nuisance variables in summer and autumn (total phosphorus, secchi depth and temperature) lies between 0.32-0.54 R 2 values (Table 4). According to Juggins (2013), for training sets that exhibit a correlation 0.2< R <0.5 to a nuisance variable, a change in the latter can produce spurious fluctuations in the reconstructed variable typically greater than 10 units (regarding a simulated environmental variable ranging from 1 to 100).
In addition, even a low correlation to a nuisance variable (R <0.2) affects the reconstruction in at least 10 units. This corresponds to 0.1 of the gradient length studied in Juggins [11] which in this study will be equivalent to �880 mg/l of chloride change. In the independent validation we have found a bias of less than 500 mg/l chloride in around 68%-77% of the observations in the four models and a bias between 2000-4000 mg/l chloride in around 12% (autumn model) -4.5% (¨all-season¨model) of the observations.
ii. the ratio λ1/λ2 in the validation dataset is lower than in the models (λ1/λ2 validation = 0.88; S2 Fig). The significant correlation between salinity and nuisance variables in the validation dataset (total nitrogen and dissolved oxygen- Table 4) could be the reason of this lower ratio and of obscuring the effect of chloride in the distribution of diatom communities. When transfer functions are used to perform palaeoenvironmental reconstructions, the strength of the relationship between the species and the modelled variable is not known. We have in the other hand, ways of reducing the error of prediction by taking seasonality effects in to account when developing diatom based transfer functions.
iii. there are always some species mismatches between the transfer function and the reconstruction data which lessen the accuracy of palaeoenvironmental reconstructions. In our study, indicators of high salinity in the models (Mastogloia pumila, Rhopalodia brebissonii, Navicula arenaria) were not present or very scarce in the validation dataset. This could be due to problems in the taxonomic identification (e.g. species of the genus Navicula), differences in the number of individuals identified between the model and the reconstruction datasets, or due to ecological factors not taken into account (dispersal, grazing, etc). In addition, brackish species usually have very wide tolerances due to the fact that these species are well adapted to the fluctuating environmental conditions in shallow brackish habitats [51]. An example are species of the genus Diatoma [52]. Other species such as Rhoicosphenia abbreviata and Cocconeis placentula were present in the whole salinity gradient. These are pollution tolerant species [53] that do not respond to the salinity changes considered in this study and thus, may also introduce noise in the reconstructed values. This could be the reason why the greatest bias in model performance occur in the high end of the gradient. High salinity observations (>2000 mg/l chloride) represent only 12% of all the samples in the validation dataset and euhalobe or polyhalobe diatoms may be underrepresented. With the general aim to collect representative community data, not only the taxonomic identification but also the quality of the indicator may have a considerable impact in (palaeo) environmental reconstructions. It has been suggested that species with very wide tolerances to the environmental variable of interest may therefore be eliminated for transfer functions to increase model performance [54]. This is a very interesting field of research and deserves further investigation, so that appropriate recommendations can be made towards further standardization. The results presented in this study highlight the risk of developing transfer functions based on environmental variables and/or diatom samples collected in individual seasons, only. In palaeocological studies, sediment core samples contain diatom assemblages that have already been subject to temporal and spatial integration. However, the development of the transfer function always required a modern data set. Despite the high number of studies stressing the potential effect of confounding environmental variables and the application of temporal dependent calibrations [11,55,56] the effect of seasonality on the accuracy of diatom-based transfer functions has so far received only little attention. Environmental datasets based on samples taken during one season or calibrated against single measurement are very common [6,57,58] while in other cases sampling time is not specified [3]. Potential inaccuracies and low predictive skills of models based on single season data from sites with highly variable hydrology and water chemistry were proposed already [56]. Examples include diatom calibration data sets that use environmental measurements of annual means that do not necessarily reflect the specific concentrations present when diatoms are growing [20]. Regarding the diatom data, it is common to find in the scientific literature samples that do not integrate seasonal variability. For example when diatom samples are collected from substrates such as macrophytes, stones, pebbles or other hard substrates [7,59,60]. In addition, some of the data sets publicly available to use in transfer function development (European Diatom Database) are also composed by a mixed samples taken from epipelic, epiphytic or plankton samples, and used in several publications. If we compare the most accurate model (¨all-sea-son¨) with the less accurate (autumn model) in our study, we found an underestimation of chloride concentration about 42% and 70% respectively. The effect of seasonality is increasing the error and uncertainty in the model estimation, in addition to the error caused by nuisance variables. The results of our study, and specifically from the independent validation, demonstrate the importance of a comprehensive dataset that takes on board the differences in species abundances due to seasonality to improve the accuracy of diatom based transfer functions.

Conclusions
Assuming that diatom inference models are invariant in space and time has important consequences for the reliability of environmental reconstructions. The present study has shown that seasonality modulates the predictive skills of diatom transfer functions by driving changes in the abundance of the dominant species. Transfer function performance of a single season model (e.g. spring, summer or autumn) is reasonably good in the model cross-validation. Nonetheless the validation of the models with an independent dataset demonstrates that single season sampling leads to strong underestimation of salinity due to the confounding effect of other environmental variables such as temperature, secchi depth, nutrients and dissolved oxygen. The strength of salinity as the main environmental variable explaining the distribution of diatom communities is enhanced in the¨all-season¨model. The¨all-season¨model combines all data and provides the highest percentage of explained variation by salinity, increases the accuracy of species optima and tolerance estimations, and reduces the non-linear distortions at the end of the gradients.
One of the basic requirements for quantitative palaeoenvironmental reconstruction is the availability of a large, high-quality data set of modern species assemblages and contemporary environmental data. Given the strong effect of seasonality on the composition of diatom assemblages, the development of transfer functions capturing intra-annual variability of species abundances and independent model validation is strongly recommended.  different datasets (spring, summer, autumn, all seasons and validation datasets). The ratio of λ1/λ2 is also shown for each case. (JPG)