Predicting the Electron Requirement for Carbon Fixation in Seas and Oceans

Marine phytoplankton account for about 50% of all global net primary productivity (NPP). Active fluorometry, mainly Fast Repetition Rate fluorometry (FRRf), has been advocated as a means of providing high resolution estimates of NPP. However, FRRf does not measure CO2-fixation directly; instead, it provides photosynthetic quantum efficiency estimates from which electron transfer rates (ETR) and ultimately CO2-fixation rates can be derived. Consequently, conversion of ETRs to CO2-fixation rates requires knowledge of the electron requirement for carbon fixation (Φe,C, ETR/CO2 uptake rate) and its dependence on environmental gradients. Such knowledge is critical for large scale implementation of active fluorescence to better characterise CO2-uptake. Here we examine the variability of experimentally determined Φe,C values in relation to key environmental variables with the aim of developing new working algorithms for the calculation of Φe,C from environmental variables. Coincident FRRf, 14C-uptake and environmental data from 14 studies covering 12 marine regions were analysed via a meta-analytical, non-parametric, multivariate approach. Combining all studies, Φe,C varied between 1.15 and 54.2 mol e− (mol C)−1 with a mean of 10.9±6.91 mol e− (mol C)−1. Although variability of Φe,C was related to environmental gradients at global scales, region-specific analyses provided far improved predictive capability. However, use of regional Φe,C algorithms requires objective means of defining regions of interest, which remains challenging. Considering individual studies and specific small-scale regions, temperature, nutrient and light availability were correlated with Φe,C, albeit to varying degrees and depending on the study/region and the composition of the extant phytoplankton community. At the level of large biogeographic regions and distinct water masses, Φe,C was related to nutrient availability, chlorophyll, as well as temperature and/or salinity in most regions, while light availability was also important in Baltic Sea and shelf waters. The novel Φe,C algorithms provide a major step forward for widespread fluorometry-based NPP estimates and highlight the need for further study of the natural variability of Φe,C to verify and develop algorithms with improved accuracy.


Introduction
Accurately evaluating the impact of local environmental and global climate change upon trophic dynamics and biogeochemical nutrient cycling is fundamentally tied to how well primary productivity, defined here as carbon (CO2) fixation, is characterised. Following the incorporation of inorganic radio-labelled 14CO2 into algal cells has become a standard method for quantifying primary productivity. However, to this day, there still exists considerable uncertainty as to whether 14CO2 uptake measures gross primary productivity (GPP) or net primary productivity (NPP). Following the traditional view, GPP refers to carbon fixation without accounting for any carbon losses due to respiration and/or excretion, while NPP represents the carbon uptake rate after subtracting any CO2 lost to oxidation of organic carbon over a diel cycle [1].
Of all global NPP, marine ecosystems account for ca. 50% [2], an amount equivalent to ca. 51×10^15 g of fixed carbon per year [2,3]; almost all of this productivity is from phytoplankton. However, marine ecosystem-scale productivity estimates contain a high degree of uncertainty, since they are ultimately derived by extrapolation from discrete measurements of NPP or GPP [4,5] with limited spatial and temporal resolution. Remote sensing (ocean colour) productivity algorithms are the most widely used tool for making these extrapolations in order to better characterise the nature and extent of variability in NPP in marine ecosystems. However, these productivity algorithms are explicitly dependent on relatively few discrete, surface "truth" 14C primary productivity measurements. Many researchers have, therefore, turned to high resolution bio-optical-based approaches in order to meet this challenge.
Pulse Amplitude Modulated (PAM; [6]) and Fast Repetition Rate (FRR; [7,8]) fluorometry provide the potential to dramatically increase the number of estimates of NPP in marine ecosystems [8]. As with other bio-optical sensors, active fluorometry can be utilised in situ and thus, measurements of productivity can be linked directly to measurements of physical/chemical variables at the time of sampling [9-11]. In addition, data can be collected at high temporal (seconds) and spatial resolution and/or over large scales [12]. If direct, in situ measurements of NPP using active fluorescence could be achieved, the advantages of this approach would represent a step change in operational capacity, in terms of accuracy and resolution, compared to 'conventional' 14C-based approaches, which are limited by the need to incubate relatively large water samples, often for long durations [13], with attendant potential problems due to 'bottle effects' [14,15]. Thus, the widespread implementation of active fluorometers on broad-scale temporal or spatial sampling platforms, such as ships of opportunity and moorings, would represent a major advance in evaluating the variability of primary productivity in seas and oceans.
Active fluorescence can be used to estimate the rate at which electrons flow from water through photosystem II to NADPH (the so-called linear photosynthetic electron transfer rate, ETR), and other electron acceptors. ETR is most closely related to the rate of gross O2 evolution, and potentially less directly to the subsequent production of energy (ATP) and reductant (NADPH) used to fix CO2. Thus, measurements of ETR by active fluorescence, at best, provide accurate estimates of the rate of gross O2 evolution from PSII [16,17] and, hence, GPP. Unfortunately, many applications (e.g. climate models, fisheries assessments) require that primary productivity be expressed not as ETR or O2 evolution but rather in the photosynthetic currency (sensu Suggett et al. [18]) of fixed CO2, that is, NPP. As such, the applicability of broad-scale active fluorescence-based measurements of ETR is potentially limited if ETRs cannot be easily converted to GPP and eventually NPP. Such conversion requires knowledge of the ETR to CO2 fixation ratio, i.e. the electron requirement for carbon fixation (Φe,C) with units mol e− (mol C)−1. With ETR and short-term (hours) CO2 fixation incubations representing GPP rather than NPP [19,20], Φe,C also captures a gross rather than a net efficiency.
Since the introduction of active fluorescence to marine research, a number of studies have attempted to compare measurements of ETR with (quasi-)simultaneous measurements of CO2 uptake, the latter representing GPP, NPP or something in between depending on incubation times (1 hour to a full diel cycle) [10,21-23] (Table 1, references therein). Initially, these exercises aimed to evaluate whether ETRs provided a robust quantification of GPP [32-35]. More recent studies have sought to understand the extent and nature of variation between the ETR and GPP and/or NPP [10,11,18,36]. Numerous biochemical processes other than CO2 fixation can act to consume electrons, ATP and/or reductant; for example, the oxygenase activity of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) via photorespiration [32], chlororespiration via a plastid terminal oxidase (PTOX) [33,34], Mehler Ascorbate Peroxidase (MAP) activity [32] and nutrient assimilation [35,37]. All these processes are expected to exhibit both taxon-specific and environmental dependencies with corresponding variability of Φe,C [11,18]; indeed, a recent compilation of published FRR-based Φe,C data sets suggested the existence of some general taxonomic patterns in the variability of Φe,C [38]. However, as yet, no single (global) systematic evaluation has attempted to consider how environmental gradients regulate Φe,C. Clearly, active fluorometry could become a much more powerful tool for estimating GPP (or even NPP) in carbon currency if generic relationships describing the dependency of Φe,C upon routinely measured variables can be established. Therefore, we constructed a database of FRR-based measures of Φe,C and associated environmental variables known to regulate primary productivity (e.g. temperature, nutrients and light) from both previously published and unpublished data sets and used a meta-analytical approach to determine the predictability of Φe,C from environmental variables. Specifically, we addressed the following three questions: (1) To what extent does Φe,C vary within and between oceanic areas and water masses? (2) Can a 'global' Φe,C ever be applied or should region-specific values always be used instead? (3) How is the variability of Φe,C related to that of environmental factors or combinations thereof (light attenuation, nutrients, temperature, salinity, etc.)? Our analysis demonstrated that variability in Φe,C was strongly correlated with the availability of light and nutrients, as well as temperature and/or salinity, albeit to varying degrees and depending on how data are organized into region-specific subsets. Predictive algorithms based on these relationships are presented. Treatment of data at the regional scale provided much improved predictive capability of these algorithms relative to large scale or global algorithms. Although many of the algorithms still need further verification and revision, this approach demonstrates strong relationships between Φe,C and environmental gradients in many areas of the ocean.

Table 1 notes (partially recovered; the table body is not available): [...], Ariake Bay (Japan) and Bedford Basin (Canada). Methodological differences existed in the way data were corrected for spectral discrepancies between the fluorometer and 14C-incubator light source (σPSIIspec), in the estimates of the number of photosystem II reaction centres (nPSII), and in the approach used to compare electron transport rates (ETR) and 14C fixation. (+) Spectral corrections applied according to Moore et al. [11]. In some samples, a spectral correction factor of 1.75 was used while a spectral correction was [...]. n.a. denotes not applicable because primary productivity was measured on samples incubated in the cuvette holder of a FASTact fluorometer. nPSII was assumed to be constant (0.0033 or 0.0020 mol RCII (mol chla)−1) or calculated either according to nPSII = 500 (Fv/Fm)/0.65 or according to Oxborough et al. [7]. ETRs were either measured in situ or on discrete samples using a (1) FASTtracka or (2) FASTact FRR fluorometer, a (3) FIRe benchtop fluorometer or a (4) FRRF Diving Flash. Superscript 5 denotes photosynthesis-irradiance curves measured on samples incubated for 1-4 hours using a photosynthetron [32] or equivalent incubator; 6 [...].

Data compilation
A comprehensive assessment of the variability in Φe,C from published literature and previously unpublished data was performed. Web of Science and JSTOR search engines were used to retrieve published data on coincident FRR fluorescence-based ETRs and carbon fixation rates from marine habitats, yielding 17 studies. However, only 7 of these studies reported key environmental variables in addition to Φe,C or both ETRs and CO2-fixation rates (resembling either NPP or GPP depending on the incubation lengths and growth conditions), from which Φe,C could be calculated. Previously unpublished data corresponding to parallel ETR and CO2 fixation measurements from an additional 7 field campaigns in 2004-2011, some of which were undertaken as part of PROTOOL (http://www.protool-project.eu/), were also included (Table 1). Together, these data sets included ten research cruises and four time series studies covering a range of different geographical areas (Fig. 1), including the temperate, tropical and subtropical Atlantic Ocean (AMT6, AMT11, AMT15), Massachusetts Bay (USA), Bedford Basin (Canada), Ariake Bay (Japan), the Celtic and Irish Sea (D246, JR98), the North Sea (CEND0811/PROTOOL), the Gulf of Finland (including SUPREMO11/PROTOOL), the Baltic Sea (SYNTAX 2010/PROTOOL), UK and European shelf waters (D366/PROTOOL) and the Pacific Ocean (BIOSOPE).
A comprehensive data matrix with 333 different samples and their corresponding physico-chemical and methodological variables was created. Physico-chemical variables included salinity, temperature, nutrient concentrations (NO3− and PO43−), the diffuse vertical attenuation coefficient (Kd) of photosynthetically active radiation (PAR), optical depth (ζ) and chlorophyll a (chla), with the latter being used as a proxy for phytoplankton biomass. We fully acknowledge that other environmental variables, such as silicate or iron, are often also key in determining phytoplankton community composition and physiological responses and, hence, influence Φe,C. However, these data were either not collected as part of the studies included here or not available. Methodological variables included differences in FRR protocols and 14C incubation techniques (see below). Locations of samples were characterized by latitude and longitude, while seasonal differences were characterized by converting sampling dates to Julian day. All the environmental data were either kindly provided by the authors of the published work or the British Oceanographic Data Centre (BODC, www.bodc.ac.uk), or were taken from the original publications and digitized (Plot Digitizer 2.5.0, Free Software Foundation) from relevant figures.
Values of Kd (in units of m−1) were derived from vertical irradiance profiles for the AMT cruises, Bedford Basin, the Celtic Sea (JR98), Baltic Sea (SYNTAX2010), Ariake Bay, the Gulf of Finland [29], the UK-Ocean Acidification cruise (D366) and the Pacific Ocean (BIOSOPE2) dataset.
For the remaining studies, MODIS 4 km satellite products of euphotic zone depth (zeu), here defined as the depth of 1% of surface PAR, produced by the Giovanni online data system (NASA Goddard Earth Science Data and Information Services Center) and averaged over 8 days, were used to calculate Kd by solving

$$E_z = E_0 \, e^{-K_d z} \qquad (1)$$

for Kd with E0 set to 100% and Ez set to 1% at z = zeu. Multiplying Kd by the actual sampling depth then gives ζ (dimensionless), with ζ<4.6 corresponding to irradiance levels >1% [39]. For data collected prior to 2002 (Massachusetts Bay), SeaWiFS 9 km zeu products were used for calculating Kd and subsequently ζ.
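The conversion from a satellite euphotic depth to Kd and ζ can be sketched as follows. This is a minimal illustration assuming simple exponential attenuation of PAR with depth; the function names and the example depths (50 m and 20 m) are hypothetical and not taken from any of the studies.

```python
import math


def kd_from_euphotic_depth(z_eu_m, fraction_at_z_eu=0.01):
    """Return Kd (m^-1) assuming E_z = E_0 * exp(-Kd * z).

    With E_0 = 100% and E_z = 1% at z = z_eu, Kd = ln(100) / z_eu.
    """
    if z_eu_m <= 0:
        raise ValueError("euphotic depth must be positive")
    if not 0 < fraction_at_z_eu < 1:
        raise ValueError("fraction must lie between 0 and 1")
    return -math.log(fraction_at_z_eu) / z_eu_m


def optical_depth(kd_per_m, sampling_depth_m):
    """Return the dimensionless optical depth, Kd multiplied by depth."""
    return kd_per_m * sampling_depth_m


# Hypothetical example: 50 m euphotic zone, sample taken at 20 m.
kd = kd_from_euphotic_depth(50.0)
zeta = optical_depth(kd, 20.0)
above_one_percent_light = zeta < math.log(100)  # ln(100) is about 4.6
```

By construction, a sample taken exactly at zeu has ζ = ln(100) ≈ 4.6, which is the threshold quoted in the text.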

FRRF measurements
FRR fluorescence transients were either measured in situ or on discrete samples on-board using FASTtracka I or FASTtracka II fluorometers with FASTact systems (Chelsea Technologies Group, Ltd., West Molesey, UK), FIRe benchtop instruments (Satlantic, LP, Halifax, Canada) or a FRR Diving Flash fluorometer (Kimoto Electric Co., Ltd., Osaka, Japan) (see Table 1). FRR fluorometers were routinely programmed to generate a standard protocol with 50-100 single turnover (ST) saturation flashlets of 1.1-3.3 µs duration at 1-3.6 µs intervals [18,40]. Each induction curve was separated by <10 ms. To increase signal to noise, between 5 and 160 sequential induction curves were averaged per acquisition.
The biophysical model of Kolber et al. [41] was fitted to all fluorescence transients to derive the initial fluorescence (F0) and maximal fluorescence (Fm) yields measured in the dark, the minimal (F0′), steady state (F′) and maximal (Fm′) fluorescence yields measured under ambient irradiance, as well as the functional absorption cross section of photosystem II (PSII) in the dark (σPSII) and light (σPSII′) (in units of Å2 quantum−1). For most studies, the photosynthetic electron transfer rate through PSII (units of mol e− (mg chla)−1 h−1) [42,43] was calculated as:

$$\mathrm{ETR} = E \cdot \sigma_{\mathrm{PSII}}' \cdot n_{\mathrm{PSII}} \cdot \frac{F_q'}{F_v'} \cdot \Phi_{\mathrm{RC}} \cdot 2.43 \times 10^{-5} \qquad (2)$$

where E is light intensity, nPSII the ratio of functional PSII reaction centres to chlorophyll (in units of mol RCII (mol chla)−1), ΦRC an assumed constant of 1 electron yielded from each reaction centre II (RCII) charge separation and 2.43×10−5 is the factor that accounts for the conversion of Å2 quantum−1 to m2 (mol RCII)−1, mol chla to mg chla, seconds to hours and µmol quanta to mol quanta [13,29]. Measurements of σPSII′ account for transient nonphotochemical quenching in the antenna bed as a result of exposure to transient light [44], and thus, values of σPSII′ were typically taken from the FRRf dark chamber to increase signal to noise. The PSII efficiency factor, termed Fq′/Fv′, was calculated as (Fm′−F′)/(Fm′−F0′) either from fluorescence emissions measured sequentially on dark acclimated samples and then under actinic light, or estimated as the difference in the apparent PSII photochemical efficiency between the FRR light and dark chamber quasi-simultaneously, as in Suggett et al. [25]. In this case, no blank correction to the fluorescence yields is required because any contribution of background fluorescence (to F′, F0′ or Fm′) effectively cancels between light and dark chambers/conditions [25,44].
For two studies [25,28], the electron transport rate was evaluated using the equivalent equation:

$$\mathrm{ETR} = E \cdot \sigma_{\mathrm{PSII}} \cdot n_{\mathrm{PSII}} \cdot \frac{F_q'/F_m'}{F_v/F_m} \cdot \Phi_{\mathrm{RC}} \cdot 2.43 \times 10^{-5} \qquad (3)$$

where values of Fv/Fm and σPSII are measured in the dark or are assumed dark values based on measurements from deeper in the water column, that is, from depths where E<EK and where nonphotochemical quenching can be assumed to be negligible [25]. In the latter case, the influence of both photochemical and nonphotochemical quenching is accounted for by changes in Fq′/Fm′. This slightly alternative approach was employed to enable ETRs to be estimated in the absence of a 'dark chamber' and, consequently, of corresponding light and dark acclimated samples from all measurement depths, which would be required in order to apply Eq. 2. Both Eqns. 2 and 3 appear to return consistent estimates for quenching [8], and hence, ETR. It should further be noted that derivation of Fq′/Fv′ requires a measurement to be made following brief dark exposure. Consequently, under circumstances where rapid reversal of certain components of NPQ occurs during the dark measurements, Fq′/Fv′ may be overestimated, potentially causing overestimates of ETR evaluated using Eqn. (2) by up to 30% in phytoplankton cultures [8].
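The two ETR formulations can be sketched in code as below. The product forms are a reconstruction from the variable definitions given in the text (E, the absorption cross section, nPSII, the PSII efficiency term, ΦRC and the factor 2.43×10−5), not a quotation of the original equations; the function names and the input values in the example are hypothetical.

```python
UNIT_CONVERSION = 2.43e-5  # combined unit-conversion factor quoted in the text


def etr_light_chamber(e, sigma_psii_light, n_psii, fq_fv_light, phi_rc=1.0):
    """ETR from the light-acclimated cross section and Fq'/Fv'.

    Assumed form: E * sigma_PSII' * n_PSII * Fq'/Fv' * Phi_RC * 2.43e-5.
    """
    return e * sigma_psii_light * n_psii * fq_fv_light * phi_rc * UNIT_CONVERSION


def etr_dark_reference(e, sigma_psii_dark, n_psii, fq_fm_light, fv_fm_dark,
                       phi_rc=1.0):
    """ETR from dark values of sigma_PSII and Fv/Fm plus Fq'/Fm' in the light.

    Assumed form: E * sigma_PSII * n_PSII * (Fq'/Fm') / (Fv/Fm) * Phi_RC * 2.43e-5.
    """
    if fv_fm_dark <= 0:
        raise ValueError("Fv/Fm must be positive")
    ratio = fq_fm_light / fv_fm_dark
    return e * sigma_psii_dark * n_psii * ratio * phi_rc * UNIT_CONVERSION


# Hypothetical inputs, chosen only to exercise the arithmetic.
etr_a = etr_light_chamber(e=100.0, sigma_psii_light=500.0, n_psii=0.002,
                          fq_fv_light=0.5)
etr_b = etr_dark_reference(e=100.0, sigma_psii_dark=600.0, n_psii=0.002,
                           fq_fm_light=0.2, fv_fm_dark=0.5)
```

The two functions give the same result whenever σPSII′·Fq′/Fv′ equals σPSII·(Fq′/Fm′)/(Fv/Fm), which is the sense in which the text treats them as equivalent.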
Apart from the blank correction of the absolute fluorescence yields (see above and [45]), the accuracy of the ETR also depends on how variable some assumed 'constants' are and how well certain corrections are applied; specifically, (1) nPSII [46], (2) the spectral correction of σPSII, and (3) differences between ETR-based and CO2-based values of light harvesting efficiency (α) and maximum photosynthesis rates (Pmax). Firstly, nPSII is known to vary between taxa and environmental conditions by up to a factor of 5 [8] but is rarely measured in the context of FRR-based productivity studies. In fact, of the studies used here only Moore et al. [11] and Suggett & Forget (unpubl.) used direct measurements of nPSII. The majority of the other studies assumed values of 0.002 and 0.003 mol RCII (mol chla)−1 for populations dominated by eukaryotes and prokaryotes, respectively [47], or calculated nPSII using a newly developed algorithm, which will also have associated caveats [7]. In evaluating the use of this assumed constant nPSII in some of our data sets, we compared Φe,C derived with constant nPSII to Φe,C based on measured nPSII (see discussion). We return to the issue of spectral corrections in the section "Matching ETRs with C-uptake".
Although the FRRf approach provides a general estimate of α and Pmax, the light saturation point (EK) for the ETR may be lower than that from 14C uptake because of, for example, electron consuming processes [48-50]. The latter can cause the turnover time for QA to decouple from that of the whole chain PSII turnover, so that the EK for the ETR may not be wholly indicative of where light is limiting or saturating for CO2 uptake.

14CO2 fixation rates
CO2 fixation was measured by either in situ incubations [29], simulated in situ incubations [27,28,30,31] or photosynthesis versus irradiance (PE) relationships [11,25,26] in a 'photosynthetron' [51] or an equivalent thereof (see Table 1 for details). CO2 fixation in Ariake Bay (Japan) was measured using the stable isotope 13C-labelled NaH13CO3 [31]. Incubation lengths varied between the different studies, with the vast majority incubating for a few (1-4) hours and thus capturing productivity somewhere between GPP and NPP [19,20].

Matching ETRs with C-uptake
Ideally, measurements of photosynthetic carbon fixation and ETR should be made simultaneously on the exact same sample to reduce errors arising from differences in sample treatment and handling [8,18]. Moreover, the methods used to determine CO2 fixation differed considerably between studies (Table 1). Some of the most recent studies in the Baltic Sea (SYNTAX2010) and North Sea (CEND0811) followed recommendations of Suggett et al. [8] and measured 14C uptake by placing the radioactive sample directly in the cuvette holder of the FASTact fluorometer. This simultaneous incubation technique avoided discrepancies in the intensity and spectral quality of the actinic light sources. In this way, ETR and GPP were measured for 1 h on the same sample at in situ temperatures and at a light intensity corresponding to either half or twice the value of EK (as determined from FRRf rapid light curves); these light intensities relative to EK were chosen to yield a measure of Φe,C corresponding to an irradiance for light-limited and light-saturated photosynthesis, respectively.
For most studies, 14C-specific PE experiments were performed on discrete water samples whilst FRR data (and hence ETRs) were determined from in situ casts or on deck (Table 1). Incubation lengths varied between the different studies, with the vast majority incubating for a few (1-4) hours and thus capturing productivity rates somewhere between GPP and NPP [19,20]. For this same reason, adhering to strict definitions of NPP and GPP throughout the manuscript is not always possible. We have therefore specified NPP or GPP where possible but otherwise use the terms CO2 fixation or primary productivity. In situ FRR data were compared to the PE data from the same depth as the discrete water sample. For on-deck based ETR measurements during SUPREMO11, actinic light was provided by an external programmable light source. The light intensity measured during the FRR data acquisition (used to calculate the ETR) was then applied to the 14C-PE equation to yield a measure of the instantaneous 14C uptake for matching with the ETR. For some studies, simulated in situ 14C data were collected and corrected using 14C-PE to yield 'gross' 14C uptake [27,28] (Table 1). Here, we adopted a similar approach but further normalised daily 14C uptake rates to hourly rates based on knowledge of the light regime.
Direct comparison of the ETRs and 14C uptake requires that the spectral quality of the actinic light of the FRRf and 14C incubator is either equivalent or corrected for. The spectral values of σPSII (measured with a blue LED) also needed to be scaled to the light quality of the actinic source used for the 14C incubations [40]. In most cases, in situ FRR values of σPSII and the 14C actinic light source were both spectrally corrected to match the spectral quality corresponding to the sample depth [40]. In cases where both FRR- and 14C-PE curves were measured on discrete samples, data were spectrally corrected following Moore et al. [11]. For the studies that did not employ a spectral correction (Table 1) we assumed a constant factor of σPSII/1.75 [11,31] and σPSII/1.5 (Prášil et al. unpubl.) based on approximate values from the other studies for the water types in question.
Taking into account the light history of samples from different studies was the greatest challenge in compiling the database due to considerable inconsistencies in the availability and quality of light data. Thus, irradiances were expressed as optical depths, and instantaneous light was standardised by normalizing E relative to the saturating light intensity, EK, as determined from 14C-specific PE curves, i.e. E:EK (dimensionless). In this way, E can be considered relative to the light history to which cells are acclimated [52], providing information as to whether the values of Φe,C correspond to light-limited photosynthesis (E:EK<1) or light-saturated photosynthesis (E:EK>1). Note, however, that E comes from in situ PAR measurements, while EK was usually based on measurements made in the laboratory. In this case, the spectral quality of E and EK may differ and values of E:EK from different studies must be treated with caution.
All CO2 fixation rates (in mg C L−1 h−1) were normalised to the corresponding chla concentration to yield CO2 uptake rates as mol CO2 (mg chla)−1 h−1; thus values of Φe,C (in mol electrons (mol CO2)−1) were determined as

$$\Phi_{e,C} = \frac{\mathrm{ETR}}{\mathrm{CO_2\ uptake\ rate}} \qquad (4)$$

Given the large array of data sets and the various differences in approach for determining both 14C uptake and FRR fluorescence ETRs [8,18], we recognise that a major assumption inherent to our analysis is that environmental influences outweigh methodological influences on Φe,C, both within and between studies. Fortunately, all studies employed similar FRRf protocols (above). Even so, information on core variables associated with the ETR determinations was initially included in our database and examined alongside the environmental variables to verify any role of method upon apparent variability in Φe,C (see "Statistical approach" section). Note that the 'methodological' data across studies are categorical rather than continuous (e.g. spectral correction of σPSII applied or not applied); therefore, the available information was converted into a Boolean code with numbers one and zero representing use and non-use of a particular method, respectively. In total, 3 methodological variables were ultimately included in the initial data set: nPSII assumed or measured, spectral correction of σPSII assumed or measured, and E:EK>1 or E:EK<1.
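The unit handling behind the ratio of ETR to CO2 uptake, and the Boolean coding of the methodological variables, can be illustrated with a short sketch. It assumes chla is supplied in mg L−1 and uses a molar mass of 12.011 g mol−1 for carbon; which state is coded as 1 (here: measured rather than assumed, and E:EK>1) is an assumption made for illustration, and all function names and example values are hypothetical.

```python
MG_C_PER_MOL_C = 12.011e3  # molar mass of carbon expressed in mg per mol


def co2_uptake_per_chla(c_fix_mg_c_per_l_per_h, chla_mg_per_l):
    """Convert mg C L^-1 h^-1 to mol C (mg chla)^-1 h^-1."""
    if chla_mg_per_l <= 0:
        raise ValueError("chla concentration must be positive")
    mol_c_per_l_per_h = c_fix_mg_c_per_l_per_h / MG_C_PER_MOL_C
    return mol_c_per_l_per_h / chla_mg_per_l


def electron_requirement(etr_mol_e_per_mg_chla_per_h, co2_uptake_mol_c_per_mg_chla_per_h):
    """Electron requirement for carbon fixation in mol e- (mol C)^-1."""
    if co2_uptake_mol_c_per_mg_chla_per_h <= 0:
        raise ValueError("CO2 uptake rate must be positive")
    return etr_mol_e_per_mg_chla_per_h / co2_uptake_mol_c_per_mg_chla_per_h


def encode_methods(n_psii_measured, spectral_correction_measured, e_over_ek):
    """Return a (1/0, 1/0, 1/0) code for the three methodological variables.

    Coding 'measured' and E:EK > 1 as 1 is an illustrative assumption.
    """
    return (int(bool(n_psii_measured)),
            int(bool(spectral_correction_measured)),
            int(e_over_ek > 1.0))


# Hypothetical sample: 12.011 mg C L^-1 h^-1 fixed at 1 mg chla L^-1.
uptake = co2_uptake_per_chla(12.011, 1.0)
phi_e_c = electron_requirement(0.01, uptake)
method_code = encode_methods(False, True, 0.5)
```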
At this point it should also be noted that apart from environmental factors, differences in phytoplankton community composition may also influence Φe,C. Unfortunately, taxonomic data were not consistently available for the majority of studies. However, we assume that phytoplankton community composition will partially reflect environmental characteristics, so that some influence of community composition on Φe,C will likely be implicitly accounted for in our analyses based purely on environmental variables.

Statistical approach
Many of the following analyses were carried out either on the entire dataset or on individual subsets thereof, which, for example, represent different oceanic regions. Although the majority of our samples were collected in the Atlantic Ocean and adjacent shelf waters while other regions (e.g. Pacific, Indian Ocean, and Mediterranean Sea) are underrepresented or not represented at all, we still use the term 'global' for analyses carried out on the dataset as a whole.
Spearman Rank Order Correlation analysis was used to identify key variables that may be associated with Φe,C. Correlations, evaluated in SPSS 15.0 (SPSS Inc., Chicago, IL, USA), were considered significant when p<0.05. These correlations were carried out i) on the individual data sets (i.e. samples grouped by study), ii) on the global dataset, iii) on studies pooled according to regions (e.g. North Sea, Baltic Sea, Atlantic Ocean etc.) and iv) according to shelf/oceanic waters, thus providing an overview of which variables may be important. All data were then combined into one large database to carry out all other statistical analyses using non-parametric multivariate techniques in PRIMER-E version 6 (PRIMER-E, Ltd., Ivybridge, Devon, UK) [53], unless noted otherwise.
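For readers unfamiliar with the rank correlation used here, a minimal standard-library sketch follows. The analysis itself was run in SPSS; the helper names below are hypothetical and the example values are invented. Ties receive their average rank, and the coefficient is the Pearson correlation of the two rank vectors.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (ss_x * ss_y) ** 0.5


# Invented example: electron requirement against salinity for five samples.
phi_values = [4.2, 6.1, 9.8, 12.5, 20.3]
salinity = [6.5, 7.0, 33.9, 35.1, 34.8]
rho = spearman_rho(phi_values, salinity)
```

Squaring the coefficient gives the share of rank variance explained, which is how a statement such as "explained at least 12% of the variance" relates to the correlation table.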
Inter-correlated and right-skewed physico-chemical variables (e.g. salinity, temperature, chla, NO3− and PO43−) were identified using Draftsman plots and then square-root transformed to stabilise the variance and ensure that Euclidean distance could be used as an appropriate similarity measure (see below). To account for differences in scales and units between different variables, the latter were normalized by subtracting the mean from each entry of a single variable and dividing it by the variable's standard deviation.
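The pre-treatment described above amounts to two elementary steps, sketched below with the standard library. The use of the sample standard deviation (n−1 denominator) is an assumption, since the text does not say which estimator was used; the function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def sqrt_transform(values):
    """Square-root transform; raises ValueError for negative entries."""
    return [math.sqrt(v) for v in values]


def normalise(values):
    """Subtract the mean and divide by the sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # assumption: n-1 denominator
    if spread == 0:
        raise ValueError("cannot normalise a constant variable")
    return [(v - centre) / spread for v in values]


# Invented chlorophyll a values with a right-skewed distribution.
chla = [0.1, 0.2, 0.4, 0.9, 6.4]
chla_prepared = normalise(sqrt_transform(chla))
```

After this step every variable has zero mean and unit spread, so no single variable dominates a Euclidean distance simply because of its units.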
Non-metric Multidimensional Scaling (nMDS) was used as a means to map environmental characteristics of samples in a low-dimensional space based on a triangular resemblance matrix that was created by calculating Euclidean distances between every possible pair of samples. Hence, Euclidean distances between samples represented the dissimilarities in the suite of their physico-chemical properties. Principal Component Analysis (PCA) was then used to identify variables accounting for the differences between samples. Variables initially included in the PCA were salinity, chla, NO3−, PO43−, temperature, Kd and ζ, as well as a Boolean matrix of methodological differences with regard to nPSII, σPSII and E:EK (as described above, Table 1).
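The resemblance matrix that feeds the ordination can be sketched as below. This only illustrates the pairwise Euclidean distance step, not PRIMER's nMDS or PCA routines; the function name and the two-variable example samples are hypothetical.

```python
import math


def euclidean_resemblance(samples):
    """Return the full symmetric matrix of pairwise Euclidean distances.

    Each sample is a sequence of already transformed and normalised
    variables; only the lower triangle carries independent information.
    """
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i):
            distance = math.dist(samples[i], samples[j])
            matrix[i][j] = distance
            matrix[j][i] = distance
    return matrix


# Invented samples described by two normalised variables each.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
resemblance = euclidean_resemblance(samples)
```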
Inclusion of the methodological and location variables (latitude, longitude (absolute values) and Julian day) usually increased dissimilarities, i.e. Euclidean distances, between samples of different studies compared to analyses from which methodological and location variables had been excluded. Because the 'methodological choices', in particular, cannot be effectively incorporated into predictive algorithms for Φe,C, the results we show in the main text here are based on PCAs without methodological and location variables. Nevertheless, the results of the PCA including these methodological and location variables have been included in Appendix S1 (Fig. S1). Only those variables with the largest eigenvector coefficients were included in any further analysis to assess the variability in Φe,C.
Both nMDS and PCA generate low-dimensional ordinations. Here we only show the nMDS plots, which matched the PCA plots well, because nMDS better preserves the distances between samples when mapping them onto a 2D or 3D space [53]. In addition, nMDS provides a measure, the stress value, of how well the low-dimensional ordinations represent the distances between samples. In all cases the stress value of the nMDS was considerably lower in the 3D ordination relative to the 2D ordination. Thus, the 3D ordinations are presented here, with the individual planes (x-y, x-z, y-z) shown as separate panels.
Because environmental factors showed considerable spatial and temporal variation, which, in turn, influences physiological responses of phytoplankton and subsequently Φe,C to varying degrees, hierarchical agglomerative cluster analysis and a similarity profile (SIMPROF) test were used to find and define groups of samples with similar physico-chemical properties [53]. SIMPROF tests were used as a stopping rule of the cluster analysis, so that successive partitions along the branches of the dendrogram were only permitted if the null hypothesis of 'no structure between samples' was rejected. To reduce the number of clusters to a manageable size and to ensure that each cluster contained enough samples for algorithm development, the p-value for the SIMPROF test was reduced to 0.005. Thus, once a non-significant test result was obtained (p>0.005), samples below that similarity level were no longer partitioned into clusters and could be regarded as homogeneous [53].
All data were then re-grouped according to the results of the cluster analysis and SIMPROF test. The PRIMER-BEST match permutation test was then used to identify variables and variable combinations that best "explain" the variability in Φe,C [53]. That is, the algorithm randomly permutes a resemblance matrix generated from Φe,C (based on all possible pair-wise sample combinations) relative to a resemblance matrix generated from subsets of the environmental data matrix, searching for high rank correlations between the two and generating a correlation coefficient (ρ). Repeated (99) permutations of randomly ordered environmental data follow to test the significance of the results at a level of p<0.01 [53].
One of the main goals of this meta-analysis was to generate algorithms to predict Φe,C from environmental variables for specific regions. Therefore, mathematical relationships between the environmental data and Φe,C were produced for each cluster that contained at least 5 data points using multiple linear regression (MLR). Only those variables and variable combinations that were significantly correlated with Φe,C according to the PRIMER-BEST test were entered into the MLR in SPSS.
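As a rough illustration of what such an algorithm looks like, the sketch below fits a multiple linear regression by ordinary least squares using the normal equations. It is not the SPSS procedure used in the study; the predictor names, the data and the coefficients are invented for the example and do not correspond to any entry in Table 5.

```python
def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_mlr(predictors, response):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in predictors]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve(xtx, xty)

def predict(coefs, row):
    """Predicted value for one sample."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))

def r_squared(predictors, response, coefs):
    """Proportion of variance in the response explained by the fit."""
    mean_y = sum(response) / len(response)
    ss_res = sum((y - predict(coefs, row)) ** 2
                 for row, y in zip(predictors, response))
    ss_tot = sum((y - mean_y) ** 2 for y in response)
    return 1.0 - ss_res / ss_tot

# Invented cluster with six samples: (temperature, phosphate) per sample.
predictors = [(2.0, 0.1), (5.0, 0.3), (9.0, 0.2),
              (14.0, 0.6), (18.0, 0.4), (20.0, 0.9)]
# Invented, noise-free response so the known coefficients can be recovered.
response = [2.0 + 0.5 * t + 3.0 * p for t, p in predictors]

coefs = fit_mlr(predictors, response)
r2 = r_squared(predictors, response, coefs)
print(coefs, r2)
```

Real field data are noisy, so fitted coefficients and R2 values would be far less clean than in this constructed example.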

Spatial and temporal variability in Φe,C
Mean values for Φe,C in all but four studies were <10 mol e− (mol C)−1, resulting in a global mean (± standard deviation) of 10.9 ± 6.91 mol e− (mol C)−1. Values of Φe,C less than the theoretical ratio of 5 mol e− (mol C)−1 (see discussion) were observed in the Gulf of Finland 2000 and in part of the AMT 15 data. Variability of Φe,C within and between individual studies was considerable, with a total range of 1.15-54.2 mol e− (mol C)−1 for all studies combined (Fig. 2). Within-study variation of Φe,C was largest in the Pacific Ocean (7.9-54.2 mol e− (mol C)−1), Massachusetts Bay (8.5-50.1 mol e− (mol C)−1) and the AMT 15 data (1.1-28.2 mol e− (mol C)−1); in contrast, the least within-study variance was typically observed for time series at a single location (Ariake Bay: 5.1-5.7 mol e− (mol C)−1, Bedford Basin: 3.6-10.3 mol e− (mol C)−1, and the Gulf of Finland 2000: 2.0-9.4 mol e− (mol C)−1). As there were no distinct differences in the number and types of σPSII spectral and nPSII corrections between the time series studies and the other cruises, it would appear that variability of Φe,C is generally greater spatially than temporally within the included locations. The time series studies included here were of relatively short duration (1.5-7 months); however, temporal variability may further increase if long-term studies (multiple years) are included.
Spearman rank order correlations could be found between Φe,C and every environmental variable included in the study, albeit to varying degrees and depending on how data were grouped. On a global scale, i.e. combining the entire data set, Φe,C exhibited significant positive correlations with Julian day (i.e. time/season) and salinity, while significant negative correlations existed with latitude, longitude, chl a concentration, Kd and ζ (Table 2). These correlations explained at least 12% of the variance in all cases. To consider whether the global data pool may obscure regional-scale trends, correlations were also carried out (i) on the individual studies and (ii) by pooling studies representative of particular regions, such as the Baltic Sea (including the Gulf of Finland) and the Atlantic Ocean (all AMT cruises combined), and by combining data from shelf and oceanic waters, respectively.
At the level of the individual study, different environmental variables showed varying degrees of correlation depending on the region. Multiple notable trends were evident between Φe,C, sampling depth, Kd, ζ and nutrient availability, including: 1) changes of Φe,C with depth, which were observed during AMT15, in the Celtic Sea, the North Sea (NS-CEND0811) and the Gulf of Finland/Baltic Sea; 2) a decline in Φe,C with increasing Kd during AMT6 and in the Baltic Sea, and with increasing ζ during AMT15, in the Celtic Sea, Pacific Ocean and in the Gulf of Finland/Baltic Sea; and 3) a change in Φe,C with nutrient concentrations during the AMT cruises, one of the Celtic Sea cruises (JR98) and in the Gulf of Finland/Baltic Sea. In the Baltic Sea (SYNTAX2010) and Bedford Basin, significant correlations existed between Φe,C and temperature and salinity, respectively. No correlations of Φe,C with common environmental variables were found in Massachusetts Bay or during the UK-OA cruise. Note also that depth-dependent trends in Φe,C could only be assessed where multiple depths had been sampled at each station across large proportions of the cruise transect/time series, such as in the open ocean studies where data were available down to a depth of ~200 m (e.g. AMT cruises, Pacific Ocean, Celtic Sea JR98) or the Gulf of Finland (SUPREMO11) studies where up to four depths were sampled across the euphotic zone at each station. Thus, a lack of relationship between Φe,C and depth may simply reflect limited sampling depths or, in shallow waters with rapid vertical mixing, acclimation to an average water column irradiance.
A notable feature for the Pacific and Atlantic transects as well as the Celtic Sea (JR98, data not shown) was a pronounced increase of Φe,C in surface waters with low NO3− and/or PO43− availability (Fig. 3, Table 2). In fact, the highest Φe,C values across all studies corresponded to a surface water lens in the HNLC and Humboldt upwelling region of the Pacific Ocean. For the other Atlantic studies, AMT 6 and 11, Φe,C also declined with increasing NO3− and/or PO43−, but as the depth of the surface mixed layer and/or nitracline varied between stations, the relationship between depth and Φe,C was not linear.
Negative correlations between Φe,C and ζ for AMT15, the Celtic Sea (JR98), the Pacific Ocean and the Gulf of Finland indicated that Φe,C often increased towards depths of greater light availability in surface waters or with declining Kd values from coastal to offshore regions. Negative correlations of Φe,C with depth, Kd and ζ were also found, but only in Baltic Sea and shelf waters.

Regional differences in environmental conditions
The initial correlative exercise for each separate study demonstrated that the relationships between Φe,C and environment depend on how data within and between data sets are grouped; therefore the 'choice' of grouping will inevitably influence the outcome of empirical algorithms generated to predict Φe,C from physico-chemical variables. A major objective of this study was to identify environmental predictors of Φe,C; therefore PCA in combination with cluster analysis was subsequently used 1) to identify the principal differences in environmental conditions between sites regardless of the study to which they belonged and 2) to form clusters of sites with similar environmental conditions, which could then be analysed further with respect to their association with Φe,C.
The first three principal components of the PCA combined accounted for 86% of the cumulative variation in environmental variables (PC1 47%, PC2 24% and PC3 15%). Temperature, chl a and Kd had the highest coefficients for the linear combination of variables comprising PC1, i.e. they were associated with the separation of samples along PC1 (Table 3). Salinity, NO3− and PO43− differentiated samples along PC2, while a single variable, ζ, had the highest eigenvector coefficient for PC3.
Cluster analysis in combination with a SIMPROF test generated 15 significantly different clusters (p<0.005), labelled alphabetically from a to o, across a range of Euclidean distances (representing dissimilarities between samples) from 0 to 4.4; these clusters often overlapped in the nMDS ordination of the Euclidean distance matrix derived from the environmental variables, indicating that there were similarities in some of the environmental conditions between samples from different studies (Table 4, Fig. 4).
Some of the resulting clusters corresponded to obvious biogeographic regions and/or seasonal (i.e. temperature dependent) groupings, while others represent more or less distinct water masses (Table 4). All the samples collected in the Baltic Sea and Gulf of Finland fell into biogeographically distinct clusters (b, c, f, g and h) characterised by low salinities (Fig. 5). The four Gulf of Finland clusters (b, c, g, h) clearly represented temporal changes in environmental conditions, with samples in cluster b being characterised by extremely low temperatures during late winter (SUPREMO11 cruise) coupled with high nutrient availability. Low temperature and high nutrients set this cluster apart from spring cluster c, with still relatively low temperatures (<5 °C), and from clusters g and h, which contain all the summer and autumn samples characterized by warm temperatures (~15 °C), low NO3− [...] nutrient concentrations in cluster k sets this cluster apart from cluster i.

Figure 2 (caption fragment): [...] SYNTAX2010 is a Baltic Sea cruise, GoF2011 is the SUPREMO2011 study and the Pacific Ocean study is the BIOSOPE cruise to the Southeast Pacific (see Table 1, Figure 1 for details). doi:10.1371/journal.pone.0058137.g002
Samples from Bedford Basin fell into two clusters: cluster d contained samples collected from mid-March to mid-May, when chlorophyll concentrations increased and nutrient concentrations and water clarity dropped from the previous levels that characterized samples from cluster e, which were collected in late winter (February to mid-March). Interestingly, cluster e also contained almost all of the Massachusetts Bay samples, some North Sea samples from frontal regions as well as AMT 6 and Pacific Ocean samples from the Benguela and Humboldt upwelling areas, respectively. This represents similarities in physico-chemical properties of these different water masses rather than biogeographical regions. During late winter, salinity in the ice-covered Bedford Basin was similar to that in offshore waters, while chlorophyll and nutrient concentrations as well as optical properties matched those of upwelling/frontal regions; hence their grouping together.
The remaining clusters (j, l, m, n, o) also contained samples from a variety of different cruises/regions. Samples collected from the deep chlorophyll maximum (DCM) in offshore regions during the AMT cruises, the Pacific Ocean cruise and the Celtic Sea (JR98) fell into one cluster (j), which was characterized by high temperatures as well as higher nutrient availability and ζ relative to the other offshore samples. Cluster l contained samples from intermediate depths in temperate waters on the European shelf and the shelf edge, which had relatively low temperatures. What set cluster l apart from cluster n, which also contained samples from the European shelf, was its overall greater optical depth. Open ocean samples with high water clarity and temperatures, as [...]
Clearly, temperature, salinity, chlorophyll, nutrient concentrations and light availability were the main environmental factors responsible for the distinct grouping of samples into clusters. Given that cluster analysis in strongly stratified, offshore waters resulted in distinct groupings corresponding to DCM, intermediate depth and surface samples, and due to the pronounced effects of water column stratification on light and nutrient availability, we also plotted the mean Φe,C, grouped according to samples from the surface mixed layer (SML) and DCM, against depth and NO3− for each biogeographic province sampled during the Pacific and AMT cruises (Fig. 6). Distinct differences in Φe,C existed between the SML and DCM samples from the Pacific and AMT 15 cruises, where Φe,C was much higher in the SML than at the DCM and the increase in Φe,C coincided with a drop in nutrient availability or even depletion of nutrients in surface waters. During the AMT6 and AMT11 cruises, on the other hand, no such pronounced differences in Φe,C and nutrient concentrations between SML and DCM were observed for the sampled locations in most of the biogeographic provinces.
In summary, there existed distinct groupings for many of the cruises and regions based on differences in environmental gradients within and between regions and water masses. This also confirms that user-specific differences in methodology did not appear to have a systematic influence.

Relationships between environmental conditions and Φe,C
Clustering all available data based on the inherent environmental characteristics demonstrated that the dependence of Φe,C on environmental gradients cannot be resolved at the level of the individual studies. Cluster analysis therefore enabled us to objectively identify how to pool data across the multidimensional environmental variable matrix (Table 4) to further examine variability of Φe,C. Permutation tests conducted on the individual clusters (i.e. clusters a-q, Table 4) indeed showed that often multiple variables combined were significantly correlated with Φe,C (BEST, p<0.05 or p<0.01), and that different variable combinations were driving Φe,C in different clusters and regions (Table 5). These 'best' variable combinations were entered into multiple linear regression to derive algorithms for the prediction of Φe,C, which, in many cases, resulted in significant relationships with Φe,C (MLR, p<0.05) (Table 5). Significant relationships (R2 ,0.05) between Φe,C and environmental variables existed in the Gulf of Finland (cluster b), Bedford Basin (cluster d), European shelf waters (cluster n) and offshore samples from intermediate depths (cluster m), surface waters (cluster n) and the deep chlorophyll maximum (DCM) (cluster j). Surprisingly, PO43−, rather than NO3− availability, was often part of the variable combinations showing the strongest association with Φe,C, usually in addition to temperature and/or salinity. The strong relationships between Φe,C and PO43−, and the lack thereof with NO3−, may be due to NO3− concentrations being close to or below the detection limit in open ocean waters and, during the summer months, often also in shelf waters, which compromises the detection of relationships with Φe,C. Thus, NO3− availability may still be an important determinant of Φe,C in these waters, but with PO43− being the only macronutrient left in our analysis, only the latter was pulled out.
Despite the distinct clustering of samples from the frontal and upwelling regions, the Baltic Sea and European shelf edge, no significant MLR existed between Φe,C and environmental variables in any of these regions (clusters e-i, k, l).
To develop region-specific algorithms, we also grouped the clusters according to meaningful water masses and biogeographic regions, resulting in 11 significant algorithms covering the Gulf of Finland, the Baltic Sea as a whole (i.e. including the Gulf of Finland), European shelf waters (including the Northeast Atlantic), Northwest Atlantic shelf waters, the equatorial Atlantic and the South Atlantic Ocean. Although some of these region-specific relationships exhibited a low R2, they were highly significant. The relationships between Φe,C and environmental variables were strongest in the Gulf of Finland and the Pacific Ocean (R2 ,0.05), followed by the Northwest Atlantic shelf samples (i.e. Massachusetts Bay + Bedford Basin). Relationships for the Atlantic were much less pronounced (R2 <0.22), albeit highly significant (p<0.01). Once again, PO43− rather than NO3− availability appeared to play a greater role in these MLRs. On a global scale, MLR yielded an R2 of 0.038, but with both NO3− and PO43− contributing to the variability in Φe,C.

Figure 5 (caption fragment): [...] Table 4. Values are means and error bars are standard deviations (with n of 2-67, see Table 4) of Φe,C (mol e− (mol C)−1), temperature (°C), salinity, chlorophyll a (Chl a, mg m−3), nitrate and phosphate (μmol L−1), the vertical attenuation coefficient of photosynthetically available radiation (Kd, m−1), optical depth ζ (dimensionless) and sampling depth (z, in meters). doi:10.1371/journal.pone.0058137.g005

Reconciling electron transfer and carbon fixation
While empirical evidence demonstrates that fluorescence-based measures of the PSII photochemical efficiency, and hence ETRs, are linearly related to net or gross carbon fixation rates under many conditions [16,47,54], there are still considerable uncertainties about what drives Φe,C as well as about differences between NPP and GPP. Growth-rate-dependent differential allocation of fixed carbon and varying lifetimes of intermediate products may cause large discrepancies between NPP and GPP, which short-term 14CO2 uptake measurements may not capture [19,20]. While it is generally accepted that short-term 14C incubation techniques approximate GPP, Halsey et al. [19,20] demonstrated that this is only the case in fast-growing, nutrient-replete phytoplankton. In nutrient-limited cells, on the other hand, short-term 14C incubations equate to NPP or to primary productivity rates somewhere between NPP and GPP. Failure to accurately quantify GPP will inevitably affect Φe,C. Most data published in the past, and also those used in our meta-analysis, compared ETR to relatively short-term 14CO2 uptake rates, so that most Φe,C values available to this day would provide a conversion from ETR to GPP or to something between NPP and GPP. Under most conditions, NPP differs from GPP by a factor of 2 to 2.5 [19,20,55], for which the current Φe,C values cannot account. Conversion of ETR to NPP and assessments of the potential error in Φe,C, and subsequently NPP, due to employing short-term 14C incubations will require further investigation and is the focus of ongoing work.
Under 'optimal' growth (where NPP ≈ GPP), and accounting for electron sinks associated with nutrient reduction, the slope of the linear relationship between ETR and NPP should yield values for Φe,C of 4-6 mol e− (mol C)−1 [16-18]. To date, most FRRf-based studies use an electron requirement for carbon fixation of 5 mol e− (mol C)−1 to convert ETRs to carbon fixation rates, assuming (i) that at least 4 e− transported through PSII are required per O2 molecule produced and (ii) that 1-1.5 mol of O2 is produced for each mol CO2 fixed, i.e. the photosynthetic quotient takes values of 1-1.5 mol O2 (mol CO2)−1 [56]. Indeed, some studies here confirmed Φe,C equal (or close) to 4-6 mol e− (mol C)−1. However, the vast majority of estimates for Φe,C are considerably higher, sometimes reaching extremes of >50 mol e− (mol C)−1 (e.g. BIOSOPE Pacific Ocean cruise), or alternatively fall below 5 mol e− (mol C)−1 (Gulf of Finland 2000 and AMT 15). Thus, in many cases, application of an assumed value of 4-6 mol e− (mol C)−1 to FRRf data would yield erroneous estimates of C-uptake (within the limitations of the 14C-uptake itself for quantifying C-uptake). Values of Φe,C >5 mol e− (mol C)−1 have been observed previously and potentially reflect processes that act to decouple ETRs from C-fixation, such as photorespiration [32], chlororespiration via a plastid terminal oxidase (PTOX) [33,34] and the Mehler reaction [32]. We return to this issue in the following sections. In contrast, values of Φe,C <5 mol e− (mol C)−1 are more difficult to reconcile with biophysical and physiological processes, potentially indicating the magnitude of remaining methodological discrepancies in deriving Φe,C, such as incorrect assumptions concerning the value or variability of nPSII (see below), inadequate spectral correction or remaining non-systematic errors in both carbon fixation and fluorescence estimates.
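The arithmetic behind the theoretical range, and the consequence of applying a fixed value, can be written out as a short sketch. The function names and the ETR value are hypothetical; the only quantities taken from the text are the 4 e− per O2, the photosynthetic quotient of 1-1.5, the commonly assumed value of 5 and the global mean of 10.9 mol e− (mol C)−1.

```python
def theoretical_phi(photosynthetic_quotient, electrons_per_o2=4.0):
    """Electron requirement for carbon fixation, mol e- (mol C)-1.

    (mol e- per mol O2) x (mol O2 per mol CO2) = mol e- per mol C.
    """
    return electrons_per_o2 * photosynthetic_quotient

def carbon_fixation(etr, phi_e_c):
    """Convert an electron transfer rate into a carbon fixation rate."""
    return etr / phi_e_c

phi_low = theoretical_phi(1.0)    # 4.0
phi_high = theoretical_phi(1.5)   # 6.0

# Hypothetical ETR in arbitrary units of mol e- per unit volume and time.
etr = 100.0
c_assumed = carbon_fixation(etr, 5.0)     # using the commonly assumed value
c_observed = carbon_fixation(etr, 10.9)   # using the global mean reported here

# Factor by which the assumed value overestimates C-fixation in this case.
overestimate = c_assumed / c_observed
print(phi_low, phi_high, round(overestimate, 2))
```

Because carbon fixation scales inversely with Φe,C, any relative error in the assumed electron requirement carries over directly into the carbon estimate.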
There were two studies with a high proportion of Φe,C values <5 mol e− (mol C)−1: the AMT15 cruise (Hickman et al. unpublished) and the Gulf of Finland 2000 study. Unlike during the AMT15 cruise, where values <5 mol e− (mol C)−1 occurred only in samples from the DCM, we could not identify any consistency in the occurrence of such low values that could be related to sample handling and processing in the Gulf of Finland 2000 data. Thus, the high proportion of low Φe,C values in the Gulf of Finland might point to systematic errors in the ETR calculations, i.e. in the component values for σPSII', nPSII and/or E, values that are often assumed (or not well measured). Whilst values of σPSII' measured by FRRf have been shown to match independent bio-optical measurements well [46], we have to assume that the bio-optical instrumentation used to quantify both E and σPSII' has been appropriately calibrated. In the Gulf of Finland [29], values of σPSII' typically range from ca. 150-300 Å2 quantum−1, which is at the lower end of the range of values expected for assemblages dominated by diatoms, dinoflagellates, cryptophytes and cyanobacteria (data not shown), but still within the range commonly observed for such species [38]. Absolute underestimations of σPSII' as a result of erroneous instrument calibrations are therefore unlikely to be a significant contributing factor. However, in cases when cyanobacteria dominate, ETRs may be underestimated because the saturating blue LED light pulse is inefficient at driving reaction centre closure, resulting in low Φe,C values. Furthermore, assumption of a constant nPSII will introduce errors due to taxonomic variability in physiological traits [46]. Moreover, phytoplankton cells tend to change the size of their photosynthetic units in response to light and nutrient availability [57-59] and vertical mixing [60].
Hence nPSII can vary among species by more than a factor of four, from 0.0010 to 0.0042 mol RC (mol chl a)−1 [46]. Raateoja et al. [29] assumed a constant nPSII of 0.002 mol RC (mol chl a)−1, which is representative of the eukaryote phytoplankton community observed in their study (mostly diatoms, dinoflagellates and cryptophytes), but not typically when cyanobacteria dominate. Consequently, increasing assumed values of nPSII in the study of Raateoja et al. [29] to account for the presence of cyanobacteria [8,46] would increase ETRs and hence potentially Φe,C to values >5 mol e− (mol C)−1.
We assessed the potential error in Φe,C due to the use of a constant nPSII in some of the data sets presented in this study by comparing Φe,C based on a constant nPSII with Φe,C values calculated from ETR where nPSII was either measured via oxygen flash yields (Bedford Basin data) or calculated using a fluorescence-based algorithm [7] (CEND0811, UK-OA D366). These comparisons showed that a constant nPSII may lead to an underestimate of Φe,C by 28-47% (Fig. S2, Table S1 in Appendix S1), which could thus, at least in part, explain the low Φe,C in the Gulf of Finland data. The nPSII values estimated for the North Sea spanned a range of 0.0018 to 0.0056 mol RC (mol chl a)−1, which is comparable with the range observed for eukaryotic and prokaryotic phytoplankton [46,47,58], but higher, by ca. a factor of 2, than nPSII directly measured by Moore et al. [11] in European shelf waters at a similar time of year. Clearly, further direct measurements of nPSII and validation of algorithms proposed for estimating the value of this variable [7] are required before a more robust assessment of the error in calculated Φe,C that is associated with nPSII can be achieved. Such measurements are the focus of ongoing work.
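To give a feel for the size of this effect, the sketch below rescales a Φe,C value obtained with a constant nPSII. It rests on an assumption that the text does not state explicitly, namely that the 28-47% underestimate is expressed relative to the value obtained with measured or modelled nPSII; the starting value of 4.0 mol e− (mol C)−1 and the function name are hypothetical.

```python
def corrected_phi(phi_constant_npsii, underestimate_fraction):
    """Rescale a value that is assumed to be too low by a given fraction.

    Assumption: phi_constant = phi_corrected * (1 - underestimate_fraction).
    """
    return phi_constant_npsii / (1.0 - underestimate_fraction)

# Hypothetical value below the theoretical minimum of about 5.
phi_constant = 4.0

phi_min_correction = corrected_phi(phi_constant, 0.28)
phi_max_correction = corrected_phi(phi_constant, 0.47)
print(round(phi_min_correction, 2), round(phi_max_correction, 2))
```

Under this assumption, a value of 4.0 would move above 5 mol e− (mol C)−1 across the whole reported range of underestimates, which is consistent with constant nPSII explaining part of the low values.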
Other sources of discrepancies between electron transfer and carbon fixation may stem from differences among protocols for ETR-light response curves (rapid light curves vs. steady state light curves) [61], deviations in the timescales of in situ ETR and laboratory 14C-photosynthesis measurements [43,61], discrepancies in the EK values derived from ETR and CO2 uptake [52-54], or the effects of water column structure and subsequent changes in light availability on the 'shape' of ETR light response curves [8].
For the data included in the present study and the associated differences in methodology with regard to ETR calculations (σPSII', nPSII) and 14C incubation techniques, we estimated that, in worst case scenarios, Φe,C may be over- or underestimated by as much as 53% (data not shown). A detailed assessment of these methodological differences between techniques and studies is beyond the scope of this study, and we refer the reader to previous comprehensive reviews of these issues [8,38]. However, we note that variability in estimated values of Φe,C was lower than that observed across all studies (Fig. 2) for the two new studies included here that were performed under the most controlled conditions (SYNTAX2010 and North Sea CEND0811), i.e. where ETR and CO2 fixation were simultaneously measured on the same sample. Thus, the extent to which the extreme values and/or variability in Φe,C observed in some other studies represent remaining errors in either carbon fixation or ETR estimates remains unclear.

Environmental regulation of Φe,C
Data presented in this study focused almost exclusively on the apparent effect of readily measurable environmental variables on Φe,C. Information on phytoplankton community composition could not be routinely included, and we assumed that any change in taxonomy was inherently accounted for in the environmental descriptors. Indeed, environment often leads to distinct structural differences of the photosynthetic apparatus and acclimation responses amongst phytoplankton from different biogeographic regions [34,62-67] and to unique, often consistent fluorescence "signatures" of PSII [10,11,38,68-70]. Future assessments of the inter- and intraspecific variability in Φe,C will, of course, be necessary to fully resolve the mechanisms responsible for changes in Φe,C and the derived algorithms (Table 5), which are purely based on statistical models and should therefore not be interpreted from a mechanistic perspective. From the relatively few available studies on phytoplankton cultures grown under various conditions to date, Φe,C usually does not exceed values of ~25 mol e− (mol C)−1, as opposed to estimates of up to 54 mol e− (mol C)−1 from natural communities (reviewed in Suggett et al. [8]). Lower maximal values for phytoplankton cultures could indicate a less extreme range of growth environments tested in the laboratory as compared to natural conditions, such as the high-light, low-nutrient conditions of the surface open ocean that are difficult to reproduce in the laboratory. Furthermore, laboratory studies often examine cells that are in 'steady state' growth, whereas natural communities may persist under (rapidly) changing environmental conditions.
In the latter case, non-steady state conditions could lead to a strong and transient uncoupling of electron transfer and carbon fixation [10,71,72], which would suggest that the primary source of variation in Φe,C is the environment rather than taxonomic variability. The effect of growth rate on our ability to measure GPP by short-term 14C incubations under natural conditions may further contribute to the wider range of Φe,C values from field observations relative to culture-based Φe,C [19,20].
Considerable variability of Φe,C existed not only spatially but also temporally. Given the large range of Φe,C for the entire data set, it is notable that mean Φe,C in all but one of the time series studies (Massachusetts Bay) never exceeded 10 mol e− (mol C)−1. The greater variance in the Massachusetts Bay time series study could, on the one hand, have been related to its longer duration (9 months) relative to the other time series (1.5-6 months) and, on the other hand, to its topography: while the Gulf of Finland, Bedford Basin and Ariake Bay are all surrounded by land, forming at least partially land-enclosed basins, Massachusetts Bay is relatively exposed and more strongly influenced by exchange of water with the Atlantic Ocean. Nevertheless, seasonal changes appear to influence the variability in Φe,C, although many areas are still under-sampled.
In the Gulf of Finland, for example, vertical mixing and the depth of the surface mixed layer change across different seasons [73-75] and with them the optical and physicochemical properties of the water column, that is, light and nutrient availability for the extant phytoplankton community [29]. The clustering of the Gulf of Finland samples into late winter (b), spring (c), summer (g) and summer/early autumn (h) groups mirrors these physicochemical changes. However, the highly dynamic nature of this system makes the development of predictive algorithms for the entire Baltic Sea (including the Gulf of Finland) challenging. The best predictors of Φe,C for the whole region were temperature, salinity and PO43−, resulting in a relatively weak (R2 = 0.266) albeit highly significant relationship. For the Gulf of Finland on its own, however, the relationships were much stronger (R2 = 0.561, p<0.01), indicating that the Gulf of Finland should perhaps be treated as its own region.
Distinct temporal changes in environmental gradients also existed in Bedford Basin, where initial correlations (Table 2) showed significant relationships of Φe,C with salinity, which may have both indirect (via alterations of density and water column stratification) and direct (osmotic) effects on Φe,C. The combination of low nutrients coupled with sudden high light availability due to elevated freshwater discharge after snow melt and ice breakup could cause an up-regulation of electron-, ATP- and/or reductant-consuming pathways (e.g. photorespiration and the Mehler reaction). Conceivably, similar changes in salinity could also directly affect Φe,C by causing osmotic stress or an increase in respiration of the extant phytoplankton community [76]. Bedford Basin phytoplankton community composition often shifts from diatoms and dinoflagellates to chlorophytes after the snow melt (Suggett & Forget unpublished) [77], matching the increase of Φe,C. Thus, in Bedford Basin multiple environmental factors (salinity, temperature, stratification) were likely at play, potentially exerting light and osmotic stress on the extant, but changing, phytoplankton community [77-80]. Resolving the underlying mechanisms responsible for seasonal changes in Φe,C remains a challenge and will require teasing apart daylight hour effects from temperature effects and other physicochemical variables.
Many samples from Massachusetts Bay and other open ocean areas fell into one cluster with the Bedford Basin data (Table 4, Fig. 5). The Bedford Basin spring cluster (d) was characterized by high chlorophyll concentrations, whereas samples in the late winter cluster (e) had low chlorophyll concentrations and temperatures more similar to those in frontal and upwelling regions of the North Sea, and the Atlantic and Pacific Oceans. The wide range of environmental gradients observed in these clusters may be the reason why MLR did not return any significant relationships between Φe,C and environmental variables. Furthermore, such relationships may not necessarily be linear [81] and other models will have to be tested in the future.
Most of the other studies, also extended across multiple biogeographic provinces (sensu Longhurst [82]) or distinct environments water masses (SML/DCM), and although there seemed to be consistent overlaps between nutrient deplete regions and areas of high W e,C (Fig. 3), the absolute nutrient concentration was often not correlated with W e,C ( Table 2). Important examples are the open ocean regions, highlighting how environmental forcing can confound establishment of strong relationships between W e,C and measurable environmental variables at the basin scale: W e,C often declined with increasing depth and towards the nutricline; even so, W e,C was often not significantly correlated with measured nutrient concentration most likely reflecting differences in nutrient availability between the gyres, HNLC areas and coastal upwelling regions. Additionally, nutrient stocks are not necessarily a good index of the level of nutrient stress within a population, which will depend rather on the overall (re-)supply rates of the nutrient relative to the demands of the extant community. In the iron and nitrogen deplete SPG [12] and the low-iron HNLC region of the Pacific Ocean [83], W e,C declined when NO 3 2 concentrations were above the detection limit (usually at the deep chlorophyll maximum). Apart from the extreme outlier of W e,C .50 mol e 2 (mol C) 21 , highest W e,C values were observed in the upwelling region and the gyre corresponding to samples from a low-NO 3 2 surface water lens with high irradiances. Beneath this surface water lens, NO 3 2 increased while light levels dropped and W e,C declined.
Iron and nitrogen limitation may be expected to influence the photosynthetic apparatus in different ways, with iron deficiency causing a preferential decline in iron-rich cellular components (e.g. PSI, PSII and cytochrome b6f) [56,84–86]. Thus, different forms of nutrient limitation can induce differences in the fluorescence signatures of the resident phytoplankton community [67,72] and subsequently in Φe,C. Iron-limited phytoplankton in HNLC regions, for instance, may express the chlorophyll-binding protein IsiA, which reduces the apparent PSII photosynthetic efficiency and primary productivity rates normalized to chlorophyll [87,88]. The latter would cause Φe,C to increase. Nutrient limitation in combination with high-light stress in surface waters may hence have been responsible for the highest Φe,C obtained in these regions, suggesting a high degree of uncoupling between electron transfer and carbon fixation. Similarly, nitrogen assimilation and nitrogen fixation require large amounts of ATP, so variations in these processes can also uncouple electron transport from carbon fixation [67,89,90].

Conclusions
Conversions of FRRf-based ETR estimates to carbon-specific rates of NPP depend on our ability to accurately predict Φe,C for given environmental conditions. The present study shows, for the first time, the extent of variability of Φe,C, albeit within the past methodological constraints of accurate quantification of both ETRs and NPP. Most data on Φe,C available to this day are based on short-term 14C incubations which, depending on algal growth rates, may capture GPP rather than NPP. Further studies are needed to fully resolve the discrepancies between Φe,C derived from GPP or NPP in both cultures and natural communities. Nevertheless, our work shows that some of the variability in Φe,C can be linked to environmental variables that are routinely measured. Given the observed variability, it is highly unlikely that a global approach or algorithm can be produced which captures a high proportion of the variance in Φe,C. However, with the present study we provide firm evidence that such algorithms can in fact be generated for biogeographic regions and distinct water masses, bringing us closer than ever to predicting carbon uptake from ETRs. Independent validation of these algorithms is still required, and we may expect that they will be refined as more data accumulate. Importantly, our study provides methodology for future data collection and integration to improve Φe,C algorithms. Clearly, developing standardized protocols would greatly facilitate intercomparisons of studies and reduce some of the potential error due to methodological differences. Assessing the role of phytoplankton taxonomy, development of non-linear models and testing of the present algorithms on novel, independent data represent necessary future steps to improve the level of accuracy.
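To illustrate why the choice of Φe,C matters for the conversion, a minimal sketch is given here. It relies only on the definition of Φe,C as the ratio of ETR to CO2 uptake rate, so that carbon fixation is ETR divided by Φe,C. The function name and the example ETR value are hypothetical and chosen purely for illustration; the Φe,C values are the minimum, mean and maximum across all studies reported in this article.

```python
def carbon_fixation_rate(etr, phi_e_c):
    """Convert an electron transfer rate (ETR) to a carbon fixation rate.

    etr     : electron transfer rate in mol e- per unit biomass and time
    phi_e_c : electron requirement for carbon fixation, mol e- (mol C)^-1
    Returns the carbon fixation rate in mol C, in the same biomass and
    time units as etr.
    """
    if phi_e_c <= 0:
        raise ValueError("phi_e_c must be positive")
    return etr / phi_e_c


# Hypothetical ETR value (assumption, not a measurement from any study).
ETR_EXAMPLE = 100.0

# Minimum, mean and maximum of the electron requirement across all studies.
PHI_VALUES = {"min": 1.15, "mean": 10.9, "max": 54.2}

# The same ETR yields very different carbon fixation estimates
# depending on which electron requirement is assumed.
rates = {
    label: carbon_fixation_rate(ETR_EXAMPLE, phi)
    for label, phi in PHI_VALUES.items()
}

for label, rate in rates.items():
    print(f"phi_e_c ({label}) = {PHI_VALUES[label]:5.2f} -> {rate:.2f} mol C")
```

With a fixed ETR, the estimate obtained with the lowest observed Φe,C is roughly 47 times larger than that obtained with the highest, which is why a single global value is unlikely to be adequate.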
One of the major remaining challenges in utilising these algorithms is defining the (biogeographic) regions, while keeping in mind that their boundaries might actually shift in space and time. Hence caution is still required in widespread application of the current algorithms. Even so, the appearance of clear patterns and proof of predictive power of environmental variables in the present article provides strong support and much improved confidence for successful conversion of ETRs to carbon fixation rates at a much greater spatial and temporal resolution than current 14C fixation approaches.

Supporting Information
Figure S1 Principal component analysis including the environmental data, location data and methodological information (A) and with the methodological differences excluded from the analysis (B). (TIF)
Figure S2 Comparison of Φe,C (in mol e− (mol C)−1) calculated with an nPSII = 0.0020 mol RC (mol chl a)−1 and where nPSII was measured with oxygen flash yields (Bedford Basin) or by FRR fluorometry according to Oxborough et al. [11] (UK-OA D366 and North Sea CEND0811). Bold line represents the regression equation for all three studies combined: Φe,C (constant nPSII) = 0.6176 × Φe,C (measured nPSII) + 2.765 (R² = 0.807, n = 110, p < 0.05). Regression coefficients are shown in Table S1 in Appendix S1. (TIF)
Appendix S1 Results of PCA including methodological and location variables and assessment of the effect of variable nPSII on Φe,C. (DOCX)