Bigleaf—An R package for the calculation of physical and physiological ecosystem properties from eddy covariance data

We present the R package bigleaf (version 0.6.5), an open source toolset for the derivation of meteorological, aerodynamic, and physiological ecosystem properties from eddy covariance (EC) flux observations and concurrent meteorological measurements. A ‘big-leaf’ framework, in which vegetation is represented as a single, uniform layer, is employed to infer bulk ecosystem characteristics top-down from the measured fluxes. Central to the package is the calculation of a bulk surface/canopy conductance (Gs/Gc) and a bulk aerodynamic conductance (Ga), with the latter including formulations for the turbulent and canopy boundary layer components. The derivation of physical land surface characteristics such as surface roughness parameters, wind profile, aerodynamic and radiometric surface temperature, surface vapor pressure deficit (VPD), potential evapotranspiration (ET), imposed and equilibrium ET, as well as vegetation-atmosphere decoupling coefficients, is described. The package further provides calculation routines for physiological ecosytem properties (stomatal slope parameters, stomatal sensitivity to VPD, bulk intercellular CO2 concentration, canopy photosynthetic capacity), energy balance characteristics (closure, biochemical energy), ancillary meteorological variables (psychrometric constant, saturation vapor pressure, air density, etc.), customary unit interconversions and data filtering. The target variables can be calculated with a different degree of complexity, depending on the amount of available site-specific information. The utilities of the package are demonstrated for three single-level (above-canopy) eddy covariance sites representing a temperate grassland, a temperate needle-leaf forest, and a Mediterranean evergreen broadleaf forest. The routines are further tested for a two-level EC site (tree and grass layer) located in a Mediterranean oak savanna. The limitations and the ecophysiological interpretation of the derived ecosystem properties are discussed and practical guidelines are given. The package provides the basis for a consistent, physically sound, and reproducible characterization of biometeorological conditions and ecosystem physiology, and is applicable to EC sites across vegetation types and climatic conditions with minimal ancillary data requirements.

Introduction from a single, homogenous plane. This approach requires little site-specific ancillary information, is widely applicable across sites, and has been shown to give meaningful results within its limits of applicability and validity [35,36]. Bulk ecosystem properties derived with a top-down 'big-leaf' approach are thus commonly presented in EC studies and have proven useful in characterizing vegetation behavior in various ecosystems and under contrasting conditions [10,29,[37][38][39][40][41].
Despite their relevance for global change research and their widespread appearance, little effort has been put into the development of harmonized calculation protocols for these quantities, and as a consequence, calculated metrics are often not easily comparable, especially with respect to the wide variety of existing methodologies and formulations (e.g. [42]). In this paper, we describe the R package bigleaf, which provides functions to infer G a , G s and further physical as well as physiological bulk ecosystem properties from EC data and concurrent meteorological measurements in a consistent and standardized manner. In the following, the main equations are presented and their use is demonstrated for four contrasting EC sites. The limitations of the calculations, arising from methodological constraints and inherent limitations of the 'big-leaf' approach, as well as the consequences for the interpretation of the resulting variables, are discussed. The paper ends with practical guidelines on how to use the bigleaf package.

Package design and availability
The bigleaf package is entirely written in the open source software R [43]. The package is available as a stable version from CRAN (https://cran.r-project.org/web/packages/bigleaf) or as a development version (continously updated with git version control) from http://www. bitbucket.org/juergenknauer/bigleaf. This paper describes package version 0.6.5 (git commit: fcada22). An overview of the main functions is illustrated in Fig 1. In the following, the theory underlying the package's key functions is shortly presented. For technical details on the functions, the reader is directed to the functions' help pages and examples therein.

The 'big-leaf' framework
All functions provided in this package are based on the 'big-leaf' framework (Fig 1) [44], which assumes that a single plane located at height d + z 0h (d = displacement height, z 0h = roughness length for heat) is the single source and sink of all mass and energy fluxes, and that wind speed is zero at height d + z 0m (z 0m = roughness length for momentum) and increases exponentially with height. This approach does not distinguish fluxes from different compartments of the ecosystem (e.g. soil and vegetation), nor does it account for vertical variations within the canopy or horizontal heterogeneity due to e.g. different species. The derived quantities at the 'big-leaf' surface must thus be regarded as average (but representative) conditions of the tower footprint. The main principle of the bigleaf package is to derive ecosystem surface properties from the observations using a top-down (inversion) approach.
are only meaningful if certain meteorological conditions are met (e.g. daytime or rainfree periods, see below).
The package offers a basic data filtering routine (function filter.data()), which filters EC data based on the aforementioned criteria. The function consists of two parts: 1) Quality control: data points of bad quality (e.g. gap-filled with poor confidence) are discarded, and 2) Meteorological filtering: variables falling out of the (purpose-specific) accepted range (e.g. nighttime values, precipitation events) are filtered out. The filter.data() function returns the input data frame in wich time periods that do not fulfill the filter criteria are set to NA.
Constants, unit interconversions, and sign convention. The package combines all required constants into one list that can be evoked by calling bigleaf.constants(). This list is passed as a default argument to all functions that use one or more constants. Thus, individual constants do not have to be provided for any function call, but can be changed by calling the argument explicitly. As a basis for many calculation steps, common unit interconversions are provided: • Conductances between mass and molar units (m s −1 and mol m −2 s −1 ) • Water fluxes between mass and energy units (kg m −2 s −1 and W m −2 ) • Carbon fluxes between mass and molar units (g C m −2 d −1 and μmol CO 2 m −2 s −1 ) • Atmospheric humidity between vapor pressure deficit (kPa), vapor pressure (kPa), specific humidity (kg kg −1 ), and relative humidity • Radiation between energy and molar units (W m −2 and μmol m −2 s −1 ) The sign convention is that fluxes directed away from the surface are positive and those directed toward the surface are negative. Thus, negative net CO 2 ecosystem exchange (NEE) values indicate a net uptake of CO 2 by the ecosystem.
Meteorological variables. Most of the central functions in the bigleaf package require meteorological variables that are not commonly provided by the processed EC products, but which can be readily calculated from standard meteorological variables like air temperature, humidity, and atmospheric pressure. For reasons of space, the individual formulations are not presented here, instead the user is directed to the help page of the respective function and the references therein. All functions apply textbook calculations and include: • latent heat of vaporization: latent.heat.vaporization(T a ) • psychrometric constant: psychrometric.constant(T a , p) • saturation vapor pressure and slope of the saturation vapor pressure curve: Esat.slope (T a ) • air density: air.density(T a , p) • virtual temperature: virtual.temp(T a , q) • wet-bulb temperature: wetbulb.temp(T a , p, D a ) where T a is the air temperature (˚C), p is the atmospheric pressure (kPa), q is the specific humidity (kg kg −1 ), and D a is the vapor pressure deficit (kPa). If p is not available, it can be approximated by the hypsometric equation as a function of site elevation (pressure. from.elevation()).
Aerodynamic conductance. Aerodynamic conductance to heat transfer (G ah ) is central to the 'big-leaf' concept and multiple formulations have been proposed. G ah can be written as where R am is the aerodynamic resistance to momentum transfer with turbulence as the principal transport mechanism, and R bh is the canopy (quasi-laminar) boundary layer resistance ("excess resistance") to heat transfer, which is characterized by molecular diffusion as the dominant transport mechanism [46, 47]).
At EC sites, G am can be calculated directly as (e.g. [46, 48](aerodynamic.conductance()): where u Ã is friction velocity (m s −1 ) and u(z r ) is wind speed (m s −1 ) at reference (=measurement height)(m). Eq 2 implicitly accounts for the effects of atmospheric stability on G am . Nevertheless, an alternative and frequently used formulation is provided, which explicitly accounts for the effects of atmospheric stability ([46]): where k is the von Kármán constant (0.41), d is the zero plane displacement height (m), z 0m is the roughness length for momentum (m), and ψ h is the integrated form of the stability correction function for heat and water vapor. ψ h is a function of the atmospheric stability parameter z = (z r − d)/L, where L is the Monin-Obukhov length. The function stability. correction() can be used to calculate ψ h based on formulations suggested by [49] or [50]. The two roughness parameters d and z 0m have to be determined a priori. The function roughness.parameters() provides three options: 1) an empirical approach assuming d and z 0m as constant fractions of canopy height z h (by default d = 0.7z h and z 0m = 0.1z h ), 2) a semi-empirical approach estimating both z 0m and d based on z h and leaf area index (LAI) according to [51] for data presented in [52], and 3) an approach that calculates z 0m from the logarithmic wind profile equation with a prescribed d. Note that d and z 0m , as well as all other ancillary variables (e.g. LAI), can be provided as time-varying vectors with the same length as the input data frame. Multiple formulations have been suggested for the calculation of the canopy (quasi-laminar) boundary layer conductance to heat transfer (G bh ), which range from empirical to physically-based (see [53,54] for an overview).
[55] suggested a simple empirical relationship between G bh and u Ã (Gb.Thom()): Several further (semi-) empirical formulations have been suggested, but we restricted the functions to those best applicable to EC sites. In that respect, relationships based on the Reynolds number, which have been found to show a biphasic behavior [56], are currently not implemented. More mechanistic, but also parameter-rich approaches commonly require LAI and aerodynamically-relevant foliage characteristics (leaf width or leaf characteristic dimension). The formulation suggested by [51] is given by (Gb.Choudhury()): where α is an attenuation coefficient modeled in dependence on LAI according to data presented in [57], u(z h ) is wind speed (m s −1 ) at canopy height z h , and w is leaf width (m). Wind speed at height z h (or any other height z > d + z 0m can be estimated from the logarithmic wind profile equation (wind.profile()): where ψ m is the integrated form of the stability correction function for momentum (as calculated in stability.correction()).A third model currently implemented in the bigleaf package was developed by [47] and simplified by [58](Gb.Su()): where C d is a foliage drag coefficient (assumed constant with a value of 0.2 [47]), f c is fractional vegetation cover, C t is a heat transfer coefficient, and B À 1 s is the inverse Stanton number for bare soil surface [58]. C t mainly depends on the leaf characteristic dimension and the number of leaf sides participating in heat transfer, see [47] and [58] for details. The denominator of Eq 7 is often referred to as the kB À 1 h parameter (e.g. [54]), which is defined as: From Eq 8 the roughness length for heat (z 0h ) can be determined. Note that G am is identical for different scalars in the atmosphere (heat, water vapor, CO 2 , and other trace gases), whereas G b differs with respect to the quantity of interest. G b of quantity x can be calculated based on G bh [59]: where Pr is the Prandtl number (0.71), and Sc x is the Schmidt number for quantity x. For simplicity, the assumption is made that G b is identical for heat and water vapor transfer (i.e. G bh = G bw ). The more realistic difference of a few percent [59] is considered small compared to other uncertainties (see also [60]). Since the calculations of G am and G bh are independent, the bulk aerodynamic conductance to heat transfer (G ah ) can be calculated as the sum of the inverse versions of Eqs 2, 3 and 4-7. The main function aerodynamic.conductance() returns G am , G ah , G bh , G ac (aerodynamic conductance to CO 2 transfer), G bc , the corresponding resistances, and kB À 1 h , z, as well as ψ h . If one or more additional Schmidt numbers are provided, G a and G b are calculated for the respective quantities as well. Due to the modular structure of the functions, each of these components can also be calculated individually.
Surface conditions. EC measurements are accompanied by meteorological measurements taken at approximately the same height as the flux measurements, usually several meters above the canopy. If G a is determined, the bulk transfer relations can be inverted and solved for the surface variable [23, 29]((surface.conditions())): where H is the sensible heat flux (W m −2 ), ρ is the air density (kg m −3 ), c p is the heat capacity of dry air (J K −1 kg −1 ), e is vapor pressure (kPa), λE is the latent heat flux (W m −2 ), γ is the psychrometric constant (kPa K −1 ), e sat is the saturation vapor pressure, D is the vapor pressure deficit (kPa), and C is the CO 2 concentration. Subscripts a and s denote air and surface, respectively. Note that in Eqs 10-13 "surface conditions" refer to the notional canopy surface. It is also possible to infer conditions in the intercanopy airspace by replacing G ah in Eqs 10 and 11 or G ac in Eq 13 with G am . The function surface.conditions() returns T s , e sat (T s ), e s , D s , q s , rH s , and C s . This method can be applied to other atmospheric constituents measured at EC sites (e.g. methane, nitrogen oxides, ozone), provided that the corresponding G a is known (see above). An alternative estimate of surface temperature is based on the physical principle that any object emits longwave radiation in dependence of its temperature as described by the Stephan-Boltzmann relation. This radiometric surface temperature (T r , in Kelvin) is given by (e.g. [61], radiometric.surface.temp()): where LW " and LW # are longwave upward and longwave downward radiation (W m −2 ), respectively, σ is the Stefan-Boltzmann constant (W m −2 K −4 ), and is the emissivity of the surface. Surface conductance. Surface conductance to water vapor (G sw in m s −1 ), describes the conductance of the entire surface, i.e. including soil and plant canopy components. It is commonly calculated by inverting the Penman-Monteith (PM) equation (surface. conductance()): where s is the slope of the saturation vapor pressure curve (kPa K −1 ), R n is the net radiation (W m −2 ), G is the ground heat flux (W m −2 ), and S is the sum of all energy storage fluxes (W m −2 ). Eq 15 implicitly assumes that G a for water vapor equals G a for heat, i.e. G ah = G aw which corresponds to an amphistomatous vegetation where the transfer of both heat and water vapor occurs at both leaf sides. The hypostomatous case (water vapor transfer from one side only) is conceptually not straightforward at the canopy level [22,42], and is thus currently not implemented in this package. Eq 15 further assumes that the energy balance is closed (i.e. R n − G − S = λE + H). The derived G sw and all subsequent derivations are sensitive to violations of this assumption [23,62]. The function surface.conductance() offers the calculation of G sw according to Eq 15, and a simplified (but also less realistic) formulation based on a simple fluxgradient approach, which assumes infinite G ah : G sw = λE/λ/(D a /p). This formulation is equivalent to the one proposed by [63].
Vegetation-atmosphere decoupling. With both G ah and G sw available, the degree of aerodynamic decoupling between the land surface and the atmosphere can be assessed with the decoupling coefficient O, which takes values between 0 and 1. Low values indicate well-coupled conditions and a high degree of physiological control on ET. Values close to 1 indicate the opposite, i.e. poorly coupled conditions and a low sensitivity of ET to G sw [21,22]. In its simplest and most commonly used form, O is given by [21] (decoupling()): Eq 16 was modified by [64], who included the effects of radiative coupling between the vegetation and the atmosphere: where G r is the longwave radiative transfer conductance of the canopy (m s −1 ), calculated as G r ¼ 4sT 3 a LAI=c p (longwave.conductance()). Note that, as in the PM equation (Eq 15), Eqs 16 and 17 assume that the vegetation is amphistomatous [21].
Imposed and equilibrium evapotranspiration. The concept of decoupling is often used to characterize physiological and energy controls on transpiration. In addition it can help to quantify radiation and VPD controls on λE (e.g. [65]). λE can be written in an alternative way [21](equilibrium.imposed.ET()): where and Eqs 19 and 20 are derived directly from the PM equation by letting G ah approach 0 or 1, respectively. Thus, λE eq is the λE rate that would occur if the surface was completely decoupled from the atmosphere. In this case, λE is strongly controlled by R n . Likewise, λE imp can be interpreted as the λE rate that would occur under fully coupled conditions, in which case λE is mainly dependent on G sw and D a .
Potential evapotranspiration. Potential evapotranspiration (λE pot ) is frequently used to characterize atmospheric demand and the degree of climatic aridity (e.g. [11]). Here, λE pot is by default calculated from the Priestley-Taylor equation [66] (potential.ET()): where α is the Priestley-Taylor coefficient, which accounts for large-scale advection effects. Its value is usually set to 1.26, but it likely varies with surface conditions [67]). λE pot can further be calculated from the PM equation with a prescribed G sw [6], which may correspond to typical maximum values (e.g. 95% quantile) found in the ecosystem: Energy balance. The package contains basic functionalities to characterize energy balance closure at EC sites. The function energy.closure() quantifies the energy balance closure (R n − G − S = λE + H) with both the slope method and the energy balance ratio (EBR) as described in [68]. The package further enables the calculation of biochemical energy (S p ), a small and therefore often neglected component of the energy balance: S p = αNEE, where α = 0.422 J mol −1 denotes the biochemical energy taken up/released by photosynthesis/respiration per mole of CO 2 fixed/respired [69]. The function energy.use.efficiency() provides a simple estimate of the energy use efficiency (EUE) of the ecosystem: EUE = S p /R n .
Physiological ecosystem quantities. For ecosystems that have a largely closed vegetation cover, and under conditions when canopy and soil surfaces are not wet, the derived G s can be interpreted as a proxy for the canopy-integrated stomatal conductance (i.e. canopy conductance G c ) [36]. G s may then be used to calculate additional physiological quantities. The function stomatal.slope() returns an estimate of the stomatal slope parameter G 1 at ecosystem level, analogous to g 1 at leaf level [41] (Note that in this paper, uppercase and lowercase letters denote physiological quantities at ecosystem and leaf-level, respectively). G 1 is estimated using non-linear regression from the unified stomatal model (USO) [70]: where G 0 is the minimum canopy conductance (mol m −2 s −1 ), and GPP is gross primary productivity (μmol CO 2 m −2 s −1 ). D s and C s represent conditions at the notional 'big-leaf' surface in this case (Eqs 12 and 13, respectively), but they are often replaced by the measured values at instrument height (i.e. D a and C a [41]. G 0 can either be estimated along with G 1 , or fixed to a user-defined value (e.g. set to 0). In addition to Eq 23, G 1 can be calculated from the stomatal model proposed by [71], or from its modified version suggested by [72]. Note that absolute values and units of G 1 differ across models. GPP is not directly measured at EC sites but inferred from NEE-partitioning algorithms (e.g. [73,74]). GPP is further not directly analogous to leaflevel net photosynthesis (A n ), and ecosystem leaf day respiration, if available, may be subtracted from GPP to better represent canopy-level A n [24, 75]. The package further includes several alternative water-use efficiency (WUE) metrics (WUE.metrics()) which can be calculated more readily from the measured fluxes, but which contain less physiological information [23]. Examples are WUE (= GPP/ET), inherent WUE (IWUE = (GPP D a )/ET) [12], or underlying WUE (uWUE ¼ ðGPP ffiffiffiffiffi D a p Þ=ET)/ET) [13]. Stomatal sensitivity to VPD, a relevant indicator of vegetation water-use strategy, can be characterized with the following function [76] (stomatal.sensitivity()): where the two parameters m (mol m −2 s −1 ln(kPa) −1 ) and b (mol m −2 s −1 ) represent the sensitivity of G sw to D s (D a can be used alternatively) and the reference G sw at D s of 1 kPa, respectively [6,76]. Bulk canopy intercellular CO 2 concentration (C i in μmol mol −1 ) can be inferred from Fick's first law analogously to the calculation of c i at leaf level (see e.g [24, 77], intercellular.CO2()): where C s is the CO 2 concentration at the 'big-leaf' surface (μmol mol −1 ; Eq 13), which can also be approximated by C a . G sc denotes the surface conductance to CO 2 (mol CO 2 m −2 s −1 ) and is calculated as G sc = G sw /1.6.
With C i available, the 'big-leaf' concept may be further expanded to calculate an estimate of basic photosynthetic parameters such as the maximum carboxylation rate (V cmax ) and maximum electron transport rate (J max ) at canopy level (e.g. [24,26,78], photosynthetic. capacity()). The calculation is once more analogous to that at leaf level, where commonly the model developed by [79] is employed. Note however, that especially for V cmax and J max the interpretation differs from that at leaf level (see Discussion). From the Rubisco-limited photosynthesis rate (when carboxylation is the rate limiting process i.e. GPP = GPP c , usually under high radiation), V cmax (μmol m −2 s −1 ) can be calculated as: where K c (μmol mol −1 ) and K o (mmol mol −1 ) are the Michaelis-Menten constants for CO 2 and O 2 , respectively, O i (mol mol −1 ) is the O 2 concentration, and Γ Ã (μmol mol −1 ) is the photorespiratory CO 2 compensation point (μmol mol −1 ). All photosynthetic parameters and their temperature responses (activation energies) are taken from [80] and assume infinite mesophyll conductance to CO 2 transfer. Under conditions when Ribulose 1,5-bisphosphate (RuBP)regeneration is limiting photosynthesis (i.e. GPP = GPP j ), the electron transport rate J (μmol m −2 s −1 ) is given by: J max is then calculated from the following relation: where APPFD PSII is absorbed photosynthetic photon flux density (PPFD) by photosystem II (μmol m −2 s −1 ), and Θ is a curvature parameter. APPFD PSII is currently assumed to be a constant fraction of PPFD (by default APPFD PSII = 0.8PPFD), but a more realistic estimate of APPFD, depending on solar elevation angle and LAI, will be implemented in the future. Bulk canopy photosynthesis is assumed to be limited by either Rubisco activity (GPP = GPP c ) or RuBP-regeneration (GPP = GPP j ) at high and low radiation, respectively, and simple radiation thresholds are applied to separate the two limitation states. V cmax and J max are temperaturedependent and are normalized to the reference temperature of 25˚C (i.e. V cmax,25 and J max,25 ) using a modified Arrhenius equation as described in e.g. [81] with default parameter values from [80] and [82]. Ecosystem light response curves (LRCs) are useful to characterize both the CO 2 uptake rate at light saturation as well as the light utilization efficiency (i.e. the initial slope). The most frequently used model is the rectangular hyperbolic LRC, which can be written in a general form as [83] (light.response()): where α is the initial slope of the light-response curve (μmol CO 2 m −2 s −1 (μmol quanta m −2 s −1 ) −1 ), R eco is ecosystem respiration (μmol CO 2 m −2 s −1 ), and PPFD ref is the PPFD value at which GPP ref (μmol CO 2 m −2 s −1 ) is calculated (usually at saturating light, e.g. at 2000 μmol m −2 s −1 ). Additionally, a simple light-use efficiency (LUE) metric, defined as the ratio of cumulative GPP to cumulative PPFD, is available in the package (light.use.efficiency()).

Single-level EC sites
Three sites with EC measurements at a single level above the canopy were chosen for the demonstration of the formulations described above: AT-Neu (Neustift), a managed grassland in Austria [84], DE-Tha (Tharandt), a high-statured (mean canopy height = 26.5m) spruce forest in Eastern Germany [85], and FR-Pue (Puechabon), a Mediterranean evergreen oak forest in southern France, which is subject to seasonal water stress [86]. The location as well as basic ecosystem properties for these sites are listed in Table 1 . Seasonal courses of G s , G a and vegetation-atmosphere decoupling. We calculated seasonal dynamics of aerodynamic and surface conductance to water vapor, as well as the decoupling coefficient O (Fig 2). The results reveal that G ah is relatively constant over the course of the year, but differs in magnitude across sites. As expected, highest values can be found in the aerodynamically rough spruce forest DE-Tha, and lowest values in the meadow AT-Neu. FR-Pue shows intermediate values. Differences between the different G ah versions result from different models of the bulk boundary layer conductance (G bh ; Eqs 4-7). The different G bh formulations agree well for AT-Neu and FR-Pue, but lead to clear differences in estimated G ah for DE-Tha. This is likely because the Choudhury (Eq 5) and Su (Eq 7) models consider additional aerodynamic properties (e.g. leaf size, LAI) that are neglected in the Thom model (Eq 4). Thus, accounting for the low leaf characteristic dimension / leaf width and high LAI in DE-Tha leads to a higher G ah in the Su and especially in the Choudhury formulation compared to the Thom model. The differences in G ah among the formulations do not have strong effects on the derived G sw and O. G sw shows pronounced seasonal dynamics at all three sites. Lowest values correspond to inactive vegetation, as e.g. caused by soil water stress (DOY 190-240 in FR-Pue). The dynamics in G sw are clearly reflected in O, the magnitude of which differs considerably across sites. AT-Neu (grassland) is relatively poorly coupled, whereas DE-Tha (forest) shows a high degree of coupling. All three sites show typical values for the respective vegetation type [87]. The bigleaf R package Surface conditions. Fig 3 depicts mean diurnal courses of air temperature, vapor pressure, VPD, and CO 2 concentration and the respective surface variables as calculated from Eqs 10-13 for the summer months June, July, and August (JJA) of all available site years. At all three sites, aerodynamic surface temperature T s (Eq 10) exceeds air temperature at daytime and is lower at nighttime. T s -T a is largely parallel to the course of H throughout the day. The inferred temperature difference depends not only on the magnitude of H, but also on G ah . It follows that the grassland AT-Neu has a more pronounced temperature difference for the same H than the forest DE-Tha owing to its lower efficiency to transfer heat to the atmosphere (i.e. lower G ah ). Temperature gradients are most pronounced at FR-Pue (approx. 4˚C at midday) where a large fraction of the available energy goes into H. Radiometric surface temperature (T r ; Eq 14) generally agrees well with T s , but shows biases at some timeperiods (e.g. AT-Neu at night). Differences between T s and T r can be caused by inappropriate emissivity values, biases in the estimated G ah , or differences in the spatial representativeness of radiation (LW " ) and flux (H) measurements. The derived vapor pressure at the 'big-leaf' surface (e s ) exceeds the measured values at instrument height (e a ) at all three sites during daytime. The water vapor gradient at AT-Neu is significantly higher than at the other two sites, which is caused by the relatively high λE and low G ah . The high e s at AT-Neu leads to a decrease of surface VPD (D s ) compared to air VPD (D a ). In contrast, the temperature effect on VPD is stronger than the moisture effect in DE-Tha and FR-Pue, with the consequence that D s exceeds D a at daytime at these two sites. Future analyses should be directed to the question whether these patterns hold across sites and vegetation types.
The difference of CO 2 concentration at the 'big-leaf' surface (C s ) to the concentration in the atmosphere (C a ) follows the diurnal pattern of NEE (Fig 3j-3l). Daytime photosynthetic CO 2 uptake and nocturnal ecosystem respiration lead to lower or higher CO 2 concentrations, respectively, at the surface compared to the air. The absolute differences are generally low Relationship between G s and GPP. Fig 4 illustrates the relationship between G sw and the "stomatal index", i.e. GPP adjusted for VPD and CO 2 concentration [41] for the year 2012. The relationship between these two quantities characterizes intrinsic WUE (iWUE) at ecosystem level and provides essential information on the physiological basis of ecosystem WUE. The slope of the depicted relationship approximates the G 1,USO parameter ("stomatal slope") with higher slopes corresponding to a lower iWUE. Points in Fig 4 are colored according to the C i /C s ratio, which is again closely related to iWUE. High C i /C s correspond to high stomatal slopes and lower WUE, and the opposite is the case for low C i /C s . The relationship between G sw and the "stomatal index" shows large scatter, especially at AT-Neu, which indicates variations of iWUE throughout the growing season. Such variations within one year may be caused by changes in phenology, LAI (as e.g. caused by mowing) or the onset of water stress.

Two-level EC site
The package was further applied to data from the site ES-LMa (Majadas de Tietar), where fluxes and meteorology were measured at two different heights. The site (39˚56'N; 5˚46'W, 260 m a.s.l.) is an open woodland with a tree canopy cover (mainly Quercus ilex) of about 20% [88]. Ecosystem fluxes were measured at 15.5 m above ground (7 m above tree canopy height) and grass layer fluxes were measured with a second tower at 1.65 m height. Tree fluxes were derived as the differences of the ecosystem fluxes and the grass layer fluxes similar to [28,29]. G 1,USO and uWUE were calculated for a moving window of +/-3 weeks which was shifted by one week for each calculation. This procedure was done for the ecosystem, grass layer and trees. Minimum, maximum and mean of mean daily air temperature and soil water content were calculated for the same period. The bigleaf R package Differences in G 1,USO follow clear seasonal patterns (Fig 5) depending on water availability, VPD (which follows air temperature), and the associated growth and senescence of the grass layer. Ecosystem G 1,USO is relatively constant during the growing periods of 2016 and 2017 (winter and spring). G 1,USO of the grass layer is more variable as compared to the ecosystem. This mirrors the seasonal dynamics and fast responses of the grass layer to environmental conditions. For G 1,USO of the grass layer a pronounced increase is visible before G 1,USO drops during the summer drought. The increase is due to the rapid drop in GPP as the grasses start wilting due to drying of the top soil, while λE reduces much slower due to soil evaporation from deeper layers. The subsequent drop in G 1,USO is then caused by the continuous reduction in λE during the dry period as the deeper soil layers are also drying out. Q. ilex trees are rather isohydric and react to increasing VPD by closing their stomata to reduce water losses, which results in a decreasing G 1,USO . In 2017, G 1,USO of the trees decreases more slowly compared to 2016, which is most likely caused by several rain pulses that increased the water availability and reduced VPD as compared to the long lasting dry period in 2016. G 1,USO (Fig 5a) and the uWUE (Fig 5b) show strongly anti-correlated patterns. As G 1,USO increases the uWUE The bigleaf R package reduces and vice versa. The trees are able to strongly increase their uWUE as atmospheric humidity and soil water availability are reduced. Tables 2 and 3 present physical and physiological   , and rainfree periods (24h after rainfall excluded). Data were further filtered for D a > 0.01 kPa, λE > 0 W m −2 and T a > 5C. G ah was calculated according to Eqs 2 and 7, unless stated otherwise. More information on the ancillary data used for the calculations can be found under http://www.bitbucket.org/ juergenknauer/bigleaf/src/master/ancillary. Note that for this study, ancillary variables (e.g. LAI, z h , z r ) were assumed to be constant throughout all site years. In many cases, however, they vary across the growing season or among years. Thus, for a more realistic representation of the calculated ecosystem properties, required ancillary variables, if available, should be provided at an adequate temporal resolution. In general, computations in the bigleaf package are fast, e.g. with a state-of-the-art PC it takes < 0.1 seconds to calculate G s for 10 site years and 2-3 seconds to calculate all properties as shown in Tables 2 and 3.

Potential and limitations of the 'big-leaf' approach
All calculations implemented in the bigleaf package are based on the 'big-leaf' framework [35,44], which reduces the ecosystem to a single, uniform plane (Fig 1). This approach thus assumes that vegetation as well as meteorological conditions are vertically and horizontally homogenous. One advantage of the 'big-leaf' approach is that calculations require no additional information on the EC site or commonly available variables only (e.g. LAI, vegetation height). Ecosystem properties are inferred directly from EC measurements, with no assumptions on the underlying ecosystem structure. The 'big-leaf' approach is further applicable to both single-level and two-level EC systems. In the latter case ecosystem properties can be The bigleaf R package derived for two 'big leafs', e.g. whole ecosystem and understory [28,29] or whole ecosystem and grass layer (this study, Fig 5).
It is important to clarify that the bigleaf package exclusively applies a top-down approach, in which the 'big-leaf' framework is used to estimate ecosystem properties inversely from the measured fluxes. The package does not provide bottom-up model formulations, which apply a 'big-leaf' framework to up-scale simulated fluxes from leaf-to canopy-level. This up-scaling approach has been shown to be prone to integration errors [31,89]. However, this type of error does not apply to the calculations in the bigleaf package because the 'big-leaf' framework is solely used for the derivation of bulk ecosystem properties and no up-or downscaling is performed.
The fact that the top-down 'big-leaf' approach as applied in this package can only derive bulk ecosystem properties is also its most critical limitation. It is not possible to resolve the vertical distribution of the derived properties. For example, soil and vegetation components cannot be distinguished and the resulting properties will inevitably contain signals from both the soil and the vegetation. These drawbacks can only be circumvented by modeling approaches such as two-layer (soil/canopy) [33,51] or dual-source (sun/shade) models [31], which attempt to resolve the flux contribution of different canopy fractions or ecosystem compartments. These alternative modeling frameworks are more complex and consequently require additional site-specific information (e.g. canopy clumping, canopy nitrogen profiles, etc.). They are thus mostly applied to a few sites where these additional model parameters are sufficiently well known (e.g. [90,91]). The 'big-leaf' framework is thus most suitable for multi-site comparisons or for sites where little ancillary information is available, and where no detailed knowledge on the derived variable (e.g. canopy gradients) is required.

Interpretation of the derived physiological properties
The bigleaf package provides functions to calculate ecosystem-scale physiological variables such as G s , G 1 , C i , V cmax , J max , and GPP ref in the same manner as it is commonly done at leaflevel. Important in this context is that the interpretation of these bulk canopy variables is not as straightforward as that of their leaf-level analogues (see also [23]). This is due to 1) conceptual uncertainties (as discussed above), and 2) the presence of confounding physical factors. For instance, the intensity of the before-mentioned mixing of soil and vegetation signals increases with a decrease of vegetation density (i.e. LAI) of the ecosystem. [36] for instance showed that G c is substantially overestimated in ecosystems with an LAI less than approx. 2. This does not mean that the calculation of G s is meaningless in low-LAI ecosystems, but its physiological interpretation as G c is increasingly compromised as vegetation cover decreases. For ecosystems with an LAI lower than 2-3, the inversion of a soil/canopy model [33] is likely more appropriate than the inversion of the 'big-leaf' model for the derivation of physiological variables.
In all ecosystems, confounding physical factors, which are non-existent or negligible at leaflevel, must be taken into account in order to extract a meaningful physiological signal. For example, evaporation (i.e. water fluxes not under plant control) occurring after rainfall will lead to an overestimation of the stomatal slope parameter G 1 , and thus to an underestimation of WUE, if such time-periods are not filtered out (see [23] for an overview of confounding factors and their associated uncertainties).
In general, uncertainties of physiological variables propagate with each calculation step. For example, C i as calculated by Eq 25 is affected by uncertainties in both input variables G s and GPP. Photosynthetic parameters are affected by the same uncertainties and in addition by assumptions made for their calculation. It follows that with increasing number of calculation steps following the derivation of G s , uncertainties increase and the meaningfulness of the derived variables depends critically on the applied data filtering and the quality of the (original or partitioned) data.
As discussed above, all physiological variables are integrated over the entire canopy and represent bulk canopy properties (expressed in units per ground area instead of leaf area). They are thus not directly comparable to leaf-level measurements taken at a particular location in the canopy. The discrepancies between leaf and ecosystem values will be most pronounced for variables with a distinct profile within the canopy (e.g. V cmax and J max [31]), and probably less relevant for G 1 .

General package usage guidelines
Data filtering. For most applications, it is recommended to apply a basic data filter that removes unreliable measurements or certain meteorological conditions. The optimal type of filter depends on the purpose of the study and the variable of interest. For example, it is advisable to exclude negative λE values from the calculation of G s in order to minimize the occurrence of negative G s estimates which are not readily interpretable. Furthermore, periods outside the growing season or following rainfall should be removed if G s is interpreted in an ecophysiological context. G a and surface conditions on the other hand can in principle be calculated for all conditions. In general, data that do not fulfill the assumptions of the EC method, or that were gap-filled with low confidence, should be discarded. Depending on the filter settings and the conditions at the site, this can lead to a considerable fraction of missing values in the dataset. This is generally not a problem for the subsequent analyses in this package (missing input data simply return NA again), but some (regression-based) functions may require a minimum number of available data in order to return robust results.
Treatment of uncertainties. The derived variables in the bigleaf package are affected by several sources of uncertainty, which may be classified as 1) random errors in the measured fluxes [92,93], 2) systematic errors in the fluxes due to e.g. energy-balance non-closure, advection problems [94,95] and 3) conceptual uncertainties. The complex nature of uncertainties in EC measurements and the associated computational challenges to adequately account for and propagate all sources of uncertainty in the derived variables are the main reasons why the bigleaf package does not offer uncertainty estimates for each output interval. To account for one or more of the outlined sources of uncertainties, the use of wrapper functions is the most meaningful approach. These functions (often in specialized R packages) apply e.g. Monte Carlo (parameter sensitivity on the derived variables) or bootstrapping (random data sampling with replacement) techniques without the need to modify the functions in bigleaf. Some simple examples on the use of such wrapper functions are given in the vignette of the bigleaf package (accessible in R with browseVignettes("bigleaf").
Use of the derived properties. The majority of the derived properties in the bigleaf package are intended to be primarily diagnostic, i.e. results serve to provide a more mechanistic understanding of the observed fluxes, which enables a more comprehensive analysis and interpretation of ecosystem surface-atmosphere gas exchange. These diagnostics provide additional insights on the underlying physical or physiological processes and are often directly comparable across sites and climatic conditions. Some variables may further be helpful for the parameterization, calibration, or evaluation of bottom-up models. For that purpose, two major prerequisites must be fulfilled: (1) the variable of interest derived with a top-down (inversion) approach must be at the same organizational scale as the one calculated in the bottom-up model, and (2) the framework and the assumptions made in the two approaches must be consistent. For example, both the dynamics and magnitude of the simulated degree of atmosphere-canopy decoupling (O) by land surface models can be directly compared with the O values derived from this package [87]. This also applies to other characteristics such as G a , G s , or WUE and LUE metrics that are simulated as (emergent) bulk surface properties in models. In contrast, physiological bulk canopy parameters such as C i should not be compared to leaf-level c i values as simulated by multi-layer models. Likewise, bulk canopy or V cmax,25 cannot be used to parameterize leaf-level v cmax,25 in multi-layer models. In any case, it is imperative that uncertainties specific to the EC-method (as summarized in the previous section) are taken into account when derived properties are used for bottom-up modeling purposes.

Conclusions
The presented R package bigleaf provides a framework for the derivation of physical and physiological ecosystem properties at EC sites in a consistent and reproducible manner and with minimal requirements regarding ancillary site data. The package thus has the potential to increase the comparability of the provided calculations as well as their applicability across sites. The functions will be useful in complementing the analysis of land-atmosphere mass and energy fluxes by providing a basic level of process understanding. The availability of additional ecosystem surface characteristics as provided by the bigleaf package will be key in interpreting ever-increasing records of EC data and the responses of land-atmosphere exchange to global environmental change. The open source and version control environment further enable the continuous development of the package and encourage community input.