Fig 1.
Flowchart of the Landsat data extraction and post-processing.
SWIR = Shortwave Infrared. SR = Surface Reflectance. Monthly gap-filled Landsat time series of the shortwave infrared band from 1982 to 2015 are shown for the AR-Vir and US-SO3 sites, where afforestation-reforestation and fire followed by regrowth, respectively, were reported in 2003. The solid and dashed lines depict the actual observations and the gap-filled data, respectively.
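The distinction between actual observations and gap-filled values can be illustrated with a minimal sketch. The caption does not state which gap-filling method was used, so the linear interpolation below is an assumption chosen only for illustration, and the function name `gap_fill_linear` and the sample values are hypothetical.

```python
import numpy as np

def gap_fill_linear(series):
    """Fill NaN gaps by linear interpolation in time.

    Linear interpolation is an illustrative assumption, not necessarily
    the gap-filling method used for the Landsat series in the figure.
    Returns the filled series and a boolean mask of real observations.
    """
    series = np.asarray(series, dtype=float)
    observed = ~np.isnan(series)          # True where a real observation exists
    t = np.arange(series.size)            # monthly timestep index
    filled = series.copy()
    filled[~observed] = np.interp(t[~observed], t[observed], series[observed])
    return filled, observed

# Hypothetical monthly SWIR surface reflectance with missing months
swir = np.array([0.20, np.nan, np.nan, 0.26, 0.25, np.nan, 0.21])
filled, observed = gap_fill_linear(swir)
```

The `observed` mask plays the role of the solid line (real data), and its complement marks the values that would be drawn as the dashed line.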
Fig 2.
Flowchart of the proposed LSTM approach.
Figure adapted from [61]. Each timestep is a monthly observation within the period 1982 to 2015. Landsat surface reflectances correspond to the seven spectral bands of the Landsat product, i.e., blue, green, red, near infrared, shortwave infrared 1, shortwave infrared 2, and thermal infrared. Climate corresponds to air temperature, precipitation, global radiation, and vapor pressure deficit. At each timestep, an LSTM layer containing a set of cells, or hidden neurons (10, 20, or 30), processes information from the current and all previous observations. Net ecosystem exchange was therefore predicted at each monthly timestep using information from both current and previous observations. The loss function was calculated only when net ecosystem exchange observations were available, i.e., during the measurement periods of the LaThuile and FLUXNET2015 datasets.
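Restricting the loss to timesteps with available net ecosystem exchange observations can be sketched as a masked loss. The caption does not name the loss function, so the mean squared error used here is an assumption, and `masked_mse` and the sample values are hypothetical.

```python
import numpy as np

def masked_mse(pred, target):
    """Mean squared error over timesteps where the target is observed.

    Timesteps without an NEE observation are encoded as NaN and ignored.
    The choice of MSE is an assumption made for illustration.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = ~np.isnan(target)              # True only inside the measurement period
    if not mask.any():
        return 0.0                        # no observations: nothing to penalize
    return float(np.mean((pred[mask] - target[mask]) ** 2))

# Predictions exist for every month; observations only for the last two
pred = np.array([0.5, 1.0, -2.0, 3.0])
nee_obs = np.array([np.nan, np.nan, -1.0, 2.0])
loss = masked_mse(pred, nee_obs)
```

In this arrangement the early timesteps still feed the recurrent state, but only the months with flux measurements contribute to the loss.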
Table 1.
Design of the factorial experiment.
X means that the variant was used to study the respective topic of each row. LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of air temperature (Tair), precipitation (P), global radiation (Rg), and vapor pressure deficit (VPD); LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data.
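The three altered inputs described in this caption can be sketched for a single band stored as a `(n_years, 12)` array. This is a minimal illustration under stated assumptions: the function names are hypothetical, "annual mean" is taken to mean the mean of each individual year, and the permutation is a single shared shuffle of timesteps, which is one way to destroy temporal order while keeping predictor-target pairs together.

```python
import numpy as np

def mean_seasonal_cycle(x):
    """Replace every year by the multi-year mean of each calendar month."""
    return np.tile(x.mean(axis=0), (x.shape[0], 1))

def annual_mean(x):
    """Replace every month of a year by that year's mean (assumed definition)."""
    return np.repeat(x.mean(axis=1, keepdims=True), x.shape[1], axis=1)

def permute_time(predictors, target, rng):
    """Shuffle timesteps with one shared permutation.

    Temporal order is lost, but each predictor value stays paired with
    the target value from the same timestep.
    """
    order = rng.permutation(target.shape[0])
    return predictors[order], target[order]

# Hypothetical band values: 2 years x 12 months
x = np.arange(24, dtype=float).reshape(2, 12)
x_msc = mean_seasonal_cycle(x)
x_annual = annual_mean(x)

rng = np.random.default_rng(0)
flat = x.reshape(-1)
pred_perm, target_perm = permute_time(flat, 2.0 * flat, rng)
```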
Fig 3.
Illustration of the Landsat time series temporal structures used in the different LSTM model set-ups, shown for the SWIR band only and for the period 1990-2015.
SWIR = Shortwave Infrared. SR = Surface Reflectance. The US-SO3 site, where fire followed by regrowth was reported in 2003, is shown.
Table 2.
List of predictors used in the different model set-ups.
Tair = air temperature, P = precipitation, Rg = global radiation, and VPD = vapor pressure deficit. LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of Tair, P, Rg, and VPD; LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data.
Fig 4.
Scatterplots of fluxes observed by eddy covariance against LSTM-modeled fluxes for the seasonal cycle (Fig 4a), seasonal anomalies (Fig 4b), across-site variability (Fig 4c), and interannual anomalies (Fig 4d). The modeled estimates are derived from the ensemble mean of the 50 model runs.
Table 3.
Nash-Sutcliffe modeling efficiency of the LSTM set-up per vegetation type and climate region (ensemble mean ± sd of the 50 model runs).
Statistics for the anomalies were not calculated for the arid and tropical climates (reported as NA) because no site had at least 3 years of complete data after data quality control. The savanna vegetation type includes both savanna and woody savanna sites.
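For reference, the Nash-Sutcliffe modeling efficiency is commonly defined as NSE = 1 − Σ(obs − pred)² / Σ(obs − mean(obs))², so that 1 indicates a perfect fit and 0 indicates no improvement over predicting the observed mean. The sketch below implements that standard definition. Summarizing the runs by the mean and standard deviation of per-run NSE values is one possible reading of "ensemble mean ± sd" and is an assumption, as are the function name and the sample values.

```python
import numpy as np

def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency, ignoring pairs where either value is NaN."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(pred))
    obs, pred = obs[valid], pred[valid]
    sse = np.sum((obs - pred) ** 2)            # squared model error
    sst = np.sum((obs - obs.mean()) ** 2)      # variance of observations
    return float(1.0 - sse / sst)

obs = np.array([1.0, 2.0, 3.0, 4.0])

# Two hypothetical model runs standing in for the 50-run ensemble
runs = np.array([
    [1.5, 2.0, 3.0, 3.5],
    [1.0, 2.0, 3.0, 4.0],
])
nse_runs = np.array([nash_sutcliffe(obs, run) for run in runs])
nse_mean, nse_sd = nse_runs.mean(), nse_runs.std()
```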
Table 4.
Nash-Sutcliffe modeling efficiency of the proposed approach compared with the other model set-ups (ensemble mean ± sd of the 50 model runs).
LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of air temperature (Tair), precipitation (P), global radiation (Rg), and vapor pressure deficit (VPD); LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data.
Fig 5.
Effect of altering n years of the predictors on predicted monthly NEE for deciduous and evergreen forests.
Average absolute residuals between monthly NEE predicted with no year altered in the predictors and monthly NEE predicted with year$_{i-n}$ altered in the predictors for deciduous and evergreen forests (Fig 5a and b, respectively). The absolute residuals for the mean seasonal cycle are also reported (Fig 5c and d for deciduous and evergreen forests, respectively). "1 year" means that only the last year was altered, "2 years" means that the last two years were altered, and so on. Months for the sites located in the Southern Hemisphere have been adjusted to match the seasonal cycle of the sites in the Northern Hemisphere.
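The quantity plotted here, the average absolute difference between baseline and altered predictions per calendar month, can be sketched as follows. The function names and sample values are hypothetical, and the six-month shift used to align Southern Hemisphere sites is an assumption: the caption states only that months were adjusted, not by how much.

```python
import numpy as np

def monthly_mean_abs_residual(pred_base, pred_altered, months):
    """Average |baseline - altered| prediction for each calendar month (1..12).

    Months with no data are returned as NaN.
    """
    abs_res = np.abs(np.asarray(pred_base, dtype=float)
                     - np.asarray(pred_altered, dtype=float))
    months = np.asarray(months)
    return np.array([
        abs_res[months == m].mean() if np.any(months == m) else np.nan
        for m in range(1, 13)
    ])

def align_southern_months(months, shift=6):
    """Shift calendar months so austral seasons line up with boreal ones.

    The default shift of 6 months is an assumption for illustration.
    """
    return (np.asarray(months) - 1 + shift) % 12 + 1

# Hypothetical predictions: baseline vs. predictors with altered years
res = monthly_mean_abs_residual([1.0, 2.0, 3.0], [0.0, 4.0, 3.0], [1, 1, 2])
shifted = align_southern_months([1, 7, 12])
```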
Fig 6.
Nash-Sutcliffe modeling efficiency comparison between the proposed LSTM-based models and the other model set-ups for (a) deciduous and (b) evergreen forests. Nash-Sutcliffe modeling efficiency values have been calculated based on the ensemble mean ± sd of the 50 model runs. LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of air temperature (Tair), precipitation (P), global radiation (Rg), and vapor pressure deficit (VPD); LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data.
Fig 7.
Mean seasonal variation of NEE residuals for LSTM, LSTMperm, LSTMmsc, and LSTMannual models for (a) deciduous and (b) evergreen forests. The NEE residuals are defined as $\mathrm{NEE}^{\mathrm{res}}_{i,j} = [\mathrm{NEE}^{\mathrm{obs}}_{i,j} - \mathrm{mean}(\mathrm{NEE}^{\mathrm{obs}}_{i})] - [\mathrm{NEE}^{\mathrm{pred}}_{i,j} - \mathrm{mean}(\mathrm{NEE}^{\mathrm{pred}}_{i})]$, where $i$ is a unique Fluxnet site and $j$ is a monthly observation. Residual estimates have been calculated based on the ensemble mean ± sd of the 50 model runs. LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of air temperature (Tair), precipitation (P), global radiation (Rg), and vapor pressure deficit (VPD); LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data. Months for the sites located in the Southern Hemisphere have been adjusted to match the seasonal cycle of the sites in the Northern Hemisphere.
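The residual definition in this caption subtracts each site's mean from both the observed and the predicted NEE before differencing, which removes any constant site-level bias. A minimal sketch, with a hypothetical function name and invented values:

```python
import numpy as np

def centred_residuals(obs, pred, site):
    """Per-site centred residuals:
    [obs_ij - mean(obs_i)] - [pred_ij - mean(pred_i)] for site i, month j.
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    site = np.asarray(site)
    res = np.empty(obs.shape, dtype=float)
    for s in np.unique(site):
        idx = site == s                        # all monthly records of one site
        res[idx] = (obs[idx] - obs[idx].mean()) - (pred[idx] - pred[idx].mean())
    return res

# Two hypothetical sites with two monthly records each
site = np.array(["A", "A", "B", "B"])
nee_obs = np.array([1.0, 3.0, 10.0, 14.0])
nee_pred = np.array([2.0, 2.0, 11.0, 13.0])
residuals = centred_residuals(nee_obs, nee_pred, site)
```

By construction the residuals sum to zero within each site, so what remains reflects errors in the shape of the seasonal cycle rather than in the site mean.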
Fig 8.
Model residuals per age class for LSTM, LSTMperm, LSTMmsc, LSTMannual, and RF models based on site-average NEE.
LSTM = LSTM model using the full depth of the Landsat time series and climate data; LSTMperm = LSTM model in which the temporal patterns of both the predictive and the target variables were randomly permuted while the instantaneous relationships between predictive and target variables were kept; LSTMmsc = LSTM model in which the Landsat time series for each band were replaced by their mean seasonal cycle, while using the actual values of air temperature (Tair), precipitation (P), global radiation (Rg), and vapor pressure deficit (VPD); LSTMannual = LSTM model in which the Landsat time series for each band were replaced by their annual mean, while using the actual values of Tair, P, Rg, and VPD; RF = Random Forest model using the actual values of the Landsat time series and climate data; LSTMage = LSTM + forest age as a predictive variable; LSTMyoung = LSTM only trained with forests younger than 40 years.