Predictive Models of Primary Tropical Forest Structure from Geomorphometric Variables Based on SRTM in the Tapajós Region, Brazilian Amazon

Surveying primary tropical forest over large regions is challenging. Indirect methods of relating terrain information or other external spatial datasets to forest biophysical parameters can provide forest structural maps at large scales but the inherent uncertainties need to be evaluated fully. The goal of the present study was to evaluate relief characteristics, measured through geomorphometric variables, as predictors of forest structural characteristics such as average tree basal area (BA) and height (H) and average percentage canopy openness (CO). Our hypothesis is that geomorphometric variables are good predictors of the structure of primary tropical forest, even in areas, with low altitude variation. The study was performed at the Tapajós National Forest, located in the Western State of Pará, Brazil. Forty-three plots were sampled. Predictive models for BA, H and CO were parameterized based on geomorphometric variables using multiple linear regression. Validation of the models with nine independent sample plots revealed a Root Mean Square Error (RMSE) of 3.73 m2/ha (20%) for BA, 1.70 m (12%) for H, and 1.78% (21%) for CO. The coefficient of determination between observed and predicted values were r2 = 0.32 for CO, r2 = 0.26 for H and r2 = 0.52 for BA. The models obtained were able to adequately estimate BA and CO. In summary, it can be concluded that relief variables are good predictors of vegetation structure and enable the creation of forest structure maps in primary tropical rainforest with an acceptable uncertainty.


Introduction
Deforestation and forest fragmentation by human activities are some of the main environmental problems of tropical forests. The destruction and degradation of the forest cover decreases its ecological integrity, decreasing biodiversity and negatively impacting the functioning of these systems [1,2]. Therefore, the search for indicators and models capable of detecting spatial and temporal variations of forest structure in natural environments is highly relevant, enabling natural variability to be distinguished from changes of anthropic origin. Models based on reference conditions (natural areas or with minimum impact) allow estimation of the expected vegetation over a gradient of different vegetation types in undisturbed forests. The difference between the expected characteristics of the natural vegetation for a given region and those observed may be considered an indicator of the intensity of anthropic impact on the forest cover.
Forest structure can affect several ecosystem processes [3,4,5] and biodiversity [6,7,8], as it affects nutrient cycling [9] and the availability of niches for several species [10,11]. Therefore, forest structure characteristics are likely good indicators of forest functioning and biodiversity [8,9,12,13]. In addition, vegetation structure characteristics may give information regarding biomass, which is essential for carbon stock quantification, and its monitoring allows the evaluation of the effectiveness of measures for reducing emissions from deforestation and forest degradation (REDD+).
Average tree basal area (BA) and height (H) of trees and average percentage of canopy openness (CO) are often used to describe forest structure. Together, these parameters can offer a general view of the vegetation structure, which can be determined from different environmental factors, namely relief, soil, temperature, water availability, geological structure or fire incidence [14,15,16,17,18,19,20,21]. Among these factors, the relief underneath a forest canopy is usually not changed significantly by anthropic activities and is therefore thought to be very effective for the construction of predictive models for the estimation of vegetation characteristics in natural areas. Moreover, relief may affect other equally important factors (e.g. soil characteristics); hence, it may have direct or indirect effects on vegetation.
In the present study relief was represented by local geomorphometric characteristics, such as elevation, slope gradient and slope aspect, which strongly affect vegetation structure [16,22,23]. Elevation (height of land relative to sea level) and its variation are usually related to local microclimate. Slope gradient and slope aspect determine the intensity and direction of flows of matter and insolation, respectively, and affect the local water and energy regimes. In addition to these variables, the basic set of characteristics for local land characterization should also include the plan and profile curvatures [22,24,25,26]. The plan curvature corresponds to the variation among the convex, straight and concave landforms of the terrain and the profile curvature corresponds to the variation among the divergent, neutral and convergent landforms of the terrain [27]. These two variables combined characterize the land surface, which is directly associated with hydrological and transport properties and may influence vegetation indirectly [26,28].
Several studies have analysed the influence of environmental variability on spatial differentiation of vegetation. However, few have rigorously tested the effects of local geomorphometric variation on the structural characteristics of vegetation [22,29,30], such as BA, H and CO, especially when the relief descriptor variables are derived from remote sensing data. The topography can affect the vegetation structure both directly and indirectly. For example, Webb et al. [31] analysed the direct effects of topography on forest structure in an area of American Samoa and observed higher tree density in ridge forests, with trees of wide diameter and low height at higher altitudes (usually in ridges). In contrast, relief also influences other variables that are important for vegetation, such as humidity, nutrient availability, solar radiation, temperature and soil characteristics, which may have indirect effects on vegetation [21,22,32,33,34,35].
As geomorphometric variables are available from satellite missions such as the Shuttle Radar Topography Mission (SRTM), they offer a methodological approach for indirect estimation and mapping of forest structural variables at larger scales, if robust relationships between the two are confirmed. The magnitude of the uncertainties in forest structure maps from geomorphometric proxies needs to be carefully quantified. Thus, because relief characteristics are related to vegetation structure [29,36], the goal of the present study was to evaluate topographic characteristics through measurement of local geomorphometric variables as predictors of structural characteristics of vegetation (BA, H and CO). Our hypothesis is that geomorphometric variables are good predictors of the vegetation structure of a primary tropical forest, even in an area with low altitude variation. If this relationship is confirmed, models will be built and used to generate maps.

Study area
The study area belongs to the Tapajós National Forest (TNF), between coordinates 2°35' to 4°2 0'S and 54°40' to 55°40' W, located in the Lower Amazon River mesoregion, Western of Pará State, Brazil (Fig 1). The TNF is an area of environmental protection with approximately 545.000 ha representative of Amazon Forest.
The region's climate is Ami type, according to the Köppen climate classification, with approximately 25°C annual average temperature, 1800 mm annual average rainfall, and higher rainfall concentration between January and May [37]. The predominant soil types at the area are Yellow Oxisol and Red-Yellow Ultisol. The vegetation is mostly classified as Dense Ombrophilous Forest and Open Ombrophilous Forest [38].

Field data
A survey of the forest structure was performed in 43 plots with 25 x 100 m 2 , in a total area of 10.75 ha. In the present study, the surveyed plots presented different topographic positions and phyto-ecological classes, defined according to floristic, lithological and geomorphological characteristics of the TNF [39].
The vegetation typology in this area consist mostly of dense tropical rain forest and a small portion of open tropical rain forest in the southwest area of TNF. Within the 10.75 ha total sampled area, the elevation varies between 81 and 218 m, whereas slope varies from 2 to 27%. In total, 4163 arboreal individuals (49 plant families and 232 species) were found during the field work. Bispo et al. [28] in their study about the effects of the geomorphometric characteristics of the local terrain on floristic composition in the central Brazilian Amazon showed more details about the floristic of the study area.
The geographic position of diameter at breast height (DBH) and total height (H) were measured for all tree individuals with DBH ! 10 cm in each plot. Averages diameter at breast height, averages at height and canopy openness (CO) were therefore recorded for all plots. Tree basal area (BA) was calculated using DBH according to the following equation: Tree height (H) was visually assessed by a trained specialist. To improve visual estimates, these data were adjusted using the equation developed by Gonçalves et al. [40], which was based on the measurement of 277 trees at the study area and relates visual assessment of tree height with measurements performed using an electronic clinometer: Hce ¼ e 0:1845 Hev 0:9480 Where Hce represent the tree height measured using an electronic clinometer and Hev represent tree height visually assessed.
DBH was measured using a measuring tape. CO was estimated from hemispherical images obtained using a digital camera with 10.2 Megapixel resolution (Nikon D60) coupled to a wide-angle lens (fisheye) (Soligor 0.25x52 mm). The images were captured 1.8 m from the ground, with a 90 degree angle relative to the ground, under the canopy and avoiding direct sunlight, i.e., under conditions of uniform sky before sunrise (from 6:30 to 8:30 am) or after sunset (3.30 to 5:00 pm). These images were taken along central axis transects (25 x 100 m), every 20 m (following the same method used by Galvão et al. [41]), totaling in 5 images per transect. Images were analysed using Gap Light Analyser software (GLA) [42], and the fraction of canopy openness (Gap fraction) was calculated. Gap fraction is defined as the percentage of canopy gaps, i.e., the probability of sunlight passing through the forest canopy without meeting leaves or another plant part [43]. The GLA package uses an image classifier based on pixel thresholds to calculate the fraction of canopy openness. Therefore, a threshold needs to be defined for each image, which can vary with sun exposure. The images of the canopy and sky were adequately converted and separated into black and white pixels. The resulting images in binary format were used for determining the fraction of canopy openness (Fig 2).
To facilitate data integration, analysis and to enable the performance of spatial queries, the dendrometric parameter data were structured using the Geographic Information System (GIS).

Geomorphometric data derived from SRTM/TOPODATA
The geomorphometric data used (i.e., elevation, slope gradient, slope aspect, plan and profile curvature) were obtained from the Brazilian Geomorphometric Data Bank (TOPODATA) [44]. For the analysis, we used the TOPODATA that are based on the SRTM (Shuttle Radar Topography Mission-version 1, NASA, 2006) () refined for the Brazilian territory from the original resolution of 3 arc seconds to 1 arc second using a geostatistical approach [45]. Different neighborhood operations were applied to calculate geomorphometric variables [46].
There was a preference to use TOPODATA instead of the original SRTM. as it includes the derivations calculated with algorithms specially developed for the SRTM characteristics, which were widely tested on varied reliefs in order to run uniformly on the full extension of the Brazilian territory [46] In addition, TOPODATA is also free and their layers are easily accessible for their use in GIS (http://www.dsr.inpe.br/topodata/acesso.php). The geomorphometric variables obtained from TOPODATA are presented in Table 1. Subsequently, values for the

Processing of information layers, data extraction and pairing
Geomorphometric data layers included elevation, slope, aspect and profile and plan curvatures, as described. Since aspect is a circular variable, its linear components given by sine/cosine transformation were used for regression analyses instead.
Local geomorphometric data were obtained from the TOPODATA in GeoTIFF format, from the elevation image (Digital Elevation Model-DEM refined) and its derivations. For each variable, the grids (1°x 1.5°each) covering the study area were concatenated into an image mosaic.
The two datasets (vegetation and geomorphometric data) were overlaid after matching their respective georeference systems by reprojecting the raster images and plot polygons to a common projection. Geomorphometric data were averaged within plot extents and extracted in correspondence to vegetation data, resulting in paired datasets for further analysis in statistical software environments.

Statistical modeling of structural data
Predictive models were generated for each variable (BA, H and CO) based on local geomorphometric variables using multiple regression analyses [47]. Independent variables were selected using the forward stepwise selection procedure. A preliminary analysis showed the presence of four outliers in the observation group. These outliers were excluded from the analyses, and a new evaluation using Cook's distance was performed, showing the absence of additional Slope gradient (G) Inclination angle of the local surface; has a direct effect on the balance between soil water infiltration and surface runoff and controls the intensity of flows of matter and insolation. This set of factors results in environments with different physical and biological characteristics, allowing the establishment of different types of vegetation.
Slope aspect (A) Terrain alignment relative to the sun; it is the horizontal angle relative to the expected direction of surface runoff, usually expressed in azimuth. Among several aspects (i.e., relationships with the distribution of different substrates, ecological niches, etc.), this variable is related with the degree of shade or light in the terrain, selecting more appropriate environments for the development of certain types of vegetation. The slope aspect corresponds to the angle from 0°to 360°. Since aspect is a circular variable, it was converted in two linear components given by sine/cosine transformation generating two new variables Sine A (Sine of slope aspect) and Cosine A (Cosine of the slope aspect) which were used for regression analyses instead.
Profile curvature (kv) Concave/convex character of the terrain. This characterizes the land surface, which is directly associated with hydrological and transport properties and may influence the distribution and development of vegetation indirectly.
Plan curvature (kh) Divergent/convergent character of flows of matter on the ground when analysed on a horizontal projection. As the profile curvature, the plan curvature characterizes the land surface, which is directly associated with hydrological and transport properties and may influence vegetation indirectly. doi:10.1371/journal.pone.0152009.t001 outliers. Fitting of regression models and subsequent steps were performed following the recommendation that the total number of observations should be 10 times the number of variables [47]. In the present study, this result corresponded to a maximum of 3 explanatory variables. Analyses were performed to ensure that there were no violations of the assumptions of normality and homoscedasticity, and of no spatial autocorrelation. The presence of multicollinearity was tested using the variance inflation factor (VIF) and the autocorrelation was tested using Moran's Index [48]. The models were built using thirty observations and were validated using nine independent observations. Model validation was performed by comparing observed and model-predicted values using the RMSE (Root Mean Square Error) [47] and the coefficient of determination (r 2 ).

Results
The models obtained were statistically significant. The data analysed showed VIF < 10, indicating the absence of serious problems with multicollinearity [47] ( Table 2). The residuals showed homogeneity of variance (Levene test, p>0.05) and did not present spatial autocorrelation for any distance class according to Moran's Index (p>0.05). The plot of residuals against predicted values showed that the linearity and homoscedasticity assumptions were met.
The model for estimating BA presented explained 68% of the variance in basal area (r 2 = 0.68, Adjusted r 2 = 0.66, F 2.27 = 29.809, p<0.001 and Std. Error of Estimate = 4.1207) and included the explanatory variables elevation and slope gradient ( Table 1). The model of H, despite being significant, explained only 20% of the variance (r 2 = 0.20, Adjusted r 2 = 0.14, F 2.27 = 3.44, p<0.047 and Std. Error of estimate = 1.236) and included the variables profile curvature and slope gradient ( Table 1).The model of CO presented explained 60% of the variance (r 2 = 0.60, Adjusted r 2 = 0.56; F 3.26 = 13.46, p<0.001 and Std. Error of estimate = 3.088) and included the variables elevation, cosine of the orientation slope, and profile curvature (Table 1).
Model validation with 9 independent sample plots showed an RMSE of 3.73 m 2 /ha (20%) for BA, 1.70 m (12%) for H, and 1.78% (21%) for CO relative to the average values. The Table 2. Multiple regression analysis of the relationships between BA (basal area), CO (canopy openness) and H (height) and local geomorphometric variables selected by stepwise method (h: elevation, Cosine A: Cosine of slope aspect, G: slope gradient and kv: profile curvature). As the coefficients β had p < 0.1 for all variables selected, thus all of them were included in equations models. Predictive Models of Forest Structure from Geomorphometric Variables coefficients of determination from the analyses between observed and predicted values were r 2 = 0.32 for CO, r 2 = 0.26 for H and r 2 = 0.52 for BA. Because the coefficient of determination of the modelled H with 30 samples was low and that one generated with the independent observations to this structural variable was not significant, we considered the obtained model inadequate for estimation of this variable. Following validation, the models for BA and CO were applied, and maps for these two variables were generated for TNF (Fig 3). The generated models and maps indicated that individuals with a higher basal area were located in areas with higher elevation and slope gradient. More closed canopies were associated with areas with higher elevation and slope aspect mostly facing South (lower slope aspect cosine) (Table 1, Fig 3).

Discussion
Our data have revealed that topographic relief is a good predictor of certain forest structural characteristics. Of the three geomorphometric characteristics measured, average basal area (BA) and average canopy openness (CO) were adequately predicted by relief, although comparing the results we can notice that the model generate by BA is adjusted better than CO. For average height (H), although the error was low (RMSE = 12%), the r 2 for the model was only approximately 26%, meaning that only a small proportion of the variability in H was explained by the model. In this case, considering the 9 independent validation sample plots, the coefficient of determination between observed and predicted H values was not significant. Therefore it can be considered that the multiple regression models were not satisfactory for this variable. In summary, we conclude that the relief variables used were only sufficient to adequately estimate BA and CO. In the case of H, including additional variables in the model, such as the type of soil, geological structure and water availability, may improve its predictive capacity.
The basal area estimates included the variables altitude and slope gradient, whether the canopy openness estimates included altitude, profile curvature and cosine of slope aspect. Of these variables, altitude may be considered the one variable with higher predictive capacity of vegetation structure, since its variation was highly consistent with the spatial variation of the basal area and canopy openness estimates. The estimates generated by the models revealed that sites presenting higher CO also presented tree individuals with lower BA and that higher CO and lower BA occurred in areas with lower altitude. Therefore, we suggest that altitude is the main predictor variable of spatial macro-variation of forest structure in the study area. The remaining variables included in the models (slope gradient, cosine of slope aspect and and profile curvature) seem to be related with minor components of this variation.
Relief characteristics, such as those evaluated through geomorphometric variables in the present study, have previously been considered as determinants of vegetation structure in forest ecosystems [28,29,36,49]. These characteristics may affect vegetation structure directly or indirectly, for example by affecting the control of temperature [50] and incident sunlight, general soil characteristics [51,52], flow of water and organic matter, and nutrient cycling [53]. Because the altitude variation in the TNF is low (approximately 300 m), the indirect effects of relief on soil characteristics must have been more important for determination of vegetation structure than the direct effects. The TNF topography seems to influence soil texture, with higher areas presenting predominance of clay and, consequently, higher humidity and carbon and nitrogen concentrations [54]. Moreover, the lower altitude terrains close to the Tapajós River are sandier [54]. Therefore the interaction between relief and soil is also likely to play an important part in determining the structural variation of vegetation within the TNF through edaphic processes that determine the soil catena.
Regions closer to the Tapajós River present Dystrophic Yellow Oxisol with medium texture, from clayey to sandy (dystrophic Entisols), dense forest/savanna transition, and flat to mildly hilly relief. Due to the dense forest/savanna transition, these areas present a slightly lower tree density. The models generated in the present study showed higher CO and individuals with lower BA in these areas. These forest characteristics can also be found in areas more to the southwest and extreme south of the TNF, with the occurrence of more open forests.
In contrast, TNF areas with higher altitude generally presented higher BA and lower CO. As previously stated, these areas present more clayey soils and, consequently, higher carbon and nitrogen concentrations, which enables higher vegetation development. These results are corroborated by Castilho et al. [49], who studied the Adolpho Ducke Forest Reserve, located in the Central Amazon, and observed positive correlations between forest biomass and the soil texture gradient (clay concentration) and altitude. These authors observed that the biomass of larger trees tends to be higher in clay-rich soils (more frequent in flat areas with higher altitudes) than in more sandy soils (low altitudes and bottom lands). In this area of the Central Amazon, the soil clay content is highly related to altitude (r = 0.94). As the clay content decreases, the sand content increases along the terrain succession from plateaus to valleys [55].
The relationships discussed above show that most effects of relief on vegetation can be explained by their relationships with the soil type, especially when different altitudes are compared. Different soil types present important nutritional differences, which could partly explain the vegetation structure variation. For example, in general, greater forest structure or biomass is expected in more fertile soils, independently of the species composition, simply because there are more resources available to support plant growth [56]. These factors could explain the observed higher basal area and lower canopy openness in sites with higher altitude, where soil tends to be more clayey in the TNF. Furthermore, different geological substrates are disposed in strata according to altitude, established during the sedimentation process, which could partly explain the relationship between soil and elevation.
The intensity of the impact of human activities on vegetation structure can be evaluated by comparing pre-existing and current vegetation [57,58]. Considering the complexity of the Amazon tropical forest, developing models that can predict expected vegetation based on reference conditions (natural environments or with minimum impact) is a challenge, especially with the goal of large-scale mapping. Therefore, we suggest the construction of more models of this type, following an approach of compartmentalization according to environmental mesoregions. These models are more useful if they are based on predictive variables that are little impacted by anthropogenic activities, such as relief, allowing the prediction of forest structure prior to disturbance. The present study shows that the use of relief variables results in good predictive models of the tropical forest structure, especially for canopy openness and tree diameter characteristics in the case of the FNT, allowing the elaboration of maps with acceptable uncertainty intervals ofabout 20%. In the Tapajós National Forest the uncertainty levels found for modeling biomass, forest height and volume has ranged from 10-25% [32,43,59]. In our study, for the canopy openness the level of uncertainty found was 21% and to the basal area was 20%.
Future steps to refine this variable integration for other Amazon mesoregions are needed to increase knowledge and to supply tools to help with control and supervision policies for the environmental monitoring organs of this region. The availability of satellite-derived global digital terrain models such as SRTM and the TanDEM-X DEM, which is currently in production, are a valuable resource for the indirect mapping of forest structural parameters in primary tropical forest. These maps will be useful for improving regional or global monitoring of forest structure and biodiversity in Amazon Biome.

Authorization for the field work
The study was carried out in the Tapajós National Forest (TNF) and dendrometric measurements (diameter at breast height and height) as well as botanical identification of the trees were done, just inside of this area. The authorization to carry out the field work at TNF was provided by the Instituto Chico Mendes de Conservação da Biodiversidade-ICMBio/MMA (SISBIO n. 20591-1). This study did not involve endangered or protected species and no biological samples were taken.