Comparing Generalized Linear Models and random forest to model vascular plant species richness using LiDAR data in a natural forest in central Chile
Abstract
Biodiversity is considered an essential element of the Earth system, driving important ecosystem services. However, conserving biodiversity in a quickly changing world is a challenging task that requires cost-efficient and precise monitoring systems. In the present study, the suitability of airborne discrete-return LiDAR data for mapping vascular plant species richness within a Sub-Mediterranean second-growth native forest ecosystem was examined. The vascular plant richness of four different layers (total, tree, shrub and herb richness) was modeled using twelve LiDAR-derived variables. As species richness values are typically count data, the corresponding asymmetry and heteroscedasticity in the error distribution have to be considered. In this context, we compared the suitability of random forest (RF) and a Generalized Linear Model (GLM) with a negative binomial error distribution. Both models were coupled with a feature selection approach to identify the most relevant LiDAR predictors and keep the models parsimonious. The results of RF and GLM agreed that the three most important predictors for all four layers were altitude above sea level, standard deviation of slope and mean canopy height. This is consistent with the assumption that LiDAR is suited to estimating species richness because it captures three types of information: micro-topographical, macro-topographical and canopy structural. Generalized Linear Models showed higher performances (r²: 0.66, 0.50, 0.52, 0.50; nRMSE: 16.29%, 19.08%, 17.89%, 21.31% for total, tree, shrub and herb richness, respectively) than RF (r²: 0.55, 0.33, 0.45, 0.46; nRMSE: 18.30%, 21.90%, 18.95%, 21.00% for total, tree, shrub and herb richness, respectively). Furthermore, the best GLMs were more parsimonious (three predictors) and less biased than the best RF models (twelve predictors).
We think that this is due to the asymmetric error distribution of the species richness values noted above, which RF is unable to capture properly.
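The asymmetry and heteroscedasticity mentioned above can be illustrated with a small sketch. It assumes the common NB2 parameterization of the negative binomial distribution, in which the variance is mu + alpha * mu^2; the abstract does not state which parameterization the study used, and the function names and the values of `mu` and `alpha` below are purely illustrative.

```python
import math

def nb_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability of count k.

    Assumed parameterization: mean mu, dispersion alpha,
    so that variance = mu + alpha * mu**2.
    """
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def nb_moments(mu, alpha, k_max=2000):
    """Mean, variance and skewness obtained by summing the pmf up to k_max."""
    probs = [nb_pmf(k, mu, alpha) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    third = sum((k - mean) ** 3 * p for k, p in enumerate(probs))
    return mean, var, third / var ** 1.5

if __name__ == "__main__":
    # Illustrative values only, not taken from the study.
    for mu in (5.0, 10.0, 20.0):
        mean, var, skew = nb_moments(mu, alpha=0.2)
        print(f"mu={mu:5.1f}  variance={var:7.2f}  skewness={skew:5.2f}")
```

Under this assumption the variance grows with the mean (heteroscedasticity) and the skewness stays positive (asymmetry), which are the two properties a negative binomial GLM models explicitly.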
From an ecological perspective, the predicted patterns agreed well with the known vegetation composition of the area. We found especially high species numbers at low elevations and along riversides. In these areas, the distributions of thermophilous sclerophyllous species, water-demanding Valdivian evergreen species and species growing in Nothofagus obliqua forests overlap.
The three main conclusions of the study are: 1) appropriate model selection is crucial when working with biodiversity count data; 2) the application of RF for data with non-symmetric error distributions is questionable; and 3) structural and topographic information derived from LiDAR data is useful for predicting local plant species richness.
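Model comparisons of the kind reported in the abstract rest on two figures per model, r² and nRMSE. A minimal sketch of how they might be computed follows. It assumes that r² is the squared Pearson correlation between observed and predicted richness and that nRMSE is the RMSE divided by the observed range; the abstract does not define either, so both choices are assumptions, and the plot values are invented for illustration.

```python
import math

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

def nrmse_percent(observed, predicted):
    """RMSE normalized by the observed range, in percent (assumed normalization)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (max(observed) - min(observed))

if __name__ == "__main__":
    # Hypothetical species counts per plot, not data from the study.
    observed = [10, 20, 30, 40]
    predicted = [12, 18, 33, 37]
    print(f"r2    = {r_squared(observed, predicted):.3f}")
    print(f"nRMSE = {nrmse_percent(observed, predicted):.2f}%")
```

Other normalizations (for example by the observed mean) would give different nRMSE values, so the definition should be fixed before comparing models.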
Source
Remote Sensing of Environment 173 (2016) 200–210. DOI: 10.1016/j.rse.2015.11.029
https://repositorio.uchile.cl/handle/2250/138507
Document not available in digital format. Consult at the INFOR library: Contact

Related items
Showing items related by title, author, creator and subject.
- Comparison of Airborne LiDAR and Satellite Hyperspectral Remote Sensing to Estimate Vascular Plant ...
Ceballos, Andrés; Hernández Palma, Héctor; Corvalán Vera, Carlos; Galleguillos Torres, Mauricio (MDPI AG, 2015) The Andes foothills of central Chile are characterized by high levels of floristic diversity in a scenario, which offers little protection by public protected areas. Knowledge of the spatial ...
- Conceptos y métodos para identificación y delimitación de árboles para Pino radiata a partir de imágenes ...
Hernández Palma, Hector Jaime ([s.n.], 2004)
- Geographic patterns of vascular plant diversity and endemism using different taxonomic and spatial units
Luebert Bruron, Federico José; Fuentes Castillo, Taryn; Pliscoff Varas, Patricio; García Berguecio, Nicolás; Román Ayo, María José; Vera Aravena, Diego; Scherson Vicencio, Rosa (MDPI, 2022) Estimation of biodiversity patterns in poorly known areas is hampered by data availability and biased collecting efforts. To overcome the former, patterns can be estimated at higher taxonomic ...