The importance of classification modeling based on physiological characteristics
From an evolutionary perspective, most orchids are still actively evolving and specializing, and the family is generally regarded as a flagship group for biodiversity conservation (Luo et al., 2003). Orchid diversity hotspots have been shown to coincide with the distribution centers of other taxa (Anderson et al., 2008; Gaskett & Gallagher, 2018; Seaton et al., 2010). Consequently, analyzing the geographical distribution of orchids with SDMs makes it possible to understand fundamental regional distribution patterns and to identify conservation priorities within biodiversity hotspots (Crain & Fernandez, 2020; Souza Rocha & Luiz Waechter, 2010; Xing & Ree, 2017). SDMs are mathematical models built from occurrence records of the target species and from environmental data. They estimate the ecological niche requirements of a species from the statistical information provided by the sampling sites, and project these requirements onto a specific spatial and temporal extent to express the degree of habitat preference in probabilistic form (Araújo & Guisan, 2006; Dyderski et al., 2018; Elith & Leathwick, 2009; Guillera-Arroita et al., 2015; Guisan & Thuiller, 2005; Guo et al., 2020; Ranc et al., 2017). The model output therefore reflects the distribution of suitable habitat for the species. However, the orchid family occupies a wide range of ecological conditions (Souza Rocha & Luiz Waechter, 2010) and shows marked physiological differences among life forms (McCormick & Jacquemyn, 2014; Zhang et al., 2018). From the statistical point of view of SDMs, feeding orchid occurrences directly into the models without any pretreatment broadens the environmental information provided by the sampling sites and may yield an inaccurate and coarse estimate of the ecological requirements of orchids, thereby reducing model accuracy and distorting the suitability maps.
This expectation was confirmed in this study. We adopted different modeling strategies and verification methods to test the effect of physiological characteristics on orchid SDMs. The results indicate that model accuracy improves significantly when physiological features are explicitly accounted for, especially for epiphytic and mycoheterotrophic orchids. A possible explanation is that the environmental relationships and dependencies of these two types are better represented when they are modeled separately. A second observation supports the conjecture that modeling orchids without pretreatment may erroneously broaden the estimated ecological niche requirements: in most of our model experiments, the suitable area predicted by the unclassified models tended to be larger than that predicted by the classification models.
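The broadening effect of pooled occurrences can be illustrated with a minimal sketch. It is not the modeling procedure used in this study: it assumes a simple rectilinear environmental envelope in place of a fitted SDM, and the two life-form groups, their temperature and precipitation values, and the landscape cells are all synthetic and hypothetical. It only shows why an envelope fitted to pooled records can never be smaller than the union of envelopes fitted to each group separately.

```python
import random


def envelope(points):
    """Per-variable (min, max) bounds of a set of environmental samples."""
    return [(min(values), max(values)) for values in zip(*points)]


def inside(cell, bounds):
    """True if every variable of the cell lies within its bounds."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(cell, bounds))


def suitable_fraction(cells, envelopes):
    """Share of cells that fall inside at least one envelope."""
    hits = sum(any(inside(cell, b) for b in envelopes) for cell in cells)
    return hits / len(cells)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical occurrence sites as (temperature in deg C, precipitation in mm)
# for two life forms with clearly different environmental preferences.
group_a = [(rng.uniform(20, 26), rng.uniform(1500, 2500)) for _ in range(50)]
group_b = [(rng.uniform(8, 14), rng.uniform(600, 1200)) for _ in range(50)]

# Hypothetical landscape cells described by the same two variables.
cells = [(rng.uniform(0, 30), rng.uniform(200, 3000)) for _ in range(5000)]

# "Unclassified" strategy: one envelope fitted to all records pooled together.
pooled = suitable_fraction(cells, [envelope(group_a + group_b)])

# "Classified" strategy: one envelope per life form, combined afterwards.
classified = suitable_fraction(cells, [envelope(group_a), envelope(group_b)])

print(f"pooled: {pooled:.3f}, classified: {classified:.3f}")
```

In this toy setting the pooled envelope also covers the environmental gap between the two groups, where neither group was actually recorded, so its suitable share of the landscape is inflated relative to the classified strategy.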
Uncertainty in species distribution data is a factor that affects SDMs; it commonly includes uncertainty in the location of species occurrences, incomplete sampling, and selection bias (Guo et al., 2020). In this study, we identify a further source of model uncertainty: neglecting to pretreat the occurrence data of target species that have inherent physiological differences. This is not limited to orchids. Matching species occurrences more precisely with environmental information is also essential for other species with distinct ecological preferences, a practice that is more common in dynamic SDM studies of migratory marine animals (El-Gabbas et al., 2021). We emphasize that, when SDMs are used to predict suitable habitat for a target species, pre-processing the occurrence data to account for the physiological differences embedded in them is a necessary and efficient way to improve model performance, in addition to optimizing the model structure, tuning the model parameters, and increasing the spatial resolution of the environmental data.