Abstract: The Amazon droughts of 2005 and 2010 have raised serious concern about the future of the rainforest. Amazon forests are crucial because of their role as the largest carbon sink in the world, so a decrease in their photosynthetic activity would affect global warming. In particular, after plant growth declined over 1.68 million km² of forest during the once-in-a-century severe drought of 2010, it is of primary importance to understand the relationship between different climatic variables and vegetation. In an earlier study, we showed that non-linear models capture the dynamics relating vegetation to climate variables such as temperature and precipitation better than linear models do. In this research, we learn precise models relating vegetation to climatic variables (temperature, precipitation) under normal conditions in the Amazon region using genetic programming based symbolic regression. This is done by removing high-elevation and drought-affected areas and by considering the slope of the region as one of the important factors while building the model. The learned model reveals new and interesting ways in which historical and current climate variables affect the vegetation at any location. MAIAC data has been used as a vegetation surrogate in our study. For precipitation and temperature, we used the TRMM and MODIS Land Surface Temperature data sets, respectively, while learning the non-linear regression model. However, to generalize the model and make it independent of the data source, we perform transfer learning: we use regularized least squares to re-learn the parameters of the non-linear model from other data sources, such as the precipitation and temperature from the Climatic Research Unit (CRU). This new model is very similar in structure and performance to the originally learned model and verifies the same claims about the nature of the dependency between these climate variables and the vegetation in the Amazon region.
As a result of this study, we are able to learn, for the very first time, exactly how different climate factors influence vegetation at any location in the Amazon rainforest, independent of the specific sources from which the data were obtained.
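The transfer-learning step described above (keeping the model structure fixed and re-estimating its parameters by regularized least squares on a second data source) can be sketched as follows. This is only an illustration under stated assumptions: the basis terms in `basis_terms`, the synthetic data, and the value of `lam` are all hypothetical and are not taken from the study; the sketch also assumes the learned model is linear in the parameters being refitted.

```python
import numpy as np

def ridge_fit(features, target, lam=1e-3):
    """Closed-form regularized least squares: solve (F^T F + lam*I) w = F^T y."""
    n_terms = features.shape[1]
    gram = features.T @ features + lam * np.eye(n_terms)
    return np.linalg.solve(gram, features.T @ target)

def basis_terms(temp, precip):
    """Hypothetical fixed model structure: non-linear terms, linear in the coefficients."""
    return np.column_stack([
        np.ones_like(temp),   # intercept
        temp * precip,        # interaction term
        np.sqrt(precip),      # saturating response to precipitation
        temp ** 2,            # quadratic response to temperature
    ])

# Synthetic stand-in for a second data source (inputs normalized to [0, 1]).
rng = np.random.default_rng(0)
temp = rng.uniform(0.0, 1.0, size=200)
precip = rng.uniform(0.0, 1.0, size=200)
true_coeffs = np.array([0.3, 0.5, 0.2, -0.4])   # assumed values, for illustration only
features = basis_terms(temp, precip)
veg = features @ true_coeffs + rng.normal(0.0, 0.01, size=200)

# Refit only the coefficients; the structure of the model stays unchanged.
coeffs = ridge_fit(features, veg, lam=1e-3)
rmse = float(np.sqrt(np.mean((features @ coeffs - veg) ** 2)))
```

Because only the coefficients are re-estimated, the refitted model can be compared term by term with the original one, which is what makes a statement about "similar structure" checkable.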
Abstract: In recent years, a number of methods have been proposed that attempt to improve the performance of genetic programming by exploiting information about program semantics. One of the most important developments in this area is semantic backpropagation. The key idea of this method is to decompose a program into two parts, a subprogram and a context, and calculate the desired semantics of the subprogram that would make the entire program correct, assuming that the context remains unchanged. In this paper we introduce Forward Propagation Mutation, a novel operator that relies on the opposite assumption: instead of preserving the context, it retains the subprogram and attempts to place it in the semantically right context. We empirically compare the performance of semantic backpropagation and forward propagation operators on a set of symbolic regression benchmarks. The experimental results demonstrate that semantic forward propagation produces smaller programs that achieve significantly higher generalization performance.
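The decomposition used by semantic backpropagation can be illustrated with a small worked example: given a target output vector and a fixed context, each operation on the path from the root to the subprogram is inverted to obtain the desired semantics of the subprogram. The sketch below is a simplification under stated assumptions: the `invert` helper, the operator names, and the example program are hypothetical, and ambiguous or undefined inversions (for example, multiplication by zero) are not handled. It does not implement Forward Propagation Mutation.

```python
import numpy as np

def invert(op, desired_out, other_arg):
    """Desired value of the unknown argument, given the desired output of `op`."""
    if op == "add":        # out = unknown + other
        return desired_out - other_arg
    if op == "sub_left":   # out = unknown - other
        return desired_out + other_arg
    if op == "mul":        # out = unknown * other (assumes other != 0)
        return desired_out / other_arg
    raise ValueError(f"no inversion rule for {op!r}")

x = np.array([1.0, 2.0, 3.0])
target = x ** 2 + 2 * x    # target semantics: outputs on the training inputs

# Program: (sub + x) * x, so the context is ([hole] + x) * x.
# Walk from the root down to the hole, inverting one operation at a time.
desired = invert("mul", target, x)    # the hole's parent must output x + 2
desired = invert("add", desired, x)   # the subprogram must output 2 everywhere
```

Any subprogram whose outputs equal `desired` on these inputs (here, the constant 2) makes the whole program match the target while the context stays unchanged.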
Abstract: Maintaining population diversity has long been considered fundamental to the effectiveness of evolutionary algorithms. Recently, with the advent of novelty search, there has been increasing interest in sustaining behavioral diversity by using fitness and behavioral novelty as separate search objectives. However, since the novelty objective explicitly rewards diverging from other individuals, it can antagonize the original fitness objective, which rewards convergence toward the solution(s). As a result, fostering behavioral diversity may prevent proper exploitation of the most interesting regions of the behavioral space and thus adversely affect the overall search performance. In this paper, we argue that an antagonism between behavioral diversity and fitness can indeed exist in semantic genetic programming applied to symbolic regression. Minimizing error draws individuals toward the target semantics, but promoting novelty, defined as a distance in the semantic space, scatters them away from it. We introduce a less conflicting novelty metric, defined as an angular distance between two program semantics with respect to the target semantics. The experimental results show that this metric, in contrast to the other diversity-promoting objectives considered, consistently improves the performance of genetic programming regardless of whether it employs a syntactic or a semantic search operator.
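One plausible reading of the angular metric is the angle, measured at the target, between the two vectors that point from the target semantics to each program's semantics. The sketch below follows that reading; the function name `angular_distance` and the convention of returning 0 when a program already matches the target are assumptions, not details given in the abstract.

```python
import numpy as np

def angular_distance(sem_a, sem_b, target):
    """Angle (radians) at `target` between the directions to sem_a and sem_b."""
    u = np.asarray(sem_a, dtype=float) - np.asarray(target, dtype=float)
    v = np.asarray(sem_b, dtype=float) - np.asarray(target, dtype=float)
    norm_u, norm_v = np.linalg.norm(u), np.linalg.norm(v)
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention: the angle is undefined at the target itself
    cosine = np.clip(u @ v / (norm_u * norm_v), -1.0, 1.0)
    return float(np.arccos(cosine))

target = np.array([0.0, 0.0])
same_direction = angular_distance([1.0, 0.0], [2.0, 0.0], target)   # same side of the target
orthogonal = angular_distance([1.0, 0.0], [0.0, 1.0], target)       # perpendicular errors
opposite = angular_distance([1.0, 0.0], [-1.0, 0.0], target)        # opposite sides
```

Under this definition two programs can differ in angle while both moving closer to the target, so rewarding angular diversity does not by itself push individuals away from the target, unlike a plain distance in semantic space.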
Abstract: Both short-term (weather) and long-term (climate) variations in the atmosphere directly impact various ecosystems on Earth. Forest ecosystems, especially tropical forests, are crucial because they hold the largest reserves of terrestrial carbon. For example, the Amazon forests are a critical component of the global carbon cycle, storing about 100 billion tons of carbon in their woody biomass. There is growing concern that these forests could succumb to precipitation reduction in a progressively warming climate, leading to the release of a significant amount of carbon into the atmosphere. Therefore, there is a need to accurately quantify the dependence of vegetation growth on different climate variables and to obtain better estimates of drought-induced changes to atmospheric CO2. The availability of globally consistent climate and earth observation datasets has allowed global-scale monitoring of various climate and vegetation variables such as precipitation, radiation, and surface greenness. Using these diverse datasets, we aim to quantify the magnitude and extent of ecosystem exposure, sensitivity, and resilience to droughts in forests. The Amazon rainforests have undergone severe droughts twice in the last decade (2005 and 2010), which makes them an ideal candidate for regional-scale analysis. Current studies on vegetation and climate relationships have mostly explored linear dependence due to computational and domain knowledge constraints. We explore a modeling technique called symbolic regression, based on evolutionary computation, that allows discovery of the dependency structure without any prior assumptions. In symbolic regression, the population of possible solutions is defined via tree structures. Each tree represents a mathematical expression built from a pre-defined function set (mathematical operators) and terminal set (independent variables from the data). Selection of these sets is critical to computational efficiency and model accuracy.
In this work we investigate appropriate function and terminal set choices for the symbolic regression based modeling of the effects of climate on Amazon vegetation. Additionally, we compare the predictive capability of the symbolic regression based model to baseline techniques such as linear regularized regression and support vector regression.
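The tree representation described above can be sketched in a few lines: each candidate model is an expression tree whose internal nodes come from the function set and whose leaves come from the terminal set. The particular sets chosen here (`add`, `sub`, `mul`; `temp`, `precip`), the nested-tuple encoding, and the helper names are illustrative assumptions, not the choices investigated in the work.

```python
import operator
import random

# Hypothetical function set (binary operators) and terminal set (input variables).
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["temp", "precip"]

def random_tree(rng, depth):
    """Grow a random expression tree as nested tuples: (function, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, env):
    """Evaluate a tree on one data point; `env` maps terminal names to values."""
    if isinstance(tree, str):
        return env[tree]
    name, left, right = tree
    return FUNCTIONS[name](evaluate(left, env), evaluate(right, env))

# The expression temp * precip + temp, written as a tree.
example = ("add", ("mul", "temp", "precip"), "temp")
value = evaluate(example, {"temp": 2.0, "precip": 3.0})   # 2*3 + 2

# One member of an initial population, drawn with a fixed seed.
candidate = random_tree(random.Random(0), depth=3)
candidate_value = evaluate(candidate, {"temp": 2.0, "precip": 3.0})
```

Enlarging either set widens the space of expressions the search can reach but also the number of candidate trees at every depth, which is the trade-off between model accuracy and computational cost that the abstract refers to.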