Abstract: Exploratory data analysis on physical, chemical and biological data from sediments and water in Lake Champlain reveals a strong relationship between cyanobacteria, sediment anoxia, and the ratio of dissolved nitrogen to soluble reactive phosphorus. Physical, chemical and biological parameters of lake sediment and water were measured between 2007 and 2009. Cluster analysis using a self-organizing artificial neural network, expert opinion and discriminant analysis separated the dataset into no-bloom and bloom groups. Clustering was based on similarities in water and sediment chemistry and non-cyanobacteria phytoplankton abundance. Our analysis focused on the contribution of individual parameters to discriminating between no-bloom and bloom groupings. Application to a second, more spatially diverse dataset revealed similar no-bloom and bloom discrimination. However, a few samples possess all the physico-chemical characteristics of a bloom without the high cyanobacteria cell counts, suggesting that while specific environmental conditions can support a bloom, another environmental trigger may be required to initiate it. Results highlight the conditions coincident with cyanobacteria blooms in Missisquoi Bay of Lake Champlain, and indicate that additional data are needed to identify possible ecological contributors to bloom initiation.
Abstract: We present a clustering methodology that distinguishes management zones in a landfill leachate contaminated groundwater aquifer using only microbiological data for input rather than traditional physicochemical information. The self-organizing map (SOM), an artificial neural network (ANN), is commonly used as a K-means clustering method. The method outperforms many traditional clustering methods on noisy datasets (e.g. high dispersion, outliers, non-uniform cluster densities) and is appropriate when combining the multiple correlated and auto-correlated data associated with most hydrochemical research. We applied an SOM to a set of genome-based microbial community profiles created using terminal restriction fragment length polymorphism (T-RFLP) of the 16S rRNA gene sampled from groundwater monitoring wells in an aquifer contaminated with landfill leachate. We modified the existing algorithm to allow weighting of the input variables according to their relative importance, and added a post-processing radial basis function to estimate group membership between measurement locations auto-correlated in space. We statistically tested the SOM output clusters using a nonparametric MANOVA to identify an optimal number of clusters. The SOM methodology distinguished between tiers of contamination in this multi-contaminant environment using expert knowledge to guide data preprocessing and to weight the input variables. Results showed a composite delineation representative of overall groundwater contamination at the landfill based only on microbiological information. Using a small number of clusters, the SOM distinguished between background and leachate-contaminated sampling locations, whereas with a larger number of clusters it grouped locations across a gradient of groundwater contamination.
The landfill leachate application demonstrates that microbial community data can complement standard analytical analyses for the purpose of delineating spatial zones of groundwater contamination. The success of this research is attributed to communication between the computational and biological scientists. This ensured that the essential nature of the dataset was preserved throughout the computational transformations and that the methodology was optimized for the application.
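As an illustration of the input-variable weighting described in the abstract above, the following is a minimal sketch of a SOM whose best-matching-unit search uses a weighted squared Euclidean distance. It is not the authors' implementation: the function names, the weighted-distance form, the linear decay schedules and the default parameters are all assumptions chosen for brevity, and the radial basis function post-processing and nonparametric MANOVA steps are not shown.

```python
import numpy as np

def train_weighted_som(data, var_weights, grid=(3, 3), n_iter=500,
                       lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM; var_weights scales each variable's share of the distance.

    Hypothetical helper for illustration only (not from the source).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    var_weights = np.asarray(var_weights, dtype=float)
    rows, cols = grid
    # Grid coordinates of each map node, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)],
                      dtype=float)
    # Initialise node codebook vectors from randomly chosen samples.
    codebook = data[rng.integers(0, len(data), size=rows * cols)].copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3   # shrinking neighbourhood radius
        x = data[rng.integers(0, len(data))]   # one randomly drawn sample
        # Weighted squared Euclidean distance from the sample to every node.
        d2 = ((codebook - x) ** 2 * var_weights).sum(axis=1)
        bmu = int(np.argmin(d2))
        # Gaussian neighbourhood around the best-matching unit on the grid.
        g = np.exp(-((coords - coords[bmu]) ** 2).sum(axis=1)
                   / (2.0 * sigma ** 2))
        codebook += lr * g[:, None] * (x - codebook)
    return codebook

def assign_clusters(data, codebook, var_weights):
    """Label each sample with the index of its nearest node (weighted distance)."""
    data = np.asarray(data, dtype=float)
    var_weights = np.asarray(var_weights, dtype=float)
    diff2 = (data[:, None, :] - codebook[None, :, :]) ** 2
    return (diff2 * var_weights).sum(axis=2).argmin(axis=1)
```

In this sketch a weight of zero removes a variable from the node search entirely, while larger weights make a variable dominate it. The weights themselves would come from expert judgment about each variable's relative importance.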
Abstract: We modified a self-organizing map (SOM), a clustering artificial neural network, using variogram analyses to incorporate the spatial and temporal auto-correlation that exists in many surface and subsurface environmental datasets. The SOM reduces the dimensionality and clusters the data. The SOM is particularly effective with multiple data types (e.g. continuous, discrete and categorical variables). The standard SOM algorithm iteratively updates connection weights between the input parameters and the two-dimensional output mapping over a specified region of the estimation field. The method accounts for the anisotropy found in geologic and hydrologic datasets. The algorithm is tested on a unique dataset collected from a slab of Berea sandstone (1 m by 0.4 m). Air permeability, electrical resistivity and compressional wave velocity were measured on a 3 mm rectangular grid. Sparse testing data were drawn randomly from this exhaustive dataset for validating the new computational methods. We apply the method using biogeochemical data collected along a transect between the Vermont and New York shorelines of Lake Champlain, to demonstrate the ability to discriminate between different functional zones in the lake. This clustering method could be applied to a variety of terrestrial, aquatic, or subsurface biogeochemical or geophysical problems. Considering spatial auto-correlation when delineating regions or zones in environmental systems yields more accurate estimates.
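The variogram analysis mentioned in the abstract above rests on the empirical semivariogram, gamma(h) = (1 / (2 N(h))) * sum of (z_i - z_j)^2 over the N(h) sample pairs separated by a distance in lag class h. The sketch below computes an omnidirectional version with NumPy. The function name and the lag-bin interface are assumptions for illustration; the directional (anisotropic) variograms and the coupling to the SOM described in the abstract are not reproduced here.

```python
import numpy as np

def empirical_semivariogram(coords, values, lag_edges):
    """Omnidirectional empirical semivariogram (hypothetical helper).

    coords:    (n, d) array of sample locations
    values:    (n,) array of measurements at those locations
    lag_edges: increasing bin edges for separation distance
    Returns (gamma, counts), one entry per lag bin; gamma is NaN for empty bins.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    lag_edges = np.asarray(lag_edges, dtype=float)
    # Index every unordered pair of samples exactly once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    n_bins = len(lag_edges) - 1
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for k in range(n_bins):
        # Pairs whose separation falls in the half-open bin [edge_k, edge_k+1).
        in_bin = (dist >= lag_edges[k]) & (dist < lag_edges[k + 1])
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = 0.5 * sqdiff[in_bin].mean()
    return gamma, counts
```

A semivariogram that rises with lag distance and then levels off indicates a range beyond which samples are effectively uncorrelated, which is one way such an analysis could inform the size of the region over which a spatially aware method pools information.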
Abstract: Reach-scale physical habitat assessment scores are increasingly used to inform management decisions. We characterized the spatial distribution of hydraulic habitat characteristics at the reach and sub-reach scales for four fish species using detailed two-dimensional hydraulic models and spatial analysis techniques (semi-variogram analyses). We next explored whether these hydraulic characteristics were correlated with commonly used reach-scale geomorphic assessment (RGA) scores, rapid habitat assessment (RHA) scores, or indices of fish biodiversity and abundance. River2D was used to calculate weighted usable areas (WUAs) at median flows, Q50, for six Vermont streams using modelled velocity, depth estimates, channel bed data and habitat suitability curves for blacknose dace (Rhinichthys atratulus), brown trout (Salmo trutta), common shiner (Notropis cornutus) and white sucker (Catostomus commersoni) at both the adult and spawning stages. All stream reaches exhibited different spatial distributions of WUA, ranging from a uniform distribution of patches of high WUA to an irregular distribution of more isolated patches. Streams with discontinuous, distinct patches of high WUA had lower fish biotic integrity, measured with the State of Vermont's Mixed Water Index of Biotic Integrity (MWIBI), than streams with a more uniform distribution of high WUA. The distribution of usable habitats may therefore be a determining factor for fish communities. No relationship was found between predicted WUAs averaged at the reach scale and RGA or RHA scores. Future research is needed to identify the appropriate spatial scales to capture the connections between usable patches of stream channel habitat.
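To make the weighted usable area in the abstract above concrete: under a common convention, WUA is the sum over model cells of cell area multiplied by a composite suitability index, with the composite formed as the product of the individual suitability values. The sketch below assumes that product form, piecewise-linear suitability curves, and only depth and velocity as inputs (the channel bed term is omitted). The function name and curve format are hypothetical, and this is not River2D's own code.

```python
import numpy as np

def weighted_usable_area(cell_area, depth, velocity, depth_curve, velocity_curve):
    """Sum of cell area times composite suitability (hypothetical helper).

    cell_area, depth, velocity: per-cell arrays of equal length
    depth_curve, velocity_curve: (x_points, suitability) pairs, with
        x_points increasing and suitability values between 0 and 1
    """
    cell_area = np.asarray(cell_area, dtype=float)
    # Piecewise-linear lookup of suitability for each cell's depth and velocity.
    depth_si = np.interp(depth, depth_curve[0], depth_curve[1])
    velocity_si = np.interp(velocity, velocity_curve[0], velocity_curve[1])
    # Assumed composite index: product of the individual suitabilities.
    composite = depth_si * velocity_si
    return float(np.sum(cell_area * composite))
```

Because the result is a single reach-scale total, two reaches can share the same WUA while differing in how the suitable cells are arranged, which is why the abstract pairs WUA with semi-variogram analysis of its spatial distribution.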