Nitrate: Nitrate is a polyatomic ion with the chemical formula NO3−. Salts containing this ion are called nitrates. Nitrates are common components of fertilizers and explosives. Almost all inorganic nitrates are soluble in water. An example of an insoluble nitrate is bismuth oxynitrate. The ion is the conjugate base of nitric acid, consisting of one central nitrogen atom surrounded by three identically bonded oxygen atoms in a trigonal planar arrangement. The nitrate ion carries a formal charge of −1.
Time series: In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average. A time series is very frequently plotted via a run chart (which is a temporal line chart).
Cluster analysis: Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
Multivariate normal distribution: In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One definition is that a random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem.
Log-normal distribution: In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable X is log-normally distributed, then Y = ln(X) has a normal distribution. Equivalently, if Y has a normal distribution, then the exponential function of Y, X = exp(Y), has a log-normal distribution. A random variable which is log-normally distributed takes only positive real values.
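The relationship X = exp(Y) can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `sample_lognormal` and the parameter values are illustrative assumptions, not part of the entry above.

```python
import math
import random
import statistics


def sample_lognormal(mu, sigma, n, seed=0):
    """Draw n log-normal samples by exponentiating normal draws Y ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


# Illustrative parameters (assumed, not taken from the text).
samples = sample_lognormal(mu=0.5, sigma=0.25, n=10_000)

# Taking logarithms should recover approximately normal data with the same mu and sigma.
log_samples = [math.log(x) for x in samples]
log_mean = statistics.fmean(log_samples)
log_sd = statistics.stdev(log_samples)
```

Every sample is strictly positive because the exponential function has no non-positive values, which matches the last sentence of the entry.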
Mixture model: In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Formally a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population.
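As a rough illustration, the sketch below draws observations from a mixture of normal components and discards the component label, so the returned data do not identify the subpopulation each value came from. The choice of normal components, the function name `sample_mixture`, and all parameter values are assumptions made for this example.

```python
import random
import statistics


def sample_mixture(weights, means, sds, n, seed=0):
    """Sample from a finite mixture of normal components (assumed component family)."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n):
        # Pick a subpopulation according to the mixing weights...
        component = rng.choices(range(len(weights)), weights=weights)[0]
        # ...then draw from it, keeping only the value and not the label.
        observations.append(rng.gauss(means[component], sds[component]))
    return observations


# Two hypothetical subpopulations with mixing weights 0.3 and 0.7.
data = sample_mixture(weights=[0.3, 0.7], means=[0.0, 10.0], sds=[1.0, 1.0], n=20_000)

# The overall mean is the weighted average of component means: 0.3*0 + 0.7*10 = 7.
overall_mean = statistics.fmean(data)
```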
Probability distribution: In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events (subsets of the sample space). For instance, if X is used to denote the outcome of a coin toss ("the experiment"), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5 for X = tails (assuming the coin is fair).
Gamma distribution: In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution, Erlang distribution, and chi-squared distribution are special cases of the gamma distribution. There are two equivalent parameterizations in common use: with a shape parameter k and a scale parameter θ, or with a shape parameter α = k and an inverse scale parameter β = 1/θ, called a rate parameter. In each of these forms, both parameters are positive real numbers.
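The two parameterizations describe the same distribution once the rate is taken as the reciprocal of the scale. The sketch below illustrates the conversion; the helper names are hypothetical. It relies on the fact that the second argument of Python's `random.gammavariate` is a scale parameter, so a rate has to be inverted before it is passed in.

```python
import random
import statistics


def gamma_mean_from_scale(shape, scale):
    """Mean of a gamma distribution in the shape/scale form."""
    return shape * scale


def gamma_mean_from_rate(shape, rate):
    """Mean of a gamma distribution in the shape/rate form."""
    return shape / rate


def sample_gamma_with_rate(shape, rate, n, seed=0):
    """Sample using a rate parameter by converting it to the scale the library expects."""
    rng = random.Random(seed)
    scale = 1.0 / rate
    return [rng.gammavariate(shape, scale) for _ in range(n)]


# Illustrative values: shape 2 with rate 4 is the same as shape 2 with scale 0.25.
gamma_samples = sample_gamma_with_rate(shape=2.0, rate=4.0, n=20_000)
gamma_sample_mean = statistics.fmean(gamma_samples)
```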
Water quality: Water quality refers to the chemical, physical, and biological characteristics of water based on the standards of its usage. It is most frequently used by reference to a set of standards against which compliance, generally achieved through treatment of the water, can be assessed. The most common standards used to monitor and assess water quality convey the health of ecosystems, safety of human contact, extent of water pollution and condition of drinking water.
K-means clustering: k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
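One common heuristic for this objective is Lloyd's algorithm, which alternates between assigning each observation to its nearest centroid (by squared Euclidean distance) and recomputing each centroid as the mean of its cluster. The entry above does not name a specific algorithm, so the following is only a minimal sketch of that heuristic with random initialization; the function names and the toy data are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd's algorithm: returns k centroids (a local optimum, not guaranteed global)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize with k distinct observations
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # no change means the procedure has converged
            break
        centroids = updated
    return centroids


# Toy data: two well-separated groups of four points each.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
toy_centroids = sorted(kmeans(toy_points, k=2))
```

Because the update step uses the mean, each iteration cannot increase the sum of squared distances, which is why the method targets squared errors rather than plain Euclidean distances.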
Autoregressive integrated moving average: In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both models are fitted to time series data either to better understand the data or to forecast future points in the series. ARIMA models are applied in some cases where data show evidence of non-stationarity in the sense of mean (but not variance/autocovariance), where an initial differencing step (corresponding to the "integrated" part of the model) can be applied one or more times to eliminate the non-stationarity of the mean function.
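The differencing step on its own is simple to sketch. The code below shows only that step, not the fitting of an ARIMA model; the function name `difference` and the sample series are illustrative assumptions.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: each pass replaces y[t] with y[t] - y[t-1]."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


# A series with a quadratic mean function (the squares 1, 4, 9, ...).
quadratic = [1, 4, 9, 16, 25]
once = difference(quadratic, order=1)   # [3, 5, 7, 9]: still trending upward
twice = difference(quadratic, order=2)  # [2, 2, 2]: constant mean after two passes
```

Each pass shortens the series by one observation, so a series differenced d times has d fewer points than the original.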
Sodium nitrate: Sodium nitrate is the chemical compound with the formula NaNO3. This alkali metal nitrate salt is also known as Chile saltpeter (large deposits of which were historically mined in Chile) to distinguish it from ordinary saltpeter, potassium nitrate. The mineral form is also known as nitratine, nitratite or soda niter. Sodium nitrate is a white deliquescent solid very soluble in water.
Autocorrelation: Autocorrelation, sometimes known as serial correlation in the discrete time case, is the correlation of a signal with a delayed copy of itself as a function of delay. Informally, it is the similarity between observations of a random variable as a function of the time lag between them. The analysis of autocorrelation is a mathematical tool for finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying the missing fundamental frequency in a signal implied by its harmonic frequencies.
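A common sample estimate of autocorrelation at a given lag divides the lagged sum of products of mean-centered values by the total sum of squares. The sketch below uses that estimator on a noiseless sine wave; the function name, the period of 12 samples, and the series length are assumptions chosen for illustration.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation at a non-negative lag, normalized by the lag-0 sum of squares."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


# A periodic signal with an assumed period of 12 samples, observed for 20 periods.
signal = [math.sin(2 * math.pi * t / 12) for t in range(240)]

acf_at_period = autocorrelation(signal, 12)      # close to +1: signal aligns with itself
acf_at_half_period = autocorrelation(signal, 6)  # close to -1: signal is in anti-phase
```

Peaks in the autocorrelation at multiples of a lag are the kind of repeating pattern the entry refers to.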
Ammonium nitrate: Ammonium nitrate is a chemical compound with the chemical formula NH4NO3. It is a white crystalline salt consisting of ions of ammonium and nitrate. It is highly soluble in water and hygroscopic as a solid, although it does not form hydrates. It is predominantly used in agriculture as a high-nitrogen fertilizer. Its other major use is as a component of explosive mixtures used in mining, quarrying, and civil construction.
Variational Bayesian methods: Variational Bayesian methods are a family of techniques for approximating intractable integrals arising in Bayesian inference and machine learning. They are typically used in complex statistical models consisting of observed variables (usually termed "data") as well as unknown parameters and latent variables, with various sorts of relationships among the three types of random variables, as might be described by a graphical model. As typical in Bayesian inference, the parameters and latent variables are grouped together as "unobserved variables".
Wetland: A wetland is a distinct ecosystem that is flooded or saturated by water, either permanently (for years or decades) or seasonally (for weeks or months). Flooding results in oxygen-free (anoxic) processes prevailing, especially in the soils. The primary factor that distinguishes wetlands from terrestrial land forms or water bodies is the characteristic vegetation of aquatic plants, adapted to the unique anoxic hydric soils.
Correlation clustering: Clustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of clusters without specifying that number in advance. In machine learning, correlation clustering or cluster editing operates in a scenario where the relationships between the objects are known instead of the actual representations of the objects.
Nitrogen cycle: The nitrogen cycle is the biogeochemical cycle by which nitrogen is converted into multiple chemical forms as it circulates among atmospheric, terrestrial, and marine ecosystems. The conversion of nitrogen can be carried out through both biological and physical processes. Important processes in the nitrogen cycle include fixation, ammonification, nitrification, and denitrification. The majority of Earth's atmosphere (78%) is atmospheric nitrogen, making it the largest source of nitrogen.
Denitrification: Denitrification is a microbially facilitated process in which nitrate (NO3−) is reduced, ultimately producing molecular nitrogen (N2) through a series of intermediate gaseous nitrogen oxide products. Facultative anaerobic bacteria perform denitrification as a type of respiration that reduces oxidized forms of nitrogen in response to the oxidation of an electron donor such as organic matter.
Trend-stationary process: In the statistical analysis of time series, a trend-stationary process is a stochastic process from which an underlying trend (function solely of time) can be removed, leaving a stationary process. The trend does not have to be linear. Conversely, if the process requires differencing to be made stationary, then it is called difference stationary and possesses one or more unit roots. Those two concepts may sometimes be confused, but while they share many properties, they are different in many aspects.
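Removing a deterministic trend can be sketched as fitting a function of time and subtracting it. The example below assumes a linear trend fitted by ordinary least squares, which is only one possible choice since the trend need not be linear; the function names and the simulated series are illustrative.

```python
import random


def fit_linear_trend(series):
    """Ordinary least-squares fit of y[t] = intercept + slope * t for t = 0, 1, ..., n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    t_variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / t_variance
    intercept = y_mean - slope * t_mean
    return intercept, slope


def detrend(series):
    """Subtract the fitted linear trend, leaving the residual series."""
    intercept, slope = fit_linear_trend(series)
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


# Simulated trend-stationary data: an assumed linear trend plus independent normal noise.
rng = random.Random(0)
trending = [2.0 + 0.5 * t + rng.gauss(0.0, 1.0) for t in range(200)]

fitted_intercept, fitted_slope = fit_linear_trend(trending)
residuals = detrend(trending)
residual_mean = sum(residuals) / len(residuals)
```

For a difference-stationary series such as a random walk, subtracting a fitted trend in this way would not yield a stationary residual; differencing would be needed instead.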