Cluster analysisCluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, , information retrieval, bioinformatics, data compression, computer graphics and machine learning.
Determining the number of clusters in a data setDetermining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a certain class of clustering algorithms (in particular k-means, k-medoids and expectation–maximization algorithm), there is a parameter commonly referred to as k that specifies the number of clusters to detect.
Synthetic-aperture radarSynthetic-aperture radar (SAR) is a form of radar that is used to create two-dimensional images or three-dimensional reconstructions of objects, such as landscapes. SAR uses the motion of the radar antenna over a target region to provide finer spatial resolution than conventional stationary beam-scanning radars. SAR is typically mounted on a moving platform, such as an aircraft or spacecraft, and has its origins in an advanced form of side looking airborne radar (SLAR).
Interferometric synthetic-aperture radarInterferometric synthetic aperture radar, abbreviated InSAR (or deprecated IfSAR), is a radar technique used in geodesy and remote sensing. This geodetic method uses two or more synthetic aperture radar (SAR) images to generate maps of surface deformation or digital elevation, using differences in the phase of the waves returning to the satellite or aircraft. The technique can potentially measure millimetre-scale changes in deformation over spans of days to years.
K-means clusteringk-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
Digital elevation modelA digital elevation model (DEM) or digital surface model (DSM) is a 3D computer graphics representation of elevation data to represent terrain or overlaying objects, commonly of a planet, moon, or asteroid. A "global DEM" refers to a discrete global grid. DEMs are used often in geographic information systems (GIS), and are the most common basis for digitally produced relief maps. A digital terrain model (DTM) represents specifically the ground surface while DEM and DSM may represent tree top canopy or building roofs.
Correlation clusteringClustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of clusters without specifying that number in advance. Cluster analysis In machine learning, correlation clustering or cluster editing operates in a scenario where the relationships between the objects are known instead of the actual representations of the objects.
Remote sensingRemote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object, in contrast to in situ or on-site observation. The term is applied especially to acquiring information about Earth and other planets. Remote sensing is used in numerous fields, including geophysics, geography, land surveying and most Earth science disciplines (e.g. exploration geophysics, hydrology, ecology, meteorology, oceanography, glaciology, geology); it also has military, intelligence, commercial, economic, planning, and humanitarian applications, among others.
ElevationThe elevation of a geographic location is its height above or below a fixed reference point, most commonly a reference geoid, a mathematical model of the Earth's sea level as an equipotential gravitational surface (see Geodetic datum § Vertical datum). The term elevation is mainly used when referring to points on the Earth's surface, while altitude or geopotential height is used for points above the surface, such as an aircraft in flight or a spacecraft in orbit, and depth is used for points below the surface.
Water resourcesWater resources are natural resources of water that are potentially useful for humans, for example as a source of drinking water supply or irrigation water. 97% of the water on Earth is salt water and only three percent is fresh water; slightly over two-thirds of this is frozen in glaciers and polar ice caps. The remaining unfrozen freshwater is found mainly as groundwater, with only a small fraction present above ground or in the air. Natural sources of fresh water include surface water, under river flow, groundwater and frozen water.
Imaging radarImaging radar is an application of radar which is used to create two-dimensional s, typically of landscapes. Imaging radar provides its light to illuminate an area on the ground and take a picture at radio wavelengths. It uses an antenna and digital computer storage to record its images. In a radar image, one can see only the energy that was reflected back towards the radar antenna. The radar moves along a flight path and the area illuminated by the radar, or footprint, is moved along the surface in a swath, building the image as it does so.
Drinking waterDrinking water is water that is used in drink or food preparation; potable water is water that is safe to be used as drinking water. The amount of drinking water required to maintain good health varies, and depends on physical activity level, age, health-related issues, and environmental conditions. Recent work showed that the most important driver of water turnover which is closely linked to water requirements is energy expenditure. For those who work in a hot climate, up to a day may be required.
Water purificationWater purification is the process of removing undesirable chemicals, biological contaminants, suspended solids, and gases from water. The goal is to produce water that is fit for specific purposes. Most water is purified and disinfected for human consumption (drinking water), but water purification may also be carried out for a variety of other purposes, including medical, pharmacological, chemical, and industrial applications. The history of water purification includes a wide variety of methods.
Satellite imagerySatellite images (also Earth observation imagery, spaceborne photography, or simply satellite photo) are s of Earth collected by imaging satellites operated by governments and businesses around the world. Satellite imaging companies sell images by licensing them to governments and businesses such as Apple Maps and Google Maps. The first images from space were taken on sub-orbital flights. The U.S-launched V-2 flight on October 24, 1946, took one image every 1.5 seconds.
Geographic information systemA geographic information system (GIS) consists of integrated computer hardware and software that store, manage, analyze, edit, output, and visualize geographic data. Much of this often happens within a spatial database, however, this is not essential to meet the definition of a GIS. In a broader sense, one may consider such a system also to include human users and support staff, procedures and workflows, the body of knowledge of relevant concepts and methods, and institutional organizations.
Water pollutionWater pollution (or aquatic pollution) is the contamination of water bodies, usually as a result of human activities, so that it negatively affects its uses. Water bodies include lakes, rivers, oceans, aquifers, reservoirs and groundwater. Water pollution results when contaminants mix with these water bodies. Contaminants can come from one of four main sources: sewage discharges, industrial activities, agricultural activities, and urban runoff including stormwater. Water pollution is either surface water pollution or groundwater pollution.
Reclaimed waterWater reclamation (also called wastewater reuse, water reuse or water recycling) is the process of converting municipal wastewater (sewage) or industrial wastewater into water that can be reused for a variety of purposes. Types of reuse include: urban reuse, agricultural reuse (irrigation), environmental reuse, industrial reuse, planned potable reuse, de facto wastewater reuse (unplanned potable reuse). For example, reuse may include irrigation of gardens and agricultural fields or replenishing surface water and groundwater (i.
Dimensionality reductionDimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable (hard to control or deal with).
Spectral clusteringIn multivariate statistics, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions. The similarity matrix is provided as an input and consists of a quantitative assessment of the relative similarity of each pair of points in the dataset. In application to image segmentation, spectral clustering is known as segmentation-based object categorization.
Kernel principal component analysisIn the field of multivariate statistics, kernel principal component analysis (kernel PCA) is an extension of principal component analysis (PCA) using techniques of kernel methods. Using a kernel, the originally linear operations of PCA are performed in a reproducing kernel Hilbert space. Recall that conventional PCA operates on zero-centered data; that is, where is one of the multivariate observations.