Instrumental variables estimationIn statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. Intuitively, IVs are used when an explanatory variable of interest is correlated with the error term, in which case ordinary least squares and ANOVA give biased results.
DataIn common usage and statistics, data (USˈdætə; UKˈdeɪtə) is a collection of discrete or continuous values that convey information, describing the quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted formally. A datum is an individual value in a collection of data. Data is usually organized into structures such as tables that provide additional context and meaning, and which may themselves be used as data in larger structures.
Simultaneous equations modelSimultaneous equations models are a type of statistical model in which the dependent variables are functions of other dependent variables, rather than just independent variables. This means some of the explanatory variables are jointly determined with the dependent variable, which in economics usually is the consequence of some underlying equilibrium mechanism. Take the typical supply and demand model: whilst typically one would determine the quantity supplied and demanded to be a function of the price set by the market, it is also possible for the reverse to be true, where producers observe the quantity that consumers demand and then set the price.
Big dataBig data primarily refers to data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Though used sometimes loosely partly because of a lack of formal definition, the interpretation that seems to best describe big data is the one associated with a large body of information that we could not comprehend when used only in smaller amounts.
Dependent and independent variablesDependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables are studied under the supposition or demand that they depend, by some law or rule (e.g., by a mathematical function), on the values of other variables. Independent variables, in turn, are not seen as depending on any other variable in the scope of the experiment in question. In this sense, some common independent variables are time, space, density, mass, fluid flow rate, and previous values of some observed value of interest (e.
Urban sprawlUrban sprawl (also known as suburban sprawl or urban encroachment) is defined as "the spreading of urban developments (such as houses and shopping centers) on undeveloped land near a city". Urban sprawl has been described as the unrestricted growth in many urban areas of housing, commercial development, and roads over large expanses of land, with little concern for urban planning. In addition to describing a special form of urbanization, the term also relates to the social and environmental consequences associated with this development.
Data analysisData analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.
Endogeneity (econometrics)In econometrics, endogeneity broadly refers to situations in which an explanatory variable is correlated with the error term. The distinction between endogenous and exogenous variables originated in simultaneous equations models, where one separates variables whose values are determined by the model from variables which are predetermined; ignoring simultaneity in the estimation leads to biased estimates as it violates the exogeneity assumption of the Gauss–Markov theorem.
Public transportPublic transport (also known as public transportation, public transit, mass transit, or simply transit) is a system of transport for passengers by group travel systems available for use by the general public unlike private transport, typically managed on a schedule, operated on established routes, and that charge a posted fee for each trip. There is no rigid definition of which kinds of transport are included, and air travel is often not thought of when discussing public transport—dictionaries use wording like "buses, trains, etc.
Data scienceData science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured, and unstructured data. Data science also integrates domain knowledge from the underlying application domain (e.g., natural sciences, information technology, and medicine). Data science is multifaceted and can be described as a science, a research paradigm, a research method, a discipline, a workflow, and a profession.
Cluster analysisCluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, , information retrieval, bioinformatics, data compression, computer graphics and machine learning.
ExogenyIn a variety of contexts, exogeny or exogeneity () is the fact of an action or object originating externally. It contrasts with endogeneity or endogeny, the fact of being influenced within a system. In an economic model, an exogenous change is one that comes from outside the model and is unexplained by the model. Such changes of an economic model from outside factors can include the influence of technology, in which this had previously been noted as an exogenous factor, but has rather been noted as a factor that can depict economic forces as a whole.
Urban designUrban design is an approach to the design of buildings and the spaces between them that focuses on specific design processes and outcomes. In addition to designing and shaping the physical features of towns, cities, and regional spaces, urban design considers 'bigger picture' issues of economic, social and environmental value and social design. The scope of a project can range from a local street or public space to an entire city and surrounding areas.
Data warehouseIn computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of business intelligence. Data warehouses are central repositories of integrated data from one or more disparate sources. They store current and historical data in one single place that are used for creating analytical reports for workers throughout the enterprise. This is beneficial for companies as it enables them to interrogate and draw insights from their data and make decisions.
Data managementData management comprises all disciplines related to handling data as a valuable resource. The concept of data management arose in the 1980s as technology moved from sequential processing (first punched cards, then magnetic tape) to random access storage. Since it was now possible to store a discrete fact and quickly access it using random access disk technology, those suggesting that data management was more important than business process management used arguments such as "a customer's home address is stored in 75 (or some other large number) places in our computer systems.
Linear regressionIn statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
DistanceDistance is a numerical or occasionally qualitative measurement of how far apart objects or points are. In physics or everyday usage, distance may refer to a physical length or an estimation based on other criteria (e.g. "two counties over"). Since spatial cognition is a rich source of conceptual metaphors in human thought, the term is also frequently used metaphorically to mean a measurement of the amount of difference between two similar objects (such as statistical distance between probability distributions or edit distance between strings of text) or a degree of separation (as exemplified by distance between people in a social network).
Car-free movementThe car-free movement is a broad, informal, emergent network of individuals and organizations, including social activists, urban planners, transportation engineers, environmentalists and others, brought together by a shared belief that large and/or high-speed motorized vehicles (cars, trucks, tractor units, motorcycles, etc.) are too dominant in most modern cities.
Exogenous and endogenous variablesIn an economic model, an exogenous variable is one whose measure is determined outside the model and is imposed on the model, and an exogenous change is a change in an exogenous variable. In contrast, an endogenous variable is a variable whose measure is determined by the model. An endogenous change is a change in an endogenous variable in response to an exogenous change that is imposed upon the model. The term endogeneity in econometrics has a related but distinct meaning.
Urban villageIn urban planning and design, an urban village is an urban development typically characterized by medium-density housing, mixed use zoning, good public transit and an emphasis on pedestrianization and public space. Contemporary urban village ideas are closely related to New Urbanism and smart growth ideas initiated in the United States. Urban villages are seen to provide an alternative to recent patterns of urban development in many cities, especially decentralization and urban sprawl.