Data analysisData analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.
Exploratory data analysisIn statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling and thereby contrasts traditional hypothesis testing. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments.
Data scienceData science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured, and unstructured data. Data science also integrates domain knowledge from the underlying application domain (e.g., natural sciences, information technology, and medicine). Data science is multifaceted and can be described as a science, a research paradigm, a research method, a discipline, a workflow, and a profession.
MiningMining is the extraction of valuable geological materials from the Earth and other astronomical objects. Mining is required to obtain most materials that cannot be grown through agricultural processes, or feasibly created artificially in a laboratory or factory. Ores recovered by mining include metals, coal, oil shale, gemstones, limestone, chalk, dimension stone, rock salt, potash, gravel, and clay. The ore must be a rock or mineral that contains valuable constituent, can be extracted or mined and sold for profit.
Sequential pattern miningSequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence. It is usually presumed that the values are discrete, and thus time series mining is closely related, but usually considered a different activity. Sequential pattern mining is a special case of structured data mining. There are several key traditional computational problems addressed within this field.
Big dataBig data primarily refers to data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many entries (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Though used sometimes loosely partly because of a lack of formal definition, the interpretation that seems to best describe big data is the one associated with a large body of information that we could not comprehend when used only in smaller amounts.
Text miningText mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources." Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends by means such as statistical pattern learning. According to Hotho et al.
PermutationIn mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. The word "permutation" also refers to the act or process of changing the linear order of an ordered set. Permutations differ from combinations, which are selections of some members of a set regardless of order. For example, written as tuples, there are six permutations of the set {1, 2, 3}, namely (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).
Attachment in childrenAttachment in children is "a biological instinct in which proximity to an attachment figure is sought when the child senses or perceives threat or discomfort. Attachment behaviour anticipates a response by the attachment figure which will remove threat or discomfort". Attachment also describes the function of availability, which is the degree to which the authoritative figure is responsive to the child's needs and shares communication with them.
Data collectionData collection or data gathering is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes. Data collection is a research component in all study fields, including physical and social sciences, humanities, and business. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same.
Data miningData mining is the process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.
Classified informationClassified information is material that a government body deems to be sensitive information that must be protected. Access is restricted by law or regulation to particular groups of people with the necessary security clearance and need to know, and mishandling of the material can incur criminal penalties. A formal security clearance is required to view or handle classified material. The clearance process requires a satisfactory background investigation.
Sequence analysisIn bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. Methodologies used include sequence alignment, searches against biological databases, and others. Since the development of methods of high-throughput production of gene and protein sequences, the rate of addition of new sequences to the databases increased very rapidly.
Mining engineeringMining in the engineering discipline is the extraction of minerals from underneath, open pit, above, or on the ground. Mining engineering is associated with many other disciplines, such as mineral processing, exploration, excavation, geology, and metallurgy, geotechnical engineering and surveying. A mining engineer may manage any phase of mining operations, from exploration and discovery of the mineral resources, through feasibility study, mine design, development of plans, production and operations to mine closure.
Underground hard-rock miningUnderground hard-rock mining refers to various underground mining techniques used to excavate "hard" minerals, usually those containing metals, such as ore containing gold, silver, iron, copper, zinc, nickel, tin, and lead. It also involves the same techniques used to excavate ores of gems, such as diamonds and rubies. Soft-rock mining refers to the excavation of softer minerals, such as salt, coal, and oil sands. Accessing underground ore can be achieved via a decline (ramp), inclined vertical shaft or adit.
Information sensitivityInformation sensitivity is the control of access to information or knowledge that might result in loss of an advantage or level of security if disclosed to others. Loss, misuse, modification, or unauthorized access to sensitive information can adversely affect the privacy or welfare of an individual, trade secrets of a business or even the security and international relations of a nation depending on the level of sensitivity and nature of the information. This refers to information that is already a matter of public record or knowledge.
WikiLeaksWikiLeaks (ˈwɪkiliːks) is a publisher and media organisation founded in 2006. It operates as a non-profit and is funded by donations and media partnerships. It has published classified documents and other media provided by anonymous sources. It was founded by Julian Assange, an Australian editor, publisher, and activist, who is currently challenging extradition to the United States over his work with WikiLeaks. Since September 2018, Kristinn Hrafnsson has served as its editor-in-chief.
Concept miningConcept mining is an activity that results in the extraction of concepts from artifacts. Solutions to the task typically involve aspects of artificial intelligence and statistics, such as data mining and text mining. Because artifacts are typically a loosely structured sequence of words and other symbols (rather than concepts), the problem is nontrivial, but it can provide powerful insights into the meaning, provenance and similarity of documents.
Permutation testA permutation test (also called re-randomization test) is an exact statistical hypothesis test making use of the proof by contradiction. A permutation test involves two or more samples. The null hypothesis is that all samples come from the same distribution . Under the null hypothesis, the distribution of the test statistic is obtained by calculating all possible values of the test statistic under possible rearrangements of the observed data. Permutation tests are, therefore, a form of resampling.
Dynamic-maturational model of attachment and adaptationThe dynamic-maturational model of attachment and adaptation (DMM) is a biopsychosocial model describing the effect attachment relationships can have on human development and functioning. It is especially focused on the effects of relationships between children and parents and between reproductive couples. It developed initially from attachment theory as developed by John Bowlby and Mary Ainsworth, and incorporated many other theories into a comprehensive model of adaptation to life's many dangers.