Entropy (information theory)In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable , which takes values in the alphabet and is distributed according to : where denotes the sum over the variable's possible values. The choice of base for , the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys".
Mutual informationIn probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the "amount of information" (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected "amount of information" held in a random variable.
Kullback–Leibler divergenceIn mathematical statistics, the Kullback–Leibler divergence (also called relative entropy and I-divergence), denoted , is a type of statistical distance: a measure of how one probability distribution P is different from a second, reference probability distribution Q. A simple interpretation of the KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P.
Information theoryInformation theory is the mathematical study of the quantification, storage, and communication of information. The field was originally established by the works of Harry Nyquist and Ralph Hartley, in the 1920s, and Claude Shannon in the 1940s. The field, in applied mathematics, is at the intersection of probability theory, statistics, computer science, statistical mechanics, information engineering, and electrical engineering. A key measure in information theory is entropy.
Typical setIn information theory, the typical set is a set of sequences whose probability is close to two raised to the negative power of the entropy of their source distribution. That this set has total probability close to one is a consequence of the asymptotic equipartition property (AEP) which is a kind of law of large numbers. The notion of typicality is only concerned with the probability of a sequence and not the actual sequence itself.
Conditional entropyIn information theory, the conditional entropy quantifies the amount of information needed to describe the outcome of a random variable given that the value of another random variable is known. Here, information is measured in shannons, nats, or hartleys. The entropy of conditioned on is written as . The conditional entropy of given is defined as where and denote the support sets of and . Note: Here, the convention is that the expression should be treated as being equal to zero. This is because .
Cross-entropyIn information theory, the cross-entropy between two probability distributions and over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution , rather than the true distribution . The cross-entropy of the distribution relative to a distribution over a given set is defined as follows: where is the expected value operator with respect to the distribution .
Data analysisData analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.
Transmission towerA transmission tower, also known as an electricity pylon or simply a pylon in British English and as a hydro tower in Canadian English, is a tall structure, usually a steel lattice tower, used to support an overhead power line. In electrical grids, they are generally used to carry high-voltage transmission lines that transport bulk electric power from generating stations to electrical substations before reaching its end consumers; utility poles are used to support lower-voltage subtransmission and distribution lines that transport power from substations to electric customers.
CommunicationCommunication is usually defined as the transmission of information. The term can also refer to the message itself, or the field of inquiry studying these transmissions, also known as communication studies. The precise definition of communication is disputed. Controversial issues are whether unintentional or failed transmissions are included and whether communication does not just transmit meaning but also create it. Models of communication aim to provide a simplified overview of its main components and their interaction.
Transmission lineIn electrical engineering, a transmission line is a specialized cable or other structure designed to conduct electromagnetic waves in a contained manner. The term applies when the conductors are long enough that the wave nature of the transmission must be taken into account. This applies especially to radio-frequency engineering because the short wavelengths mean that wave phenomena arise over very short distances (this can be as short as millimetres depending on frequency).
Communication theoryCommunication theory is a proposed description of communication phenomena, the relationships among them, a storyline describing these relationships, and an argument for these three elements. Communication theory provides a way of talking about and analyzing key events, processes, and commitments that together form communication. Theory can be seen as a way to map the world and make it navigable; communication theory gives us tools to answer empirical, conceptual, or practical communication questions.
Quantities of informationThe mathematical theory of information is based on probability theory and statistics, and measures information with several quantities of information. The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. The most common unit of information is the bit, or more correctly the shannon, based on the binary logarithm.
Noisy-channel coding theoremIn information theory, the noisy-channel coding theorem (sometimes Shannon's theorem or Shannon's limit), establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (digital information) nearly error-free up to a computable maximum rate through the channel. This result was presented by Claude Shannon in 1948 and was based in part on earlier work and ideas of Harry Nyquist and Ralph Hartley.
Entropy in thermodynamics and information theoryThe mathematical expressions for thermodynamic entropy in the statistical thermodynamics formulation established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s are similar to the information entropy by Claude Shannon and Ralph Hartley, developed in the 1940s. The defining expression for entropy in the theory of statistical mechanics established by Ludwig Boltzmann and J. Willard Gibbs in the 1870s, is of the form: where is the probability of the microstate i taken from an equilibrium ensemble, and is the Boltzmann's constant.
Rate–distortion theoryRate–distortion theory is a major branch of information theory which provides the theoretical foundations for lossy data compression; it addresses the problem of determining the minimal number of bits per symbol, as measured by the rate R, that should be communicated over a channel, so that the source (input signal) can be approximately reconstructed at the receiver (output signal) without exceeding an expected distortion D. Rate–distortion theory gives an analytical expression for how much compression can be achieved using lossy compression methods.
NP-completenessIn computational complexity theory, a problem is NP-complete when: It is a decision problem, meaning that for any input to the problem, the output is either "yes" or "no". When the answer is "yes", this can be demonstrated through the existence of a short (polynomial length) solution. The correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions.
Interpersonal communicationInterpersonal communication is an exchange of information between two or more people. It is also an area of research that seeks to understand how humans use verbal and nonverbal cues to accomplish a number of personal and relational goals. Interpersonal communication research addresses at least six categories of inquiry: 1) how humans adjust and adapt their verbal communication and nonverbal communication during face-to-face communication; 2) how messages are produced; 3) how uncertainty influences behavior and information-management strategies; 4) deceptive communication; 5) relational dialectics; and 6) social interactions that are mediated by technology.
Models of communicationModels of communication are simplified representations of the process of communication. Most models try to describe both verbal and non-verbal communication and often understand it as an exchange of messages. Their function is to give a compact overview of the complex process of communication. This helps researchers formulate hypotheses, apply communication-related concepts to real-world cases, and test predictions. Despite their usefulness, many models are criticized based on the claim that they are too simple because they leave out essential aspects.
Coding theoryCoding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are studied by various scientific disciplines—such as information theory, electrical engineering, mathematics, linguistics, and computer science—for the purpose of designing efficient and reliable data transmission methods.