String operationsIn computer science, in the area of formal language theory, frequent use is made of a variety of string functions; however, the notation used is different from that used for computer programming, and some commonly used functions in the theoretical realm are rarely used when programming. This article defines some of these basic terms. A string is a finite sequence of characters. The empty string is denoted by . The concatenation of two string and is denoted by , or shorter by . Concatenating with the empty string makes no difference: .
Kleene starIn mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on sets of strings or on sets of symbols or characters. In mathematics, it is more commonly known as the free monoid construction. The application of the Kleene star to a set is written as . It is widely used for regular expressions, which is the context in which it was introduced by Stephen Kleene to characterize certain automata, where it means "zero or more repetitions".
ConcatenationIn formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalisations of concatenation theory, also called string theory, string concatenation is a primitive notion. In many programming languages, string concatenation is a binary infix operator, and in some it is written without an operator.
Trace monoidIn computer science, a trace is a set of strings, wherein certain letters in the string are allowed to commute, but others are not. It generalizes the concept of a string, by not forcing the letters to always be in a fixed order, but allowing certain reshufflings to take place. Traces were introduced by Pierre Cartier and Dominique Foata in 1969 to give a combinatorial proof of MacMahon's master theorem.
Lexicographic orderIn mathematics, the lexicographic or lexicographical order (also known as lexical order, or dictionary order) is a generalization of the alphabetical order of the dictionaries to sequences of ordered symbols or, more generally, of elements of a totally ordered set. There are several variants and generalizations of the lexicographical ordering. One variant applies to sequences of different lengths by comparing the lengths of the sequences before considering their elements.
Syntactic monoidIn mathematics and computer science, the syntactic monoid of a formal language is the smallest monoid that recognizes the language . The free monoid on a given set is the monoid whose elements are all the strings of zero or more elements from that set, with string concatenation as the monoid operation and the empty string as the identity element. Given a subset of a free monoid , one may define sets that consist of formal left or right inverses of elements in .
Empty stringIn formal language theory, the empty string, or empty word, is the unique string of length zero. Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special case where the sequence has length zero, so there are no symbols in the string. There is only one empty string, because two strings are only different if they have different lengths or a different sequence of symbols. In formal treatments, the empty string is denoted with ε or sometimes Λ or λ.
SemiautomatonIn mathematics and theoretical computer science, a semiautomaton is a deterministic finite automaton having inputs but no output. It consists of a set Q of states, a set Σ called the input alphabet, and a function T: Q × Σ → Q called the transition function. Associated with any semiautomaton is a monoid called the characteristic monoid, input monoid, transition monoid or transition system of the semiautomaton, which acts on the set of states Q.
Deterministic finite automatonIn the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string. Deterministic refers to the uniqueness of the computation run.
Process calculusIn computer science, the process calculi (or process algebras) are a diverse family of related approaches for formally modelling concurrent systems. Process calculi provide a tool for the high-level description of interactions, communications, and synchronizations between a collection of independent agents or processes. They also provide algebraic laws that allow process descriptions to be manipulated and analyzed, and permit formal reasoning about equivalences between processes (e.g., using bisimulation).
Free objectIn mathematics, the idea of a free object is one of the basic concepts of abstract algebra. Informally, a free object over a set A can be thought of as being a "generic" algebraic structure over A: the only equations that hold between elements of the free object are those that follow from the defining axioms of the algebraic structure. Examples include free groups, tensor algebras, or free lattices. The concept is a part of universal algebra, in the sense that it relates to all types of algebraic structure (with finitary operations).
MultisetIn mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set, allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that element in the multiset. As a consequence, an infinite number of multisets exist which contain only elements a and b, but vary in the multiplicities of their elements: The set contains only elements a and b, each having multiplicity 1 when is seen as a multiset.
MonoidIn abstract algebra, a branch of mathematics, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being 0. Monoids are semigroups with identity. Such algebraic structures occur in several branches of mathematics. The functions from a set into itself form a monoid with respect to function composition. More generally, in , the morphisms of an to itself form a monoid, and, conversely, a monoid may be viewed as a category with a single object.
Alphabet (formal languages)In formal language theory, an alphabet, sometimes called a vocabulary, is a non-empty set of indivisible symbols/glyphs, typically thought of as representing letters, characters, digits, phonemes, or even words. Alphabets in this technical sense of a set are used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any cardinality ("size") and depending on its purpose maybe be finite (e.g., the alphabet of letters "a" through "z"), countable (e.
SemigroupIn mathematics, a semigroup is an algebraic structure consisting of a set together with an associative internal binary operation on it. The binary operation of a semigroup is most often denoted multiplicatively (just notation, not necessarily the elementary arithmetic multiplication): x·y, or simply xy, denotes the result of applying the semigroup operation to the ordered pair (x, y). Associativity is formally expressed as that (x·y)·z = x·(y·z) for all x, y and z in the semigroup.
Transformation semigroupIn algebra, a transformation semigroup (or composition semigroup) is a collection of transformations (functions from a set to itself) that is closed under function composition. If it includes the identity function, it is a monoid, called a transformation (or composition) monoid. This is the semigroup analogue of a permutation group. A transformation semigroup of a set has a tautological semigroup action on that set. Such actions are characterized by being faithful, i.e., if two elements of the semigroup have the same action, then they are equal.
SubstringIn formal language theory and computer science, a substring is a contiguous sequence of characters within a string. For instance, "the best of" is a substring of "It was the best of times". In contrast, "Itwastimes" is a subsequence of "It was the best of times", but not a substring. Prefixes and suffixes are special cases of substrings. A prefix of a string is a substring of that occurs at the beginning of ; likewise, a suffix of a string is a substring that occurs at the end of .
SemilatticeIn mathematics, a join-semilattice (or upper semilattice) is a partially ordered set that has a join (a least upper bound) for any nonempty finite subset. Dually, a meet-semilattice (or lower semilattice) is a partially ordered set which has a meet (or greatest lower bound) for any nonempty finite subset. Every join-semilattice is a meet-semilattice in the inverse order and vice versa.
String (computer science)In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed (after creation). A string is generally considered as a data type and is often implemented as an array data structure of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence (or list) data types and structures.
SemiringIn abstract algebra, a semiring is an algebraic structure. It is a generalization of a ring, dropping the requirement that each element must have an additive inverse. At the same time, it is a generalization of bounded distributive lattices. The smallest semiring that is not a ring is the two-element Boolean algebra, e.g. with logical disjunction as addition. A motivating example that is neither a ring nor a lattice is the set of natural numbers under ordinary addition and multiplication, when including the number zero.