Cross compilerA cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is running. For example, a compiler that runs on a PC but generates code that runs on an Android smartphone is a cross compiler. A cross compiler is useful to compile code for multiple platforms from one development host. Direct compilation on the target platform might be infeasible, for example on embedded systems with limited computing resources. Cross compilers are distinct from source-to-source compilers.
Source-to-source compilerA source-to-source translator, source-to-source compiler (S2S compiler), transcompiler, or transpiler is a type of translator that takes the source code of a program written in a programming language as its input and produces an equivalent source code in the same or a different programming language. A source-to-source translator converts between programming languages that operate at approximately the same level of abstraction, while a traditional compiler translates from a higher level programming language to a lower level programming language.
Runtime systemIn computer programming, a runtime system or runtime environment is a sub-system that exists both in the computer where a program is created, as well as in the computers where the program is intended to be run. The name comes from the compile time and runtime division from compiled languages, which similarly distinguishes the computer processes involved in the creation of a program (compilation) and its execution in the target machine (the run time). Most programming languages have some form of runtime system that provides an environment in which programs run.
Optimizing compilerIn computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption (the last three being popular for portable computers). Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.
Program optimizationIn computer science, program optimization, code optimization, or software optimization, is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be optimized so that it executes more rapidly, or to make it capable of operating with less memory storage or other resources, or draw less power. Although the word "optimization" shares the same root as "optimal", it is rare for the process of optimization to produce a truly optimal system.
Symbol tableIn computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier (or symbol), constant, procedure and function in a program's source code is associated with information relating to its declaration or appearance in the source. In other words, the entries of a symbol table store the information related to the entry's corresponding symbol. A symbol table may only exist in memory during the translation process, or it may be embedded in the output of the translation, such as in an ABI for later use.
P-code machineIn computer programming, a p-code machine (portable code machine) is a virtual machine designed to execute p-code (the assembly language or machine code of a hypothetical central processing unit (CPU)). This term is applied both generically to all such machines (such as the Java virtual machine (JVM) and MATLAB precompiled code), and to specific implementations, the most famous being the p-Machine of the Pascal-P system, particularly the UCSD Pascal implementation, among whose developers, the p in p-code was construed to mean pseudo more often than portable, thus pseudo-code meaning instructions for a pseudo-machine.
Code generation (compiler)In computing, code generation is part of the process chain of a compiler and converts intermediate representation of source code into a form (e.g., machine code) that can be readily executed by the target system. Sophisticated compilers typically perform multiple passes over various intermediate forms. This multi-stage process is used because many algorithms for code optimization are easier to apply one at a time, or because the input to one optimization relies on the completed processing performed by another optimization.
BytecodeBytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects. The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters.
Low-level programming languageA low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture—commands or functions in the language map that are structurally similar to processor's instructions. Generally, this refers to either machine code or assembly language. Because of the low (hence the word) abstraction between the language and machine language, low-level languages are sometimes described as being "close to the hardware".
Control-flow graphIn computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was discovered by Frances E. Allen, who noted that Reese T. Prosser used boolean connectivity matrices for flow analysis before. The CFG is essential to many compiler optimizations and static-analysis tools. In a control-flow graph each node in the graph represents a basic block, i.e.
HaskellHaskell (ˈhæskəl) is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell has pioneered a number of programming language features such as type classes, which enable type-safe operator overloading, and monadic input/output (IO). It is named after logician Haskell Curry. Haskell's main implementation is the Glasgow Haskell Compiler (GHC).
Java bytecodeIn computing, Java bytecode is the bytecode-structured instruction set of the Java virtual machine (JVM), a virtual machine that enables a computer to run programs written in the Java programming language and several other programming languages, see List of JVM languages. A Java programmer does not need to be aware of or understand Java bytecode at all. However, as suggested in the IBM developerWorks journal, "Understanding bytecode and what bytecode is likely to be generated by a Java compiler helps the Java programmer in the same way that knowledge of assembly helps the C or C++ programmer.
Abstract syntax treeIn computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring in the text. The syntax is "abstract" in the sense that it does not represent every detail appearing in the real syntax, but rather just the structural or content-related details. For instance, grouping parentheses are implicit in the tree structure, so these do not have to be represented as separate nodes.
Data-flow analysisData-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. A program's control-flow graph (CFG) is used to determine those parts of a program to which a particular value assigned to a variable might propagate. The information gathered is often used by compilers when optimizing a program. A canonical example of a data-flow analysis is reaching definitions.
DecompilerA decompiler is a computer program that translates an executable file to high-level source code. It does therefore the opposite of a typical compiler, which translates a high-level language to a low-level language. While disassemblers translate an executable into assembly language, decompilers go a step further and translate the code into a higher level language such as C or Java, requiring more sophisticated techniques. Decompilers are usually unable to perfectly reconstruct the original source code, thus will frequently produce obfuscated code.
Interpreter (computing)In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution: Parse the source code and perform its behavior directly; Translate source code into some efficient intermediate representation or object code and immediately execute that; Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter Virtual Machine.
LLVMLLVM is a set of compiler and toolchain technologies that can be used to develop a frontend for any programming language and a backend for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. The name LLVM originally stood for Low Level Virtual Machine, though the project has expanded and the name is no longer officially an initialism.
Register allocationIn compiler optimization, register allocation is the process of assigning local automatic variables and expression results to a limited number of processor registers. Register allocation can happen over a basic block (local register allocation), over a whole function/procedure (global register allocation), or across function boundaries traversed via call-graph (interprocedural register allocation). When done per function/procedure the calling convention may require insertion of save/restore around each call-site.
System programming languageA system programming language is a programming language used for system programming; such languages are designed for writing system software, which usually requires different development approaches when compared with application software. Edsger Dijkstra refers to these languages as machine oriented high order languages, or mohol. General-purpose programming languages tend to focus on generic features to allow programs written in the language to use the same code on different platforms.