Huffman codingIn computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by David A. Huffman while he was a Sc.D. student at MIT, and published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". The output from Huffman's algorithm can be viewed as a variable-length code table for encoding a source symbol (such as a character in a file).
PNGPortable Network Graphics (PNG, officially pronounced pɪŋ , colloquially pronounced ˌpiːɛnˈdʒiː ) is a raster-graphics file that supports lossless data compression. PNG was developed as an improved, non-patented replacement for Graphics Interchange Format (GIF)—unofficially, the initials PNG stood for the recursive acronym "PNG's not GIF". PNG supports palette-based images (with palettes of 24-bit RGB or 32-bit RGBA colors), grayscale images (with or without an alpha channel for transparency), and full-color non-palette-based RGB or RGBA images.
MP3MP3 (formally MPEG-1 Audio Layer III or MPEG-2 Audio Layer III) is a coding format for digital audio developed largely by the Fraunhofer Society in Germany under the lead of Karlheinz Brandenburg, with support from other digital scientists in the United States and elsewhere. Originally defined as the third audio format of the MPEG-1 standard, it was retained and further extended — defining additional bit-rates and support for more audio channels — as the third audio format of the subsequent MPEG-2 standard.
Gzipgzip is a and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU (from where the "g" of gzip is derived). Version 0.1 was first publicly released on 31 October 1992, and version 1.0 followed in February 1993. The decompression of the gzip format can be implemented as a streaming algorithm, an important feature for Web protocols, data interchange and ETL (in standard pipes) applications.
Discrete cosine transformA discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and data compression. It is used in most digital media, including (such as JPEG and HEIF), digital video (such as MPEG and H.26x), digital audio (such as Dolby Digital, MP3 and AAC), digital television (such as SDTV, HDTV and VOD), digital radio (such as AAC+ and DAB+), and speech coding (such as AAC-LD, Siren and Opus).
Data compression ratioData compression ratio, also known as compression power, is a measurement of the relative reduction in size of data representation produced by a data compression algorithm. It is typically expressed as the division of uncompressed size by compressed size. Data compression ratio is defined as the ratio between the uncompressed size and compressed size: Thus, a representation that compresses a file's storage size from 10 MB to 2 MB has a compression ratio of 10/2 = 5, often notated as an explicit ratio, 5:1 (read "five" to "one"), or as an implicit ratio, 5/1.
Data compressionIn information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information.
JPEG 2000JPEG 2000 (JP2) is an standard and coding system. It was developed from 1997 to 2000 by a Joint Photographic Experts Group committee chaired by Touradj Ebrahimi (later the JPEG president), with the intention of superseding their original JPEG standard (created in 1992), which is based on a discrete cosine transform (DCT), with a newly designed, wavelet-based method. The standardized is .jp2 for ISO/IEC 15444-1 conforming files and .jpx for the extended part-2 specifications, published as ISO/IEC 15444-2.
Lossy compressionIn information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. The different versions of the photo of the cat on this page show how higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression (reversible data compression) which does not degrade the data.
File formatA file format is a standard way that information is encoded for storage in a . It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free. Some file formats are designed for very particular types of data: PNG files, for example, store bitmapped using lossless data compression. Other file formats, however, are designed for storage of several different types of data: the Ogg format can act as a container for different types of multimedia including any combination of audio and video, with or without text (such as subtitles), and metadata.
DeflateIn computing, Deflate (stylized as DEFLATE) is a lossless data compression that uses a combination of LZ77 and Huffman coding. It was designed by Phil Katz, for version 2 of his PKZIP archiving tool. Deflate was later specified in RFC 1951 (1996). Katz also designed the original algorithm used to construct Deflate streams. This algorithm was patented as , and assigned to PKWARE, Inc. As stated in the RFC document, an algorithm producing Deflate files was widely thought to be implementable in a manner not covered by patents.
JPEGJPEG (ˈdʒeɪpɛɡ , short for Joint Photographic Experts Group) is a commonly used method of lossy compression for s, particularly for those images produced by digital photography. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and . JPEG typically achieves 10:1 compression with little perceptible loss in image quality. Since its introduction in 1992, JPEG has been the most widely used standard in the world, and the most widely used digital , with several billion JPEG images produced every day as of 2015.
ZIP (file format)ZIP is an that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed. The ZIP file format permits a number of compression algorithms, though DEFLATE is the most common. This format was originally created in 1989 and was first implemented in PKWARE, Inc.'s PKZIP utility, as a replacement for the previous compression format by Thom Henderson. The ZIP format was then quickly supported by many software utilities other than PKZIP.
Bit rateIn telecommunications and computing, bit rate (bitrate or as a variable R) is the number of bits that are conveyed or processed per unit of time. The bit rate is expressed in the unit bit per second (symbol: bit/s), often in conjunction with an SI prefix such as kilo (1 kbit/s = 1,000 bit/s), mega (1 Mbit/s = 1,000 kbit/s), giga (1 Gbit/s = 1,000 Mbit/s) or tera (1 Tbit/s = 1,000 Gbit/s). The non-standard abbreviation bps is often used to replace the standard symbol bit/s, so that, for example, 1 Mbps is used to mean one million bits per second.
WebPWebP is a raster graphics developed by Google intended as a replacement for JPEG, PNG, and GIF file formats. It supports both lossy and lossless compression, as well as animation and alpha transparency. Google announced the WebP format in September 2010, and released the first stable version of its supporting library in April 2018. WebP was first announced by Google on 30 September in 2010 as a new open format for lossy compressed true-color graphics on the web, producing files that were smaller than JPEG files for comparable image quality.
7-Zip7-Zip is a free and open-source , a utility used to place groups of files within compressed containers known as "archives". It is developed by Igor Pavlov and was first released in 1999. 7-Zip has its own archive format called 7z, but can read and write several others. The program can be used from a Windows graphical user interface that also features shell integration, from a Windows command-line interface as the command 7za or 7za.exe, and from POSIX systems as p7zip. Most of the 7-Zip source code is under the LGPL-2.
Lempel–Ziv–WelchLempel–Ziv–Welch (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978. The algorithm is simple to implement and has the potential for very high throughput in hardware implementations. It is the algorithm of the Unix file compression utility compress and is used in the GIF image format.
Entropy (information theory)In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable , which takes values in the alphabet and is distributed according to : where denotes the sum over the variable's possible values. The choice of base for , the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys".
GIFThe Graphics Interchange Format (GIF; ɡɪf or dʒɪf , see pronunciation) is a bitmap that was developed by a team at the online services provider CompuServe led by American computer scientist Steve Wilhite and released on June 15, 1987. It is in widespread usage on the World Wide Web due to its wide support and portability between applications and operating systems. The format supports up to 8 bits per pixel for each image, allowing a single image to reference its own palette of up to 256 different colors chosen from the 24-bit RGB color space.
PDFPortable Document Format (PDF), standardized as ISO 32000, is a developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, and other information needed to display it. PDF has its roots in "The Camelot Project" initiated by Adobe co-founder John Warnock in 1991.