Mojibake: Mojibake (文字化け; IPA: [mod͡ʑibake], "character transformation") is the garbled text that results from decoding text with an unintended character encoding. The result is a systematic replacement of symbols with completely unrelated ones, often from a different writing system. The display may include the generic replacement character ("�") wherever the binary representation is invalid in the assumed encoding. A replacement can also span multiple consecutive symbols, as viewed in one encoding, when the same binary code constitutes a single symbol in the other encoding.
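The effects described above can be reproduced in a few lines. The following is a minimal sketch, assuming UTF-8 and ISO/IEC 8859-1 as the two encodings involved; the sample word and all variable names are illustrative only.

```python
# Illustrative sketch: the sample text and variable names are hypothetical.
original = "café"

# Encode as UTF-8, then decode with the wrong encoding (ISO/IEC 8859-1).
# The single symbol "é" (two bytes in UTF-8) becomes two unrelated symbols.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")

# The opposite mistake: ISO/IEC 8859-1 bytes decoded as UTF-8.
# The byte 0xE9 is not a valid UTF-8 sequence on its own, so with
# errors="replace" it is shown as the replacement character U+FFFD.
latin1_bytes = original.encode("latin-1")
with_replacement = latin1_bytes.decode("utf-8", errors="replace")

# When no bytes were lost, reversing the wrong decode recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)           # cafÃ©
print(with_replacement)  # caf�
print(repaired)          # café
```

The first case is recoverable because every byte survived the wrong decode; the second is not, since the replacement character discards the original byte.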
Windows code page: Windows code pages are sets of characters or code pages (known as character encodings in other operating systems) used in Microsoft Windows from the 1980s and 1990s. Windows code pages were gradually superseded when Unicode was implemented in Windows, although they are still supported both within Windows and on other platforms, and still apply when Alt code shortcuts are used. There are two groups of system code pages in Windows systems: OEM and Windows-native ("ANSI") code pages. (ANSI is the American National Standards Institute.)
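The difference between the two groups shows up as soon as the same byte is read under one code page from each. The sketch below picks code page 437 as an OEM example and Windows-1252 as an "ANSI" example; that choice is an assumption made for illustration, not something the entry specifies.

```python
# Illustrative sketch: cp437 (OEM) and cp1252 ("ANSI") are chosen as examples.
raw = b"\xe9"

as_ansi = raw.decode("cp1252")  # 'é' under Windows-1252
as_oem = raw.decode("cp437")    # a different character under code page 437

# The same letter is stored as a different byte in each code page.
ansi_bytes = "é".encode("cp1252")  # b'\xe9'
oem_bytes = "é".encode("cp437")    # b'\x82'

print(as_ansi, as_oem, ansi_bytes, oem_bytes)
```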
Vertical bar: The vertical bar, |, is a glyph with various uses in mathematics, computing, and typography. It has many names, often related to particular meanings: Sheffer stroke (in logic), pipe, bar, or (literally the word "or"), vbar, and others. The vertical bar is used as a mathematical symbol in numerous ways: absolute value: |x|, read "the absolute value of x"; cardinality: |S|, read "the cardinality of the set S" or "the length of a string S"; conditional probability: P(X|Y), read "the probability of X given Y"; determinant: |A|, read "the determinant of the matrix A".
At sign: The at sign, @, is an accounting and invoice abbreviation meaning "at a rate of" (e.g. 7 widgets @ £2 per widget = £14), now seen more widely in email addresses and social media platform handles. It is normally read aloud as "at" and is also commonly called the at symbol, commercial at, or address sign. The absence of a single English word for the symbol has prompted some writers to use the French arobase or Spanish and Portuguese arroba, or to coin new words such as ampersat and asperand, or the (visual) onomatopoeia strudel, but none of these have achieved wide use.
Pound sign: The pound sign (£) is the symbol for the pound unit of sterling – the currency of the United Kingdom and previously of Great Britain and of the Kingdom of England. The same symbol is used for other currencies called pound, such as the Gibraltar, Egyptian, Manx and Syrian pounds. The sign may be drawn with one or two bars depending on personal preference, but the Bank of England has used the one-bar style exclusively on banknotes since 1975. In the United States, "pound sign" refers to the symbol # (number sign).
Number sign: The symbol # is known variously in English-speaking regions as the number sign, hash, or pound sign. The symbol has historically been used for a wide range of purposes, including the designation of an ordinal number and as a ligatured abbreviation for pounds avoirdupois, having been derived from the now-rare ℔. Since 2007, widespread usage of the symbol to introduce metadata tags on social media platforms has led to such tags being known as "hashtags", and from that, the symbol itself is sometimes called a hashtag.
Tilde: The tilde (˜ or ~) is a grapheme with several uses. The name of the character came into English from Spanish, which in turn came from the Latin titulus, meaning "title" or "superscription". Its primary use is as a diacritic (accent) in combination with a base letter; but for historical reasons, it is also used in standalone form within a variety of contexts. The tilde was originally written over an omitted letter or several letters as a scribal abbreviation, or "mark of suspension" and "mark of contraction", shown as a straight line when used with capitals.
UTF-8: UTF-8 is a variable-length character encoding standard used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode (or Universal Coded Character Set) Transformation Format – 8-bit. UTF-8 is capable of encoding all 1,112,064 valid character code points in Unicode using one to four one-byte (8-bit) code units. Code points with lower numerical values, which tend to occur more frequently, are encoded using fewer bytes.
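The variable length is easy to observe directly. The sketch below encodes four sample characters (chosen here for illustration) and also recomputes the 1,112,064 figure as the full code point range minus the surrogate range, which is how that count is conventionally derived.

```python
# Illustrative sketch: the sample characters are arbitrary choices.
samples = ["A", "é", "€", "😀"]  # U+0041, U+00E9, U+20AC, U+1F600

# Number of one-byte code units each character needs in UTF-8.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# Unicode spans U+0000..U+10FFFF; the surrogates U+D800..U+DFFF are
# not valid characters and cannot be encoded in UTF-8.
total_code_points = 0x110000
surrogates = 0xDFFF - 0xD800 + 1
valid_code_points = total_code_points - surrogates

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
print(valid_code_points)  # 1112064
```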
ISO/IEC 8859-1: ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first published in 1987. ISO/IEC 8859-1 encodes what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin script. This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa.
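The figure of 191 characters can be checked by counting the byte values that the standard assigns to graphic characters, namely 0x20 to 0x7E and 0xA0 to 0xFF. The sketch below does that count and decodes those bytes; note that Python's `latin-1` codec is more permissive than the standard, since it also maps the remaining control-range bytes to code points of the same number.

```python
# Illustrative sketch: variable names are hypothetical.
# Byte values assigned to graphic characters in ISO/IEC 8859-1.
graphic_bytes = [b for b in range(256)
                 if 0x20 <= b <= 0x7E or 0xA0 <= b <= 0xFF]

# Each byte decodes to the Unicode code point with the same number.
text = bytes(graphic_bytes).decode("latin-1")
same_numbers = all(ord(ch) == b for ch, b in zip(text, graphic_bytes))

print(len(graphic_bytes))  # 191
print(same_numbers)        # True
```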
American National Standards Institute: The American National Standards Institute (ANSI, /ˈænsi/) is a private nonprofit organization that oversees the development of voluntary consensus standards for products, services, processes, systems, and personnel in the United States. The organization also coordinates U.S. standards with international standards so that American products can be used worldwide. ANSI accredits standards that are developed by representatives of other standards organizations, government agencies, consumer groups, companies, and others.
Control character: In computing and telecommunication, a control character or non-printing character (NPC) is a code point in a character set that does not represent a written character or symbol. Control characters are used as in-band signaling to cause effects other than the addition of a symbol to the text. All other characters are mainly graphic characters, also known as printing characters (or printable characters), except perhaps for "space" characters. In the ASCII standard there are 33 control characters, such as code 7, BEL, which rings a terminal bell.
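The count of 33 follows from the layout of ASCII: codes 0 to 31 plus code 127 (DEL). A minimal sketch, with illustrative variable names, that enumerates them and cross-checks against Unicode's "Cc" (control) general category:

```python
import unicodedata

# Illustrative sketch: variable names are hypothetical.
# ASCII control characters are codes 0..31 and 127 (DEL).
control_codes = [c for c in range(128) if c < 0x20 or c == 0x7F]

BEL = 7  # the bell character, written "\a" in Python string literals

# Unicode classifies exactly these ASCII codes as "Cc" (control).
all_cc = all(unicodedata.category(chr(c)) == "Cc" for c in control_codes)

print(len(control_codes))  # 33
print(chr(BEL) == "\a")    # True
print(all_cc)              # True
```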
Quotation mark: Quotation marks are punctuation marks used in pairs in various writing systems to set off direct speech, a quotation, or a phrase. The pair consists of an opening quotation mark and a closing quotation mark, which may or may not be the same character. Quotation marks have a variety of forms in different languages and in different media. The single quotation mark is traced to Ancient Greek practice, adopted and adapted by monastic copyists. Isidore of Seville, in his seventh-century encyclopedia, Etymologiae, described their use of the Greek diplé (a chevron, ⟩).
ASCII: ASCII (/ˈæskiː/), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 code points, of which only 95 are printable characters, which severely limited its scope. Many computer systems instead use Unicode, which has over a million code points, but the first 128 of these are the same as the ASCII set.
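Both numbers, and the overlap with Unicode, can be checked in a few lines. This sketch decodes all 128 ASCII byte values and counts the printable ones using Python's `str.isprintable`, which treats the ASCII space as printable; variable names are illustrative.

```python
# Illustrative sketch: variable names are hypothetical.
ascii_bytes = bytes(range(128))

# The same bytes decode identically as ASCII and as UTF-8.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# Each character's Unicode code point equals its ASCII code.
code_points = [ord(ch) for ch in as_ascii]

# Printable characters: space through tilde (0x20..0x7E).
printable = [ch for ch in as_ascii if ch.isprintable()]

print(as_ascii == as_utf8)              # True
print(code_points == list(range(128)))  # True
print(len(printable))                   # 95
```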
Bulletin board system: A bulletin board system (BBS), also called a computer bulletin board service (CBBS), is a computer server running software that allows users to connect to the system using a terminal program. Once logged in, the user can perform functions such as uploading and downloading software and data, reading news and bulletins, and exchanging messages with other users through public message boards and sometimes via direct chatting. In the early 1980s, message networks such as FidoNet were developed to provide services such as NetMail, which is similar to internet-based email.
Unicode: Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines, as of version 15.0, 149,186 characters covering 161 modern and historic scripts, as well as symbols, thousands of emoji (including coloured ones), and non-visual control and formatting codes.