Natural language processingNatural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.
Merriam-WebsterMerriam-Webster, Incorporated is an American company that publishes reference books and is especially known for its dictionaries. It is the oldest dictionary publisher in the United States. In 1831, George and Charles Merriam founded the company as G & C Merriam Co. in Springfield, Massachusetts. In 1843, after Noah Webster died, the company bought the rights to An American Dictionary of the English Language from Webster's estate. All Merriam-Webster dictionaries trace their lineage to this source.
Part of speechIn grammar, a part of speech or part-of-speech (abbreviated as POS or PoS, also known as word class or grammatical category) is a category of words (or, more generally, of lexical items) that have similar grammatical properties. Words that are assigned to the same part of speech generally display similar syntactic behavior (they play similar roles within the grammatical structure of sentences), sometimes similar morphological behavior in that they undergo inflection for similar properties and even similar semantic behavior.
International Phonetic AlphabetThe International Phonetic Alphabet (IPA) is an alphabetic system of phonetic notation based primarily on the Latin script. It was devised by the International Phonetic Association in the late 19th century as a standardized representation of speech sounds in written form. The IPA is used by lexicographers, foreign language students and teachers, linguists, speech–language pathologists, singers, actors, constructed language creators, and translators.
Chinese charactersChinese characters are logograms developed for the writing of Chinese. Chinese characters are the oldest continuously used system of writing in the world. By virtue of their widespread current use throughout East Asia and Southeast Asia, as well as their profound historic use throughout the Sinosphere, Chinese characters are among the most widely adopted writing systems in the world by number of users. The total number of Chinese characters ever to appear in a dictionary is in the tens of thousands, though most are graphic variants, were used historically and passed out of use, or are of a specialized nature.
French languageFrench (Standard French: Français fʁɑ̃sɛ or langue française lɑ̃ɡ fʁɑ̃sɛz) is a Romance language of the Indo-European family. It descended from the Vulgar Latin of the Roman Empire, as did all Romance languages. French evolved from Gallo-Romance, the Latin spoken in Gaul, and more specifically in Northern Gaul. Its closest relatives are the other langues d'oïl—languages historically spoken in northern France and in southern Belgium, which French (Francien) largely supplanted.
Lexical Markup FrameworkLanguage resource management Lexical markup framework (LMF; ISO 24613:2008), is the International Organization for Standardization ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. The scope is standardization of principles and methods relating to language resources in the contexts of multilingual communication. The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of large number of individual electronic resources to form extensive global electronic resources.
LexicologyLexicology is the branch of linguistics that analyzes the lexicon of a specific language. A word is the smallest meaningful unit of a language that can stand on its own, and is made up of small components called morphemes and even smaller elements known as phonemes, or distinguishing sounds. Lexicology examines every feature of a word – including formation, spelling, origin, usage, and definition. Lexicology also considers the relationships that exist between words.
Bilingual dictionaryA bilingual dictionary or translation dictionary is a specialized dictionary used to translate words or phrases from one language to another. Bilingual dictionaries can be unidirectional, meaning that they list the meanings of words of one language in another, or can be bidirectional, allowing translation to and from both languages. Bidirectional bilingual dictionaries usually consist of two sections, each listing words and phrases of one language alphabetically along with their translation.
Usage (language)The usage of a language is the ways in which its written and spoken variations are routinely employed by its speakers; that is, it refers to "the collective habits of a language's native speakers", as opposed to idealized models of how a language works or (should work) in the abstract. For instance, Fowler characterized usage as "the way in which a word or phrase is normally and correctly used" and as the "points of grammar, syntax, style, and the choice of words.
Corpus linguisticsCorpus linguistics is the study of a language as that language is expressed in its text corpus (plural corpora), its body of "real world" text. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field—the natural context ("realia") of that language—with minimal experimental interference. The text-corpus method uses the body of texts written in any natural language to derive the set of abstract rules which govern that language.