Publication

Current trends in multilingual speech processing

Related concepts (29)

Speech synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database.

Speech recognition

Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis.

Speech translation

Speech translation is the process by which conversational spoken phrases are instantly translated and spoken aloud in a second language. This differs from phrase translation, which is where the system only translates a fixed and finite set of phrases that have been manually entered into the system. Speech translation technology enables speakers of different languages to communicate. It thus is of tremendous value for humankind in terms of science, cross-cultural exchange and global business.

Multilingualism

Multilingualism is the use of more than one language, either by an individual speaker or by a group of speakers. It is believed that multilingual speakers outnumber monolingual speakers in the world's population. More than half of all Europeans claim to speak at least one language other than their mother tongue; but many read and write in one language. Multilingualism is advantageous for people wanting to participate in trade, globalization and cultural openness.

Text corpus

In linguistics and natural language processing, a corpus (plural: corpora) or text corpus is a dataset consisting of natively digital and older, digitalized language resources, either annotated or unannotated. Annotated corpora have been used in corpus linguistics for statistical hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory. In search technology, a corpus is the collection of documents which is being searched.

Machine translation

Machine translation is the use of either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches to the translation of text or speech from one language to another, including the contextual, idiomatic and pragmatic nuances of both languages. The origins of machine translation can be traced back to the work of Al-Kindi, a ninth-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation.

Speech processing

Speech processing is the study of speech signals and the methods used to process them. The signals are usually processed in a digital representation, so speech processing can be regarded as a special case of digital signal processing, applied to speech signals. Aspects of speech processing include the acquisition, manipulation, storage, transfer and output of speech signals. Different speech processing tasks include speech recognition, speech synthesis, speaker diarization, speech enhancement, speaker recognition, etc.

Translation

Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text. The English language draws a terminological distinction (which does not exist in every language) between translating (a written text) and interpreting (oral or signed communication between users of different languages); under this distinction, translation can begin only after the appearance of writing within a language community.

Lexicon

A lexicon (plural: lexicons, rarely lexica) is the vocabulary of a language or branch of knowledge (such as nautical or medical). In linguistics, a lexicon is a language's inventory of lexemes. The word lexicon derives from the Greek word λεξικόν (lexikon), neuter of λεξικός (lexikos) meaning 'of or for words'. Linguistic theories generally regard human languages as consisting of two parts: a lexicon, essentially a catalogue of a language's words (its wordstock); and a grammar, a system of rules which allow for the combination of those words into meaningful sentences.

Google Translate

Google Translate is a multilingual neural machine translation service developed by Google to translate text, documents and websites from one language into another. It offers a website interface, a mobile app for Android and iOS, as well as an API that helps developers build browser extensions and software applications. As of 2022, Google Translate supports languages at various levels; it claimed over 500 million total users, with more than 100 billion words translated daily, after the company stated in May 2013 that it served over 200 million people daily.

Natural language processing

Natural language processing (NLP) is an interdisciplinary subfield of linguistics and computer science. It is primarily concerned with processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic (i.e. statistical and, most recently, neural network-based) machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them.

Speech community

A speech community is a group of people who share a set of linguistic norms and expectations regarding the use of language. It is a concept mostly associated with sociolinguistics and anthropological linguistics. Exactly how to define speech community is debated in the literature. Definitions of speech community tend to involve varying degrees of emphasis on two criteria: shared community membership and shared linguistic communication. A typical speech community can be a small town, but sociolinguists such as William Labov claim that a large metropolitan area, for example New York City, can also be considered one single speech community.

Speaker recognition

Speaker recognition is the identification of a person from characteristics of voices. It is used to answer the question "Who is speaking?" The term voice recognition can refer to speaker recognition or speech recognition. Speaker verification (also called speaker authentication) contrasts with identification, and speaker recognition differs from speaker diarisation (recognizing when the same speaker is speaking).

Language convergence

Language convergence is a type of linguistic change in which languages come to resemble one another structurally as a result of prolonged language contact and mutual interference, regardless of whether those languages belong to the same language family, i.e. stem from a common genealogical proto-language. In contrast to other contact-induced language changes like creolization or the formation of mixed languages, convergence refers to a mutual process that results in changes in all the languages involved.

Translanguaging

Translanguaging is a term that can refer to different aspects of multilingualism. It can describe the way bilinguals and multilinguals use their linguistic resources to make sense of and interact with the world around them. It can also refer to a pedagogical approach that utilizes more than one language within a classroom lesson. The term "translanguaging" was coined in the 1980s by Cen Williams (applied in Welsh as trawsieithu) in his unpublished thesis titled "An Evaluation of Teaching and Learning Methods in the Context of Bilingual Secondary Education".

Speech

Speech is human vocal communication using language. Each language uses phonetic combinations of vowel and consonant sounds that form the sound of its words (that is, all English words sound different from all French words, even if they are the same word, e.g., "role" or "hotel"), and uses those words in their semantic character as words in the lexicon of a language according to the syntactic constraints that govern lexical words' function in a sentence. In speaking, speakers perform many different intentional speech acts.

Grammatical gender

In linguistics, a grammatical gender system is a specific form of a noun class system, where nouns are assigned to gender categories that are often not related to the real-world qualities of the entities denoted by those nouns. In languages with grammatical gender, most or all nouns inherently carry one value of the grammatical category called gender; the values present in a given language (of which there are usually two or three) are called the genders of that language.

Pattern recognition

Pattern recognition is the automated recognition of patterns and regularities in data. While similar, pattern recognition (PR) is not to be confused with pattern machines (PM), which may possess PR capabilities but whose primary function is to distinguish and create emergent patterns. PR has applications in statistical data analysis, signal processing, information retrieval, bioinformatics, data compression, computer graphics and machine learning.

History of natural language processing

The history of natural language processing describes the advances of natural language processing. There is some overlap with the history of machine translation, the history of speech recognition, and the history of artificial intelligence. The history of machine translation dates back to the seventeenth century, when philosophers such as Leibniz and Descartes put forward proposals for codes which would relate words between languages.

Sumerian language

Sumerian (Cuneiform: "native tongue") is the language of ancient Sumer. It is one of the oldest attested languages, dating back to at least 2900 BC. It is accepted to be a local language isolate and to have been spoken in ancient Mesopotamia, in the area that is modern-day Iraq. Akkadian, a Semitic language, gradually replaced Sumerian as a spoken language in the area around 2000 BC (the exact date is debated), but Sumerian continued to be used as a sacred, ceremonial, literary and scientific language in Akkadian-speaking Mesopotamian states such as Assyria and Babylonia until the 1st century AD.