Publication

Uncorking the bottleneck of crowding: a fresh look at object recognition

Related concepts (16)

Deep learning

Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. The methods used can be supervised, semi-supervised, or unsupervised.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.
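As a minimal sketch of the "connected units" idea described above, the snippet below models one artificial neuron as a weighted sum of incoming signals followed by an activation function, and a layer as several such neurons sharing the same inputs. The sigmoid activation, the function names `neuron` and `layer`, and the weight values are illustrative assumptions, not something taken from the publication.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then a sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer: every neuron receives the same inputs over its own connections."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical weights: two inputs -> two hidden neurons -> one output neuron.
hidden = layer([1.0, 0.0], [[0.5, -0.5], [-1.0, 1.0]], [0.0, 0.0])
output = neuron(hidden, [1.0, 1.0], 0.0)
```

Each weight plays the role of a connection strength: the signal a neuron sends onward depends on how strongly its inputs are weighted.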

Handwriting recognition

Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning (optical character recognition) or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface, a generally easier task as there are more clues available.

Long short-term memory

A long short-term memory (LSTM) network is a recurrent neural network (RNN) designed to deal with the vanishing gradient problem present in traditional RNNs. Its relative insensitivity to gap length is its advantage over other RNNs, hidden Markov models, and other sequence learning methods. It aims to provide a short-term memory for RNNs that can last thousands of timesteps, hence "long short-term memory".

Transformer (machine learning model)

A transformer is a deep learning architecture that relies on the parallel multi-head attention mechanism. The modern transformer was proposed in the 2017 paper "Attention Is All You Need" by Ashish Vaswani et al. of the Google Brain team. It is notable for requiring less training time than previous recurrent neural architectures, such as long short-term memory (LSTM). Because it processes the input sequence in parallel, later variants have been widely adopted for training large language models on large (language) datasets, such as the Wikipedia corpus and Common Crawl.

Perception

Perception is the organization, identification, and interpretation of sensory information in order to represent and understand the presented information or environment. All perception involves signals that go through the nervous system, which in turn result from physical or chemical stimulation of the sensory system. Vision involves light striking the retina of the eye; smell is mediated by odor molecules; and hearing involves pressure waves.

Gestalt psychology

Gestalt psychology, gestaltism, or configurationism is a school of psychology that emerged in the early twentieth century in Austria and Germany as a theory of perception that rejected the basic principles of Wilhelm Wundt's and Edward Titchener's elementalist and structuralist psychology. As used in Gestalt psychology, the German word Gestalt (/ɡəˈʃtælt, -ˈʃtɑːlt, -ˈʃtɔːlt, -ˈstɑːlt, -ˈstɔːlt/; German: [ɡəˈʃtalt]; meaning "form") is interpreted as "pattern" or "configuration".

Theory of indispensable attributes

The theory of indispensable attributes (TIA) is a theory in the context of perceptual organisation which asks for the functional units and elementary features that are relevant for a perceptual system in the constitution of perceptual objects. Earlier versions of the theory emerged in the context of an application of research on vision to audition, and analogies between vision and audition were emphasised, whereas in more recent writings the necessity of a modality-general theory of perceptual organisation and objecthood is stressed.

Language model

A language model is a probabilistic model of a natural language that can generate probabilities of a series of words, based on the text corpora in one or multiple languages it was trained on. Large language models, as their most advanced form, are a combination of feedforward neural networks and transformers. They have superseded recurrent neural network-based models, which had previously superseded purely statistical models such as the word n-gram language model.
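To make "probabilities of a series of words" concrete, here is a minimal sketch of the simplest statistical case mentioned above, a word n-gram model with n = 2 (a bigram model). The toy corpus and the function names `train_bigram`, `bigram_prob`, and `sequence_prob` are assumptions for illustration; the sketch uses raw counts with no smoothing, so unseen word pairs get probability 0.

```python
from collections import Counter

def train_bigram(tokens):
    """Count adjacent word pairs and the contexts (first words) they start from."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return pair_counts, context_counts

def bigram_prob(model, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    pair_counts, context_counts = model
    if context_counts[prev] == 0:
        return 0.0
    return pair_counts[(prev, word)] / context_counts[prev]

def sequence_prob(model, tokens):
    """Probability of the words after the first one, given the first word."""
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        p *= bigram_prob(model, prev, word)
    return p

# Toy corpus, purely illustrative.
model = train_bigram("the cat sat on the mat".split())
p_cat = bigram_prob(model, "the", "cat")             # "the" is followed by "cat" in 1 of 2 cases
p_seq = sequence_prob(model, ["the", "cat", "sat"])  # P(cat | the) * P(sat | cat)
```

Neural language models replace the count table with a learned function, but the quantity being estimated, the probability of the next word given its context, is the same.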

Activity recognition

Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications and its connection to many different fields of study such as medicine, human-computer interaction, or sociology.

Object detection

Object detection is a computer technology related to computer vision that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including video surveillance. It is widely used in computer vision tasks such as vehicle counting, activity recognition, face detection, face recognition, and video object co-segmentation.

Body modification

Body modification (or body alteration) is the deliberate altering of the human anatomy or human physical appearance. In its broadest definition it includes skin tattooing, socially acceptable decoration (e.g., common ear piercing in many societies), and religious rites of passage (e.g., circumcision in a number of cultures), as well as the modern primitive movement.

Abundance of the chemical elements

The abundance of the chemical elements is a measure of the occurrence of the chemical elements relative to all other elements in a given environment. Abundance is measured in one of three ways: by mass fraction (in commercial contexts often called weight fraction), by mole fraction (fraction of atoms by numerical count, or sometimes fraction of molecules in gases), or by volume fraction. Volume fraction is a common abundance measure in mixed gases such as planetary atmospheres, and is similar in value to molecular mole fraction for gas mixtures at relatively low densities and pressures, and ideal gas mixtures.
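The difference between mass fraction and mole fraction can be shown with a short conversion. The sketch below is an illustration under stated assumptions: the function name `mole_fractions` is hypothetical, and water (H2O) is used as the example, with standard atomic masses of about 1.008 for hydrogen and 15.999 for oxygen.

```python
def mole_fractions(mass_fractions, molar_masses):
    """Convert per-element mass fractions to mole (atom-count) fractions."""
    # Moles per unit mass of mixture: mass fraction divided by molar mass.
    moles = {el: mass_fractions[el] / molar_masses[el] for el in mass_fractions}
    total = sum(moles.values())
    return {el: n / total for el, n in moles.items()}

# Water, H2O: two hydrogen atoms and one oxygen atom per molecule.
molar_masses = {"H": 1.008, "O": 15.999}
water_mass = 2 * molar_masses["H"] + molar_masses["O"]
by_mass = {
    "H": 2 * molar_masses["H"] / water_mass,  # roughly 11% of the mass
    "O": molar_masses["O"] / water_mass,      # roughly 89% of the mass
}
by_count = mole_fractions(by_mass, molar_masses)  # H is 2 of every 3 atoms
```

Hydrogen makes up about 11% of water by mass but two thirds of its atoms, which is why the chosen measure has to be stated alongside any abundance figure.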

Pattern recognition

Pattern recognition is the automated recognition of patterns and regularities in data. While similar, pattern recognition (PR) is not to be confused with pattern machines (PM), which may possess PR capabilities but whose primary function is to distinguish and create emergent patterns. PR has applications in statistical data analysis, signal processing, information retrieval, bioinformatics, data compression, computer graphics, and machine learning.

Recognition memory

Recognition memory, a subcategory of declarative memory, is the ability to recognize previously encountered events, objects, or people. When the previously experienced event is reexperienced, this environmental content is matched to stored memory representations, eliciting matching signals. As first established by psychology experiments in the 1970s, recognition memory for pictures is quite remarkable: humans can remember thousands of images at high accuracy after seeing each only once and only for a few seconds.

Body art

Body art is art made on, with, or consisting of the human body. Body art covers a wide spectrum, including tattoos, body piercings, scarification, and body painting. It may include performance art, and it is likewise used to investigate the body in a variety of media, including painting, casting, photography, film, and video. More extreme body art can involve mutilation or pushing the body to its physical limits.