Remote sensing visual question answering (RSVQA) opens new avenues to promote the use of satellite data, by interfacing satellite image analysis with natural language processing. Capitalizing on the remarkable advances in natural language processing and c ...
We present the HIPE-2022 shared task on named entity processing in multilingual historical documents. Following the success of the first CLEF-HIPE-2020 evaluation lab, this edition confronts systems with the challenges of dealing with more languages, learn ...
Understanding how high-quality newspapers present and discuss major news plays a role towards tackling disinformation, as it contributes to the characterization of the full ecosystem in which information circulates. In this paper, we present an analysis of ...
Conversational interfaces have recently become a ubiquitous element in both the personal sphere, by easing access to services, and industrial environments, through the automation of services, improved customer support, and the corresponding cost savings. However, ...
Visual Question Answering is a new task that can facilitate the extraction of information from images through textual queries: it aims at answering an open-ended question formulated in natural language about a given image. In this work, we introduce a new ...
With the current exponential growth of video-based social networks, video retrieval using natural language is receiving ever-increasing attention. Most existing approaches tackle this task by extracting individual frame-level spatial features to represent ...
Under the umbrella of smart toys, a myriad of interactive systems have addressed a variety of scenarios considering entertainment, education, sustainability, social and environmental learning through play. Tangibles and small toy robots prevail; but intera ...
Voice communication is the main channel to exchange information between pilots and Air-Traffic Controllers (ATCos). Recently, several projects have explored the employment of speech recognition technology to automatically extract spoken key information suc ...
Telerobotics is the process by which human operators control the movement of robots to achieve specific tasks. However, conventional control interfaces, such as joysticks and remote controllers, are not intuitive to use for novice users. Training to use th ...
We discuss some properties of generative models for word embeddings. Namely, Arora et al. (2016) proposed a latent discourse model implying the concentration of the partition function of the word vectors. This concentration phenomenon led to an asymptotic ...
Open-domain chatbots engage in natural conversations with the user to socialize and establish bonds. However, designing and developing an effective open-domain chatbot is challenging. It is unclear what qualities of such chatbots most correspond to users' ...