Delves into Deep Learning for Natural Language Processing, exploring Neural Word Embeddings, Recurrent Neural Networks, and Attentive Neural Modeling with Transformers.
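A minimal sketch of the first two building blocks (learned word embeddings feeding a recurrent layer), assuming PyTorch; the `RNNLanguageEncoder` name and the toy vocabulary/dimension sizes are hypothetical, not from the source:

```python
import torch
import torch.nn as nn

# Toy sizes chosen for illustration only.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 10_000, 128, 256

class RNNLanguageEncoder(nn.Module):
    """Maps token ids to learned word vectors, then runs an LSTM over them."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)  # neural word embeddings
        self.rnn = nn.LSTM(EMBED_DIM, HIDDEN_DIM, batch_first=True)

    def forward(self, token_ids):                 # token_ids: (batch, seq_len)
        vectors = self.embed(token_ids)           # (batch, seq_len, EMBED_DIM)
        outputs, (h_n, _) = self.rnn(vectors)     # per-token hidden states
        return outputs, h_n                       # states + final summary state

# Usage: encode a batch of two 5-token "sentences" of random ids.
ids = torch.randint(0, VOCAB_SIZE, (2, 5))
outputs, final_state = RNNLanguageEncoder()(ids)
print(outputs.shape)  # torch.Size([2, 5, 256])
```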
Explores the mathematics of language models, covering architecture design, pre-training, and fine-tuning, and emphasizing how the pre-train/fine-tune workflow adapts a single model to many downstream tasks.
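A hedged sketch of that two-stage workflow in PyTorch, with a small recurrent encoder standing in for a real language model; all sizes, names, and the random stand-in data are assumptions for illustration:

```python
import torch
import torch.nn as nn

VOCAB, DIM, NUM_CLASSES = 10_000, 128, 2          # hypothetical toy sizes

class Encoder(nn.Module):
    """Shared backbone reused across both training stages."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)

    def forward(self, ids):
        h, _ = self.rnn(self.embed(ids))
        return h                                   # (batch, seq, DIM)

encoder = Encoder()
ids = torch.randint(0, VOCAB, (4, 12))            # stand-in for real text

# Stage 1: pre-training with a next-token (language-model) objective.
lm_head = nn.Linear(DIM, VOCAB)
logits = lm_head(encoder(ids[:, :-1]))            # predict token t+1 from prefix
lm_loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB), ids[:, 1:].reshape(-1))

# Stage 2: fine-tuning on a downstream task with a new, small head.
clf_head = nn.Linear(DIM, NUM_CLASSES)
labels = torch.randint(0, NUM_CLASSES, (4,))
features = encoder(ids).mean(dim=1)               # pool per-token states
task_loss = nn.functional.cross_entropy(clf_head(features), labels)
```

The design point the sketch tries to show: the encoder's parameters carry over from pre-training to fine-tuning, while only the lightweight task head is trained from scratch for each downstream task.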
Examines the Transformer model, tracing the shift from recurrent models to attention-based NLP, and highlights its key components and strong results in machine translation and document generation.
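One of those key components, scaled dot-product attention, reduces to a few lines; a minimal PyTorch sketch with hypothetical toy tensor sizes:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d_k)) V — the core operation of the Transformer."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # pairwise similarities
    weights = torch.softmax(scores, dim=-1)            # attention distribution
    return weights @ v                                 # weighted mix of values

# Usage: 2 sequences, 5 tokens, 64-dim vectors; q = k = v is self-attention.
q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 5, 64])
```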