CS-422: Database systems
Lectures in this course (104)
Views: SQL Queries Simplification
Explores how views simplify query writing and are key in data warehouses.
Resource Management in Spark
Explores resource management, fault tolerance, job recovery, and Spark SQL in Spark.
Data Wrangling: Structuring and Wrangling Issues
Covers data wrangling stages, structuring techniques, and common issues in data preparation.
Optimizing Join Operations: Challenges and Solutions
Explores the optimization of join operations in distributed systems, addressing data skew and introducing the 1-Bucket-Theta algorithm.
Data Accuracy: Assessing Faithfulness and Error Detection
Explores data accuracy through faithfulness assessment, error detection, outlier handling, correlations, functional dependencies, violation detection, denial constraints, and data repairing techniques.
Data Stream Processing: Management and Challenges
Explores data stream management, real-time applications, challenges in analysis, and efficient stream management strategies.
Temporality and Entity Resolution
Explores challenges in data temporality and techniques for entity resolution.
Intro to Distributed Frameworks
Covers the challenges of handling large data volumes and the characteristics of big data.
Spark Streaming: Fault Tolerance and DStreams
Explores fault tolerance and DStreams in Spark Streaming for real-time big data analysis.
Approximate Query Processing: BlinkDB
Introduces BlinkDB, a framework for approximate query processing using sampling techniques.
Distributed Query Processing: Execution Models and Declustering Tradeoffs
Covers analytical query processing, declustering strategies, and distributed operations.
Privacy-preserving Data Management: Operations and Protocols
Explores privacy-preserving data management operations and summarization techniques for sensitive data protection.
MapReduce: Execution Models for Distributed Computing
Introduces the MapReduce programming model for distributed computing, focusing on its vision and under-the-hood mechanisms.
Privacy-Preserving Data Mining
Explores privacy-preserving data mining techniques, including k-anonymity, attacks, and differential privacy.
Dataflow: Execution Models for Distributed Computing
Explores the dataflow model for distributed computing using RDDs in Spark.
Large-scale SQL Processing
Introduces data frames as a space-efficient data representation with an extensible SQL-like language.
Data Summarization: Minhashing and Locality-Sensitive Hashing
Explores Jaccard similarity, minhashing, and locality-sensitive hashing for data summarization.
Scheduling: Under the Hood
Explores the complexities of scheduling in distributed computing frameworks, emphasizing data locality optimization and multitenancy strategies.
Transactions and ACID: Overview
Explores transactions, ACID properties, concurrency challenges, and scheduling in database management systems.
Concurrency Control: Lock-Based Protocols
Covers lock-based concurrency control protocols and various transaction models.