Lectures related to Data lake | EPFL Graph Search

Covers data warehouses, data lakes, OLTP vs. OLAP, data quality, and the Data Lakehouse concept.

Data Warehouses and Decision Support Systems

Explores data warehouses, decision support systems, OLAP, data lakes, multidimensional data models, and query optimizations.

Data Warehousing: Overview and Challenges

Introduces data warehousing fundamentals, challenges, and the innovative concept of a 'lakehouse'.

Data Lakes: Structure and Optimization

Explores data lakes, data structure, and optimization for effective querying.

Big Data: Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, typical architecture, challenges, and technologies used to address them.

Data Ingestion Layer: SmartDataLake

Explores the data ingestion process in SmartDataLake, including datasets and the RAW platform.

Big Data Ecosystems: Technologies and Challenges

Covers the fundamentals of big data ecosystems, focusing on technologies, challenges, and practical exercises with Hadoop's HDFS.

Big Data Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.

Data Wrangling with Hive: Managing Big Data Efficiently

Covers data wrangling techniques using Apache Hive for efficient big data management.

Distributed Computing: Challenges and Solutions

Explores challenges in distributed computing, data growth, and data types, emphasizing the battle against the three Vs of big data (volume, velocity, and variety).

Data Virtualization: SmartDataLake

Explores data virtualization in the SmartDataLake project, covering query optimization, storage tiering, and challenges in processing heterogeneous data.

Data Warehousing and Decision Support

Explores data warehousing, decision support systems, and the importance of statistics in data analysis.

Data Wrangling: ETL Process and Wrangling Issues

Explores the ETL process, data wrangling stages, and common issues.

Introduction to Data Stream Processing: Concepts and Applications

Covers the principles of data stream processing and its applications in real-time data analysis.

General Introduction to Big Data

Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.

Introduction to Data Stream Processing: Concepts and Applications

Covers data stream processing concepts, focusing on Apache Kafka and Spark Streaming integration, event time management, and project implementation guidelines.

Introduction to Data Stream Processing

Covers the fundamentals of data stream processing, including tools like Apache Storm and Kafka, key concepts like event time and window operations, and the challenges of stream processing.

Data Warehouses: Introduction and Challenges

Introduces data warehouses and their challenges, including integrating data, managing metadata, and optimizing query performance.

Mobilities through Big Data

Discusses the influence of Big Data on mobility planning and optimization, exploring its promises and limitations.

Calculating Average Growth Rates

Explores the calculation of average growth rates and how sensitive they are to the choice of time period.