Lectures related to Data formats and data wrangling with Hadoop

Data Wrangling with Hive: Managing Big Data Efficiently

Covers data wrangling techniques using Apache Hive for efficient big data management.

Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.

Data Wrangling with Hadoop: Storage Formats and Hive

Explores data wrangling with Hadoop, emphasizing storage formats and Hive for big data processing.

General Introduction to Big Data

Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.

Big Data Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.

Data Modeling: Concepts and Applications

Explores data modeling concepts, SQL implementations, and practical applications in handling missing data.

Data Wrangling with Hadoop: Advanced Techniques

Covers advanced data wrangling techniques using Hadoop, focusing on Hive and HBase integration.

Big Data Ecosystems: Technologies and Challenges

Covers the fundamentals of big data ecosystems, focusing on technologies, challenges, and practical exercises with Hadoop's HDFS.

Real-time Intelligence: Data Challenges and Hardware Evolution

Explores data challenges and hardware evolution for real-time intelligence in the era of big data.

Data Wrangling Techniques: HBase and Hive Integration

Covers data wrangling techniques using HBase and Hive, focusing on integration and practical applications.

Data Warehousing: Overview and Challenges

Introduces data warehousing fundamentals, challenges, and the innovative concept of a 'lakehouse'.

Handling Data: Data Models and Wrangling

Explores data handling fundamentals, including models, sources, and wrangling, emphasizing the importance of understanding and addressing data problems.

Data, Big Data, Clouds and IoT

Explores data representation, databases, cloud computing, and challenges in the cloud environment.

Introduction to Database Systems

Covers the basics of database systems, including data modeling, DBMSs, and data independence, along with a course overview.

Big Data: Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, typical architecture, challenges, and technologies used to address them.

Data Management: Overview

Introduces fundamental concepts of data management, including data models, databases, and key tasks.

Hydrology Modeling: Routing System

Covers the modeling of hydrological systems, focusing on flood retention, with the Routing System as an example.

General Introduction to Data Science

Offers a comprehensive introduction to Data Science, covering Python, NumPy, Pandas, Matplotlib, and scikit-learn, with a focus on practical exercises and collaborative work.

DDL, DML, Views

Covers SQL data definition, manipulation, and views in databases.

Data Wrangling: Structuring and Wrangling Issues

Covers data wrangling stages, structuring techniques, and common issues in data preparation.