Lecture

Untitled

Related lectures (32)

Covers data manipulation and exploration using Python with a focus on visualization techniques.

Introduces the basics of data science, covering decision trees, machine learning advancements, and deep reinforcement learning.

Data Wrangling with Hive: Managing Big Data Efficiently

Covers data wrangling techniques using Apache Hive for efficient big data management.

Spark Data Frames

Covers Spark DataFrames (distributed collections of data organized into named columns) and their benefits over RDDs.

General Introduction to Big Data

Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.

Data Wrangling with Hadoop: Storage Formats and Hive

Explores data wrangling with Hadoop, emphasizing storage formats and Hive for big data processing.

General Introduction to Data Science

Offers a comprehensive introduction to data science, covering Python, NumPy, pandas, Matplotlib, and scikit-learn, with a focus on practical exercises and collaborative work.

Data Wrangling with Hadoop

Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.

Big Data Best Practices and Guidelines

Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.

Decision Tree Classification

Covers decision tree classification using KNIME Analytics Platform for data preprocessing and model creation.

Structures and Mechanisms: Opening a Box

Explores the analysis of structures and mechanisms through a sample problem of opening a box with a string-held lid.

Advanced Spark Optimization

Delves into advanced Spark optimization techniques, emphasizing data partitioning, shuffle operations, and memory management.

Advanced Pandas Functions

Focuses on advanced pandas functions for data manipulation, exploration, and visualization with Python, emphasizing the importance of understanding and preparing data.

Python Lists: Manipulation and Comprehension

Covers Python list manipulation and comprehension, emphasizing memory representation and mutability.

Big Data Ecosystems: Technologies and Challenges

Covers the fundamentals of big data ecosystems, focusing on technologies, challenges, and practical exercises with Hadoop's HDFS.

Spark DataFrames: Basics and Optimization

Covers the basics of Spark DataFrames, their advantages, performance comparison with RDDs, and practical demos.

3D Stone Scanning Session

Introduces a professional 3D measurement system for stone analysis and feature extraction using stereo photogrammetry and structured-light technologies.

Knowledge Representation: Semantics and Data Structures

Explores knowledge representation, data structures, semantics, and the challenges of searching for data on the web.

Advanced Spark Optimization Techniques: Managing Big Data

Discusses advanced Spark optimization techniques for managing big data efficiently, focusing on parallelization, shuffle operations, and memory management.

Introduction to Data Stream Processing

Covers the fundamentals of data stream processing, including tools like Apache Storm and Kafka, key concepts like event time and window operations, and the challenges of stream processing.