Introduction to Database Systems
Covers the fundamentals of database systems, including data modeling, information processing, and the challenges of managing large volumes of data.

Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.

Handling Data: Intro to Pandas
Introduces the fundamentals of handling data, emphasizing the importance of Pandas and data modeling for effective analysis.

Spark Data Frames
Covers Spark Data Frames, distributed collections of data organized into named columns, and the benefits of using them over RDDs.

Data Issues in Research
Explores challenges in data assumptions, biases, and more in research, including incomplete write-ups and frustrations of newcomers.

Deanonymization Exercise
Explores deanonymization using public datasets from Netflix, focusing on matching users and evaluating films based on ratings.

Water Consumption in Geneva
Explores water consumption data in Geneva, including charts on consumption and losses, available datasets, and data processing phases.

Introduction to Applied Data Analysis
Introduces the Applied Data Analysis course at EPFL, covering a broad range of data analysis topics and emphasizing continuous learning in data science.