Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row-oriented versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Data Management: Overview
Introduces fundamental concepts of data management, including data models, databases, and key tasks.
Handling Data: Intro to Pandas
Introduces the fundamentals of handling data, emphasizing the importance of Pandas and data modeling for effective analysis.
Relational Model: Basics
Introduces the relational model, SQL, keys, integrity constraints, ER translation, weak entities, ISA hierarchies, and SQL vs. NoSQL.
NoSQL Databases
Covers the origins, properties, and types of NoSQL databases, focusing on MongoDB and the CAP theorem.
Spark Data Frames
Covers Spark DataFrames, which are distributed collections of data organized into named columns, and the benefits of using them over RDDs.
Information Systems: Overview
Gives an overview of information systems, data modeling, and data management, and distinguishes data from information.