This page contains materials for my Big Data in 30 hours class.
Thank-you note: I want to express my respect and gratitude to all those whose work is linked below or in the slides: authors of data, software and insight generously shared online. I took care to use only legally available resources (by following the license terms or obtaining the author's explicit consent) and to give explicit credit where applicable. That said, should any of the intellectual property owners wish to withdraw their consent in the future, please notify me.
The course description and syllabus are here: Data Engineering and Data Science class syllabus
The LinkedIn discussion group (open to all): Big Data in 30 hours
The slides and collateral for each lecture are below; more will be added over time.
|Lecture name||Materials to download|
|Lecture 1: Linux power tools, and the programmer environment.||1. Git version control: concise introduction
2. Git version control: part 2
3. Linux power tools slides / handouts: Big Data in 30 hours Lecture 1 handouts
4. happiness worldwide - statistics from World Bank
5. The Enron corpus
|Lecture 2: Making relations work. (Relational databases, sqlite)||1. Slides / handouts: Big Data in 30 hours Lecture 2 handouts.
2. Chinook database
3. FARS data
|Lecture 3: Data Warehousing (OLTP vs OLAP, 3NF vs star/snowflake, Oracle)||1. How to prepare the environment: guide to installation of Oracle Database Express Edition and SQL Developer
2. Slides / Handouts: Big Data in 30 hours Lecture 3 handouts
|Lecture 4: Business Intelligence (BI, OLAP Cubes, data viz, Tableau: drill down, roll-up, aggregate, slice, dice)||1. Slides / handouts: Big Data in 30 hours Lecture 4 (BI) handouts
|Lecture 5: Non-relational. Mongo, BigTable, CosmosDB||CosmosDB Key Concepts by Michał Wierzbiński|
|Lecture 6: Distributed filesystems. Hadoop, MapReduce, Hive||Lecture Notes: First Steps in Hadoop
Lecture Notes: the basics of HDFS
|Lecture 7: Apache Spark||1. Lecture notes: an intro to Apache Spark programming
|Lecture 8: Kafka||slides|