This page contains selected materials for my Big Data in 30 hours class. Only fragmentary material is available online, so the page is not suitable as a resource for self-learners. It is intended as a reference for students who have already taken the class.
Thank-you note: I want to express my respect and gratitude to all those whose work is linked below or in the slides: the authors of data, software and insight generously shared online. I took care to use only legally available resources and to give explicit credit. That said, if any IP owner wishes to withdraw consent in the future, please notify me.
The course description and syllabus are here: Data Engineering and Data Science class syllabus
The LinkedIn discussion group (open to all): Big Data in 30 hours
The slides and supporting materials for each lecture are listed below; more will be added over time.
|Lecture name||Materials to download|
|Lecture 1: Linux power tools and the programmer environment||1. Git version control: concise introduction
2. Git version control: part 2
3. Linux power tools slides / handouts: Big Data in 30 hours Lecture 1 handouts
4. Happiness worldwide: statistics from the World Bank
5. The Enron corpus|
|Lecture 2: Making relations work (relational databases, SQLite)||1. Slides / handouts: Big Data in 30 hours Lecture 2 handouts
2. Chinook database
3. FARS data|
|Lecture 3: Data Warehousing (OLTP vs OLAP, 3NF vs star/snowflake, Oracle)||1. How to prepare the environment: guide to installation of Oracle Database Express Edition and SQL Developer
2. Slides / handouts: Big Data in 30 hours Lecture 3 handouts|
|Lecture 4: Business Intelligence (BI, OLAP cubes, data viz, Tableau: drill down, roll-up, aggregate, slice, dice)||1. Slides / handouts: Big Data in 30 hours Lecture 4 (BI) handouts|
|Lecture 5: Non-relational. Mongo, BigTable, CosmosDB||CosmosDB Key Concepts by Michał Wierzbiński|
|Lecture 6: Distributed filesystems. Hadoop, MapReduce, Hive||1. Lecture notes: First Steps in Hadoop
2. Lecture notes: the basics of HDFS|
|Lecture 7: Apache Spark||1. Lecture notes: an intro to Apache Spark programming|
|Lectures 8-16||This material is currently not available online. Feel free to contact me with any enquiries.|