OnData.blog

Month: March 2019

Taking advantage of Machine Learning (ML), without even starting

Here is a humorous recent example of what one can achieve with basic data exploration, without resorting to any advanced ML techniques. In this project I was asked to study the response time log of an online service running

Pawel Plaszczak | March 21, 2019 (updated April 17, 2021) | Articles | No Comments

4 reasons for building Data Lakes… or not

Data Lakes are repositories where data is ingested and stored in its original form, with little or no preprocessing. This is in contrast to traditional data warehouses, where much effort goes into ETL processing, data cleansing and aggregation, to

Pawel Plaszczak | March 15, 2019 (updated March 22, 2019) | Articles | No Comments

Lecture notes: Introduction to Apache Spark

In Lecture 7 of the Big Data in 30 hours lecture series, we introduce Apache Spark. The purpose of this memo is to serve the students as a reference for some of the concepts learned. About Spark: Spark, managed by

Pawel Plaszczak | March 14, 2019 | Articles | No Comments

Product Owner vs Product Manager vs Architect

This short memo clarifies the proper usage of these roles in the context of software development projects: Product Owner, Product Manager / Manager, and (Software/Product) Architect. Product Owner: The term Product Owner is mainly used in the Scrum context.

Pawel Plaszczak | March 13, 2019 (updated March 14, 2019) | Articles | 3 Comments

Collected thoughts on implementing Kafka data pipelines

Below are my notes and thoughts collected during recent work with Kafka, building data streaming pipelines between data warehouses and data lakes. Maybe someone will benefit. The rationale: Some points on picking (or not picking) Kafka as

Pawel Plaszczak | March 8, 2019 | Articles | No Comments
