When you really want to find a pattern in data, you will, even if there is no pattern. What happened to me yesterday was embarrassing, but it is also a lesson worth sharing. I learned how to interpret unusual data patterns,
3 Steps to Unmask Data in Camouflage
I am looking at the distribution of a certain data set (left). It has two peaks (this is called ‘bimodal’), so I suspect that it consists of two superimposed populations. How do I split the data to rediscover the original two populations
Techniques for comparing populations of IT data
I have recently been working a lot with IT Infrastructure Management data. At Sopra Steria, we manage sizeable ecosystems for our corporate clients, which include thousands of apps and infrastructure elements. We handle events, incidents, alarms, and support tickets. We process thousands
Synchronizing an SQL Database to a Data Lake (Change Data Capture at ingest)
The considerations below result from some recent projects at Sopra Steria. The goal: having built a Data Lake, we want to deliver (ingest) into the Raw Zone the data from various sources, including several instances of an Oracle Database. We want
Simple hack to improve data clustering visualizations
Here is how to make your data clusters look pretty in no time (with Python and matplotlib), using a one-line code hack. I wanted to visualize, in Python and matplotlib, the data clusters returned by clustering algorithms such as K-means (sklearn.cluster.KMeans)
How to isolate data that constitutes a spike in a histogram?
We would all love to spot business problems early on, to react before they become painful. You can learn a lot by looking at past problems. Hence, understanding the nature of anomalies in data can bring substantial operational benefits and