OnData.blog

machine learning

The implications of Scikit-learn bug #21455

As described last week, Scikit-learn's chi-square feature selection is not usable until bug #21455 is addressed. The problem concerns sklearn.feature_selection.chi2 and the methods that build on it, including SelectKBest, when they are applied to categorical features that are not binary. The nature of the

Pawel Plaszczak November 29, 2021 Articles No Comments
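To illustrate the kind of discrepancy the teaser above refers to, here is a minimal sketch in plain Python. It does not call Scikit-learn. The function `values_as_counts_chi2` is a hypothetical helper that treats the numeric feature values as event counts, which is my understanding of how `chi2` arrives at its statistic and should be read as an assumption. `contingency_chi2` is the textbook Pearson statistic computed from the feature-by-class contingency table. The toy data and both integer encodings are invented for illustration.

```python
from collections import Counter

def contingency_chi2(feature, target):
    """Pearson chi-square statistic from the feature-by-class contingency table."""
    n = len(feature)
    cell = Counter(zip(feature, target))
    row = Counter(feature)
    col = Counter(target)
    stat = 0.0
    for f in row:
        for c in col:
            expected = row[f] * col[c] / n
            stat += (cell[(f, c)] - expected) ** 2 / expected
    return stat

def values_as_counts_chi2(feature, target):
    """Statistic obtained when numeric feature values are summed per class
    as if they were counts (assumed to mirror the behaviour discussed above)."""
    n = len(feature)
    total = sum(feature)
    col = Counter(target)
    stat = 0.0
    for c in col:
        observed = sum(f for f, t in zip(feature, target) if t == c)
        expected = total * col[c] / n
        stat += (observed - expected) ** 2 / expected
    return stat

# Invented toy data: one categorical feature with three levels, binary target.
colors = ["red"] * 4 + ["green"] * 4 + ["blue"] * 4
target = [1, 1, 1, 0,  1, 1, 0, 0,  1, 0, 0, 0]

# Two arbitrary integer encodings of the same categories.
enc_a = {"red": 0, "green": 1, "blue": 2}
enc_b = {"red": 2, "green": 0, "blue": 1}
feat_a = [enc_a[c] for c in colors]
feat_b = [enc_b[c] for c in colors]

# The contingency statistic only looks at category identity.
table_a = contingency_chi2(feat_a, target)
table_b = contingency_chi2(feat_b, target)

# The values-as-counts statistic depends on the arbitrary integer labels.
counts_a = values_as_counts_chi2(feat_a, target)
counts_b = values_as_counts_chi2(feat_b, target)

print(table_a, table_b)
print(counts_a, counts_b)
```

In this sketch the contingency statistic is the same under both encodings, while the values-as-counts statistic changes when the categories are merely relabeled. A feature ranking built on the latter could therefore depend on an arbitrary encoding choice.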

Your model may be inaccurate

With Machine Learning in Python, you may do feature selection with SelectKBest. As I just confirmed, this method sometimes returns faulty results. This potentially affects the accuracy of numerous ML models worldwide. Below are the details and a way out. The

Pawel Plaszczak November 25, 2021 (updated November 29, 2021) Articles 1 Comment

Practical AIOps: 5 use cases

At Sopra Steria, we manage the IT infrastructure and applications of big clients. We process millions of service tickets and infrastructure events. This massive stream of data comes from monitoring tools such as Zabbix, Nagios, SolarWinds, and higher-level frameworks:

Pawel Plaszczak June 8, 2021 (updated June 14, 2021) Articles No Comments

When Accuracy Grows But Precision Falls

My Machine Learning classifier’s prediction accuracy improves as the volume of training data grows. But at the same time, its precision falls. Why? And how to fix it? Read on. Reducing the problem to classification: At Sopra Steria, we

Pawel Plaszczak July 23, 2020 Articles No Comments
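The effect named in the title above, accuracy rising while precision falls, can be reproduced with simple arithmetic. The sketch below uses two invented confusion matrices, labeled "small" and "large" training set; the numbers are hypothetical and are not taken from the article. It only shows that the two metrics can move in opposite directions when the positive class is rare.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

# Hypothetical confusion matrices on the same 1000 test samples.
# Small training set: the model rarely predicts the positive class.
small = dict(tp=10, fp=2, tn=900, fn=88)
# Large training set: the model finds more positives, but with more false alarms.
large = dict(tp=60, fp=30, tn=872, fn=38)

acc_small = accuracy(**small)
acc_large = accuracy(**large)
prec_small = precision(small["tp"], small["fp"])
prec_large = precision(large["tp"], large["fp"])

print(f"accuracy:  {acc_small:.3f} -> {acc_large:.3f}")
print(f"precision: {prec_small:.3f} -> {prec_large:.3f}")
```

With these made-up counts the second model is right more often overall, yet a larger share of its positive predictions are wrong, which is why it helps to track precision alongside accuracy.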

Recent Posts

  • Porting PyTorch neural network to Amazon AWS June 30, 2022
  • Porting pyTorch cloud detection model to Amazon AWS S3 June 17, 2022
  • pushing data to AWS. SageMaker sucks. So does Anaconda June 14, 2022
  • Linear Regression: Killer App with 19-century maths January 19, 2022
  • Democratization of statistics: Chi2 for non-experts January 12, 2022
  • An approach to categorize multi-lingual phrases December 15, 2021
  • The implications of Scikit-learn bug #21455 November 29, 2021
  • Your model may be inaccurate November 25, 2021
  • Answering Why (with Chi-Square) November 19, 2021
  • What makes Data Quality so difficult November 8, 2021
Copyright © 2023 OnData.blog.