OnData.blog

feature elimination

The implications of Scikit-learn bug #21455

As described last week, Scikit-learn's chi-square feature selection is not usable until bug #21455 is addressed. The problem concerns sklearn.feature_selection.chi2 and the methods built on it, including SelectKBest, when they are used on categorical features that are not binary. The nature of the …

Pawel Plaszczak | November 29, 2021 | Articles | No Comments

Your model may be inaccurate

When doing Machine Learning in Python, you may perform feature selection with SelectKBest. As I have just confirmed, this method sometimes returns faulty results, which potentially affects the accuracy of numerous ML models worldwide. The details and a way out are below. The …

Pawel Plaszczak | November 25, 2021 (updated November 29, 2021) | Articles | 1 Comment

3 Steps to Unmask Data in Camouflage

I am looking at the distribution of a certain data set (left). It has two peaks (this is called ‘bimodal’), so I suspect that it consists of two superimposed populations. How do I split the data to rediscover the original two populations …

Pawel Plaszczak | June 29, 2020 (updated November 20, 2021) | Articles | No Comments

Recent Posts

  • Linear Regression: Killer App with 19-century maths (January 19, 2022)
  • Democratization of statistics: Chi2 for non-experts (January 12, 2022)
  • An approach to categorize multi-lingual phrases (December 15, 2021)
  • The implications of Scikit-learn bug #21455 (November 29, 2021)
  • Your model may be inaccurate (November 25, 2021)
  • Answering Why (with Chi-Square) (November 19, 2021)
  • What makes Data Quality so difficult (November 8, 2021)
  • Don’t trust Data Science. Ask the people (October 24, 2021)
  • Mistaken by factor of 100,000 (October 14, 2021)
  • Practical AIOps: 5 use cases (June 8, 2021)

Copyright © 2022 OnData.blog.