Publications by T. Moudiki
Encoding your categorical variables based on the response variable and correlations
Sometimes in Statistical/Machine Learning problems, we encounter categorical explanatory variables with high cardinality. Let’s say for example that we want to determine if a diet is good or bad, based on what a person eats. In trying to answer this question, we’d construct a response variable containing a sequence of characters good or bad, ...
5075 sym R (6626 sym/12 pcs) 6 img
Custom errors for cross-validation using crossval::crossval_ml
This post is about using custom error measures in crossval, a tool offering generic functions for the cross-validation of Statistical/Machine Learning models. More information about cross-validation of regression models using crossval can be found in this post, or this other one. The default error measure for regression in crossval is Root Mean S...
1995 sym R (4671 sym/7 pcs)
AdaOpt (a probabilistic classifier based on a mix of multivariable optimization and a nearest neighbors) for R
Last week on this blog, I presented AdaOpt for Python on a handwritten digits classification task. AdaOpt is a novel probabilistic classifier, based on a mix of multivariable optimization and a nearest neighbors algorithm. It’s still very new, and only time will allow us to fully appreciate all of its features. The tool is fast due to Cython, and t...
2408 sym R (1622 sym/6 pcs) 2 img
AdaOpt classification on MNIST handwritten digits (without preprocessing)
Last week on this blog, I presented AdaOpt for R, applied to iris dataset classification. And the week before that, I introduced AdaOpt for Python. AdaOpt is a novel probabilistic classifier, based on a mix of multivariable optimization and a nearest neighbors algorithm. More details about the algorithm can be found in this (short) paper. This we...
2015 sym R (2571 sym/11 pcs) 2 img
Maximizing your tip as a waiter
A few weeks ago, I introduced a target-based categorical encoder for Statistical/Machine Learning based on correlations + Cholesky decomposition. That is, a way to convert explanatory variables such as the x below, to numerical variables which can be digested by ML models. # Have: x <- c("apple", "tomato", "banana", "apple", "pineapple", "bic mac...
2391 sym R (1870 sym/3 pcs) 2 img
Maximizing your tip as a waiter (Part 2)
In Part 1 of “Maximizing your tip as a waiter”, I talked about a target-based categorical encoder for Statistical/Machine Learning, first introduced in this post. An example dataset of tips was used for the purpose, and we’ll use the same dataset today. Here is a snippet of tips: Based on this information, how would you maximize your t...
5234 sym Python (1834 sym/4 pcs) 18 img
nnetsauce version 0.5.0, randomized neural networks on GPU
nnetsauce is a general-purpose tool for Statistical/Machine Learning, in which pattern recognition is achieved by using quasi-randomized networks. A new version, 0.5.0, is out on PyPI and for R: Install by using pip (stable version): pip install nnetsauce --upgrade Install from GitHub (development version): pip install git+https://github.com/...
2755 sym R (238 sym/5 pcs) 2 img
LSBoost: Explainable ‘AI’ using Gradient Boosted randomized networks (with examples in R and Python)
Disclaimer: I have no affiliation with The Next Web (cf. online article). A few weeks ago I read this interesting and accessible article about explainable AI, discussing more specifically self-explainable AI issues. I’m not sure – anymore – if there’s a mandatory need for AI models that explain themselves, as there are model-agnostic tools...
3277 sym R (7575 sym/10 pcs) 4 img