Publications by Michael Mayer
Effect Plots in Python and R
Christian and I did some code magic: highly effective plots that help to build and inspect any model. Python: https://github.com/lorentzenchr/model-diagnostics (pip install model-diagnostics). R: https://github.com/mayer79/effectplots (install.packages("effectplots")). The functionality is best described by its output (see the Python and R figures). The plots show different...
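For R users, a minimal sketch of the workflow (assuming {effectplots} exposes feature_effects() with v/data/y arguments, as I recall; check the package docs):

```r
# Sketch only: fit any model and draw effect plots for its features.
library(effectplots)

fit <- lm(Sepal.Length ~ ., data = iris)

# Assumed API: feature_effects(model, v = features, data = data, y = response)
eff <- feature_effects(
  fit,
  v = setdiff(colnames(iris), "Sepal.Length"),
  data = iris,
  y = iris$Sepal.Length
)

plot(eff)  # one panel per feature, combining observed and predicted effects
```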
Explaining a Causal Forest
We use a causal forest [1] to model the treatment effect in a randomized controlled clinical trial. Then, we explain this black-box model with the usual explainability tools. These will reveal segments where the treatment works better or worse, just like a forest plot, but multivariately. Data: For illustration, we use patient-level data of a 2-arm tria...
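As a rough sketch of the idea (simulated data rather than the trial data from the post, and {grf} as the causal forest implementation):

```r
# Sketch: fit a causal forest on simulated RCT data and inspect the CATEs.
library(grf)

set.seed(1)
n <- 2000
X <- data.frame(age = runif(n, 20, 80), biomarker = rnorm(n))
W <- rbinom(n, 1, 0.5)                      # randomized treatment assignment
Y <- 0.5 * W * (X$age > 50) + rnorm(n)      # treatment helps older patients only

cf <- causal_forest(X = as.matrix(X), Y = Y, W = W)

tau_hat <- predict(cf)$predictions          # estimated individual treatment effects

# Crude multivariate "forest plot": average estimated effect per age decile
aggregate(tau_hat, by = list(age_group = cut(X$age, 10)), FUN = mean)
```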
Out-of-sample Imputation with {missRanger}
{missRanger} is a multivariate imputation algorithm based on random forests, and a fast version of the original missForest algorithm of Stekhoven and Buehlmann (2012). Surprise, surprise: it uses {ranger} to fit random forests. Especially combined with predictive mean matching (PMM), the imputations are often quite realistic. Out-of-sample applica...
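A sketch of what out-of-sample imputation could look like, assuming a {missRanger} version that can keep its fitted forests (the keep_forests/data_only flags and the predict() method are from memory, so verify against the package docs):

```r
# Sketch: impute training data, keep the forests, and reuse them on new rows.
library(missRanger)

train <- generateNA(iris, p = 0.2, seed = 1)        # training data with 20% NAs
new_rows <- generateNA(iris[1:10, ], p = 0.2, seed = 2)

imp <- missRanger(
  train,
  pmm.k = 5,            # predictive mean matching for realistic imputations
  num.trees = 100,
  keep_forests = TRUE,  # assumed flag: store the fitted ranger forests
  data_only = FALSE     # assumed flag: return the full missRanger object
)

# Assumed predict() method: apply the stored forests to unseen data
imputed_new <- predict(imp, newdata = new_rows)
head(imputed_new)
```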
SHAP Values of Additive Models
Within only a few years, SHAP (Shapley additive explanations) has emerged as the number 1 way to investigate black-box models. The basic idea is to decompose model predictions into additive contributions of the features in a fair way. Studying the decompositions of many predictions allows us to derive global properties of the model. What happens if we app...
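To make the question concrete, here is a sketch: fit a strictly additive model and compute exact SHAP values with permshap() from {kernelshap} (assuming {shapviz} accepts its output). For such a model, each feature's SHAP dependence should trace its centered main effect.

```r
# Sketch: SHAP values of an additive model (no interactions by construction).
library(kernelshap)
library(shapviz)

fit <- lm(Sepal.Length ~ poly(Petal.Length, 2) + Sepal.Width + Species, data = iris)
X <- iris[c("Petal.Length", "Sepal.Width", "Species")]

shp <- permshap(fit, X = X, bg_X = iris)   # exact permutation SHAP
sv <- shapviz(shp)

sv_dependence(sv, v = "Petal.Length")      # expect the centered quadratic effect
```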
ML + XAI -> Strong GLM in Python
In our latest post, we explained how to use ML + XAI to build strong generalized linear models with R. Let's do the same with Python. Insurance pricing data: We will again use a synthetic dataset with 1 Mio insurance policies, with reference: Mayer, M., Meier, D. and Wuthrich, M.V. (2023), SHAP for Actuaries: Explain any Model. http://dx.doi.org/10.21...
ML + XAI -> Strong GLM
My last post used {hstats}, {kernelshap} and {shapviz} to explain a binary classification random forest. Here, we use the same package combo to improve a Poisson GLM with insights from a boosted trees model. Insurance pricing data: This time, we work with a synthetic, but quite realistic dataset. It describes 1 Mio insurance policies and their ...
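The gist of the workflow, sketched with simulated data and hypothetical feature names (driver_age, car_power, exposure, claim_nb stand in for the real columns): fit Poisson boosted trees, study their effects with XAI, then encode the learned shapes in a Poisson GLM.

```r
# Sketch of ML + XAI -> GLM with made-up pricing features.
library(lightgbm)
library(hstats)
library(splines)

set.seed(1)
n <- 10000
dat <- data.frame(
  driver_age = runif(n, 18, 90),
  car_power  = runif(n, 50, 300),
  exposure   = 1
)
dat$claim_nb <- rpois(n, exp(-2 - 0.02 * dat$driver_age + 0.005 * dat$car_power))

X <- as.matrix(dat[c("driver_age", "car_power")])

# 1) Flexible ML benchmark: Poisson boosted trees
fit_lgb <- lgb.train(
  params = list(objective = "poisson", learning_rate = 0.05),
  data = lgb.Dataset(X, label = dat$claim_nb),
  nrounds = 300
)

# 2) XAI: partial dependence shows the shape of each feature effect
plot(partial_dep(fit_lgb, v = "driver_age", X = X))

# 3) GLM mimicking the learned shapes, e.g. via natural splines
fit_glm <- glm(
  claim_nb ~ ns(driver_age, df = 5) + car_power,
  family = poisson(),
  data = dat,
  offset = log(exposure)
)
summary(fit_glm)
```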
Explain that tidymodels blackbox!
Let's explain a {tidymodels} random forest with classic explainability methods (permutation importance, partial dependence plots (PDP), Friedman's H statistics) and also fancy SHAP. Disclaimer: {hstats}, {kernelshap} and {shapviz} are three of my own packages. Diabetes data: We will use the diabetes prediction dataset from Kaggle to model diabetes...
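A condensed sketch of the setup, using a built-in binary classification dataset instead of the Kaggle data; the prediction wrapper turns tidymodels' probability output into the numeric vector that {hstats} and {kernelshap} expect.

```r
# Sketch: explain a {tidymodels} random forest with {hstats} and {kernelshap}.
library(tidymodels)
library(hstats)
library(kernelshap)
library(shapviz)

data(two_class_dat, package = "modeldata")

rf_spec <- rand_forest(trees = 500) |>
  set_engine("ranger") |>
  set_mode("classification")

fit <- workflow() |>
  add_formula(Class ~ .) |>
  add_model(rf_spec) |>
  fit(two_class_dat)

X <- two_class_dat[c("A", "B")]
pred_fun <- function(m, x) predict(m, x, type = "prob")[[1]]  # P(first class)

# Friedman's H statistics and a partial dependence plot
s <- hstats(fit, X = X, pred_fun = pred_fun)
plot(partial_dep(fit, v = "A", X = X, pred_fun = pred_fun))

# SHAP values for a subset of rows, against a background sample
shp <- kernelshap(fit, X = X[1:100, ], bg_X = X, pred_fun = pred_fun)
sv_importance(shapviz(shp))
```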
Interactions – where are you?
This question sends shivers down the poor modeler's spine… The {hstats} R package introduced in our last post measures interaction strength using Friedman's H-statistics, a collection of statistics based on partial dependence functions. On GitHub, the preview version of {hstats} 1.0.0 is out – I will try to bring it to CRAN in about one week (October...
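In code, measuring interaction strength with {hstats} might look like this sketch (illustrative model and data, function names as I recall them from the package):

```r
# Sketch: Friedman's H statistics for a random forest on iris.
library(hstats)
library(ranger)

fit <- ranger(Sepal.Length ~ ., data = iris, seed = 1)

pred_fun <- function(m, x) predict(m, x)$predictions
X <- iris[setdiff(colnames(iris), "Sepal.Length")]

s <- hstats(fit, X = X, pred_fun = pred_fun)
s                 # share of prediction variability explained by interactions
h2_overall(s)     # per-feature H^2: how strongly does each feature interact?
h2_pairwise(s)    # pairwise H^2: which feature pairs interact?
plot(s)           # graphical summary of the statistics
```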
Permutation SHAP versus Kernel SHAP
SHAP is the predominant way to interpret black-box ML models, especially for tree-based models with the blazingly fast TreeSHAP algorithm. For general models, two slower SHAP algorithms exist: permutation SHAP (Štrumbelj and Kononenko, 2010) and Kernel SHAP (Lundberg and Lee, 2017). Kernel SHAP was introduced as an approximation to permutation SHAP. ...
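A sketch of a direct comparison using the {kernelshap} package, which implements both algorithms; for a model with interactions of low order, the two sets of SHAP values should agree very closely.

```r
# Sketch: permutation SHAP vs. Kernel SHAP on the same model and data.
library(kernelshap)

fit <- lm(Sepal.Length ~ . + Petal.Length:Species, data = iris)
X <- iris[setdiff(colnames(iris), "Sepal.Length")]

ps <- permshap(fit, X = X[1:20, ], bg_X = iris)     # exact permutation SHAP
ks <- kernelshap(fit, X = X[1:20, ], bg_X = iris)   # exact Kernel SHAP

# Largest absolute difference between the two SHAP matrices
max(abs(ps$S - ks$S))
```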