Publications by Michael Mayer

Effect Plots in Python and R

23.11.2024

Christian and I did some code magic: highly effective plots that help to build and inspect any model. Python: https://github.com/lorentzenchr/model-diagnostics (pip install model-diagnostics). R: https://github.com/mayer79/effectplots (install.packages("effectplots")). The functionality is best described by its output (Python and R examples). The plots show different types o...

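
The teaser does not show how such effect plots are built. As a rough, hypothetical illustration of one ingredient (binned averages of observed response versus average prediction along a feature), here is a minimal NumPy sketch; it is not the actual code of either package, and the function name is made up:

```python
import numpy as np

def effect_profile(x, y, pred, n_bins=5):
    """Per-bin averages of observed response and model prediction,
    one of the core quantities an effect plot displays (illustrative
    sketch, not the effectplots / model-diagnostics implementation)."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            # (bin lower edge, bin upper edge, mean observed, mean predicted)
            rows.append((edges[b], edges[b + 1], y[mask].mean(), pred[mask].mean()))
    return rows

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 1000)
y = 2 * x + rng.normal(0, 1, 1000)
pred = 2 * x                      # a "model" that is correct on average
profile = effect_profile(x, y, pred)
```

Plotting mean observed against mean predicted per bin then reveals calibration problems along the feature.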

Explaining a Causal Forest

02.09.2024

We use a causal forest [1] to model the treatment effect in a randomized controlled clinical trial. Then, we explain this black-box model with the usual explainability tools. These reveal segments where the treatment works better or worse, just like a forest plot, but multivariately. Data: For illustration, we use patient-level data of a 2-arm tria...

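
A causal forest is too much for a short snippet, but the idea of a segment-wise treatment effect can be illustrated crudely: within covariate segments of a simulated two-arm trial, compare the mean outcome of the treated and control arms. This toy sketch (simulated data, not the post's trial or the grf algorithm) shows the kind of heterogeneity such a model would pick up:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
x = rng.uniform(0, 1, n)                 # patient covariate
t = rng.integers(0, 2, n)                # randomized treatment arm
# true treatment effect grows with x (heterogeneous effect)
y = 1.0 + t * (2.0 * x) + rng.normal(0, 0.5, n)

def segment_effects(x, t, y, n_bins=4):
    """Difference of arm means within covariate bins: a crude,
    transparent stand-in for the per-segment effects a causal
    forest estimates."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    return [y[(idx == b) & (t == 1)].mean() - y[(idx == b) & (t == 0)].mean()
            for b in range(n_bins)]

effects = segment_effects(x, t, y)       # increases across bins by construction
```

A causal forest does this smoothly and multivariately instead of via hard univariate bins.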

Out-of-sample Imputation with {missRanger}

23.08.2024

{missRanger} is a multivariate imputation algorithm based on random forests, and a fast version of the original missForest algorithm of Stekhoven and Buehlmann (2012). Surprise, surprise: it uses {ranger} to fit random forests. Especially combined with predictive mean matching (PMM), the imputations are often quite realistic. Out-of-sample applica...

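
The core idea of predictive mean matching is small enough to sketch: for each missing case, look up the observed cases whose model predictions are closest, and copy one of their observed values. The following is a minimal NumPy illustration of that idea (function name and data are made up; this is not missRanger's code):

```python
import numpy as np

def pmm_impute(pred_missing, pred_observed, y_observed, k=3, rng=None):
    """Predictive mean matching: for each missing case, pick a donor among
    the k observed cases with the closest predictions and copy the donor's
    *observed* value, so imputations are always realistic data values."""
    rng = rng or np.random.default_rng()
    out = np.empty_like(pred_missing, dtype=float)
    for i, p in enumerate(pred_missing):
        nearest = np.argsort(np.abs(pred_observed - p))[:k]
        out[i] = y_observed[rng.choice(nearest)]
    return out

# Toy example: observed values, their model predictions, one missing case
y_obs = np.array([1.0, 2.0, 3.0, 10.0])
p_obs = np.array([1.1, 2.1, 2.9, 9.8])
imputed = pmm_impute(np.array([2.0]), p_obs, y_obs, k=2)
```

Because the imputation is an actual observed value, PMM avoids impossible values such as fractional counts.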

SHAP Values of Additive Models

28.06.2024

Within only a few years, SHAP (Shapley additive explanations) has emerged as the number one way to investigate black-box models. The basic idea is to decompose model predictions into additive contributions of the features in a fair way. Studying the decompositions of many predictions allows us to derive global properties of the model. What happens if we app...

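
The known answer can be checked numerically: for a purely additive model, the SHAP value of feature j is simply the component f_j(x_j) minus its average over the data. A brute-force sketch with the interventional value function (toy model and data, not from the post):

```python
import numpy as np
from itertools import combinations
from math import factorial

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
f = [np.sin, np.square, lambda v: 0.5 * v]      # additive components

def predict(Z):
    return sum(fj(Z[:, j]) for j, fj in enumerate(f))

def shapley(x, X_bg):
    """Exact Shapley values with the interventional value function
    v(S) = E[f(x_S, X_{-S})], brute force over all subsets."""
    d = len(x)
    phi = np.zeros(d)
    def v(S):
        Z = X_bg.copy()
        Z[:, list(S)] = x[list(S)]
        return predict(Z).mean()
    for j in range(d):
        rest = [k for k in range(d) if k != j]
        for size in range(d):
            for S in combinations(rest, size):
                w = factorial(size) * factorial(d - size - 1) / factorial(d)
                phi[j] += w * (v(S + (j,)) - v(S))
    return phi

x = X[0]
phi = shapley(x, X)
# For an additive model: phi_j = f_j(x_j) - mean(f_j(X_j))
expected = np.array([f[j](x[j]) - f[j](X[:, j]).mean() for j in range(3)])
```

The marginal contribution of feature j is the same for every coalition, so the weighted sum collapses to the centered component.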

ML + XAI -> Strong GLM in Python

02.02.2024

In our latest post, we explained how to use ML + XAI to build strong generalized linear models with R. Let’s do the same with Python. Insurance pricing data: We will again use a synthetic dataset with 1 million insurance policies, with reference: Mayer, M., Meier, D. and Wuthrich, M.V. (2023), SHAP for Actuaries: Explain any Model. http://dx.doi.org/10.21...


ML + XAI -> Strong GLM

21.01.2024

My last post used {hstats}, {kernelshap} and {shapviz} to explain a binary classification random forest. Here, we use the same package combo to improve a Poisson GLM with insights from a boosted trees model. Insurance pricing data: This time, we work with a synthetic but quite realistic dataset. It describes 1 million insurance policies and their ...


Explain that tidymodels blackbox!

07.01.2024

Let’s explain a {tidymodels} random forest with classic explainability methods (permutation importance, partial dependence plots (PDP), Friedman’s H statistics) and also fancy SHAP. Disclaimer: {hstats}, {kernelshap} and {shapviz} are three of my own packages. Diabetes data: We will use the diabetes prediction dataset from Kaggle to model diabetes...

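
Of the methods listed, the partial dependence plot is the easiest to spell out: for each grid value, set the feature of interest to that value for every row and average the predictions. A minimal NumPy sketch of the method itself (toy model, not tied to {tidymodels} or any package):

```python
import numpy as np

def partial_dependence(predict, X, j, grid):
    """Classic PDP: replace feature j by each grid value in all rows
    and average the resulting predictions."""
    pd_vals = []
    for g in grid:
        Z = X.copy()
        Z[:, j] = g
        pd_vals.append(predict(Z).mean())
    return np.array(pd_vals)

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
model = lambda Z: 3.0 * Z[:, 0] + Z[:, 1] ** 2   # toy "fitted model"
grid = np.linspace(-2, 2, 5)
pd0 = partial_dependence(model, X, 0, grid)      # linear in the grid, slope 3
```

Plotting pd0 against the grid recovers the model's average effect of feature 0.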

Interactions – where are you?

16.10.2023

This question sends shivers down the poor modeler’s spine… The {hstats} R package, introduced in our last post, measures their strength using Friedman’s H-statistics, a collection of statistics based on partial dependence functions. On GitHub, the preview version of {hstats} 1.0.0 is out – I will try to bring it to CRAN in about one week (October...

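
The pairwise H-statistic compares the two-dimensional partial dependence of a feature pair with the sum of the two univariate ones; if the model is additive in the pair, they agree and H is zero. A brute-force NumPy sketch of the statistic on toy models (this illustrates the definition, not the {hstats} implementation):

```python
import numpy as np

def pd_vals(predict, X, cols, values):
    """Partial dependence of `cols`, evaluated at each row's own values
    and mean-centered, as required by Friedman's H."""
    out = np.empty(len(values))
    for i, v in enumerate(values):
        Z = X.copy()
        Z[:, cols] = v
        out[i] = predict(Z).mean()
    return out - out.mean()

def h2_pairwise(predict, X, j, k):
    """Friedman's pairwise H^2: share of the joint PD's variability
    not explained by the two univariate PDs."""
    pd_jk = pd_vals(predict, X, [j, k], X[:, [j, k]])
    pd_j = pd_vals(predict, X, [j], X[:, [j]])
    pd_k = pd_vals(predict, X, [k], X[:, [k]])
    return np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2)

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 3))
additive = lambda Z: Z[:, 0] + Z[:, 1]       # no interaction: H^2 ~ 0
interacting = lambda Z: Z[:, 0] * Z[:, 1]    # pure interaction: H^2 near 1
h_add = h2_pairwise(additive, X, 0, 1)
h_int = h2_pairwise(interacting, X, 0, 1)
```

The two extremes bracket what real models produce: H² values somewhere between 0 and 1.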

Permutation SHAP versus Kernel SHAP

11.11.2023

SHAP is the predominant way to interpret black-box ML models, especially tree-based models via the blazingly fast TreeSHAP algorithm. For general models, two slower SHAP algorithms exist: permutation SHAP (Štrumbelj and Kononenko, 2010) and Kernel SHAP (Lundberg and Lee, 2017). Kernel SHAP was introduced as an approximation to permutation SHAP. ...

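
For few features, both algorithms can be written down exactly and compared. The sketch below uses a toy model and the interventional value function; run over all permutations and all coalitions (no sampling), exact Kernel SHAP reproduces the permutation values, which is the theoretical link the post examines:

```python
import numpy as np
from itertools import permutations, product
from math import comb, factorial

rng = np.random.default_rng(5)
d = 3
X_bg = rng.normal(size=(100, d))                  # background data
x = rng.normal(size=d)                            # observation to explain
predict = lambda Z: Z[:, 0] * Z[:, 1] + np.exp(Z[:, 2])  # non-additive toy model

def v(z):
    """Value function: coalition features (z == True) take the values
    of x, the rest are averaged over the background data."""
    Z = X_bg.copy()
    Z[:, z] = x[z]
    return predict(Z).mean()

def permutation_shap():
    """Exact permutation SHAP: average marginal contributions
    over all d! feature orders."""
    phi = np.zeros(d)
    for perm in permutations(range(d)):
        z = np.zeros(d, dtype=bool)
        for j in perm:
            before = v(z)
            z[j] = True
            phi[j] += v(z) - before
    return phi / factorial(d)

def kernel_shap():
    """Exact Kernel SHAP: Shapley-kernel weighted least squares over all
    coalitions; the constraints at the empty and full coalition are
    enforced with huge weights."""
    rows, w, y = [], [], []
    for bits in product([0, 1], repeat=d):
        z = np.array(bits, dtype=bool)
        s = int(z.sum())
        weight = 1e9 if s in (0, d) else (d - 1) / (comb(d, s) * s * (d - s))
        rows.append(np.concatenate(([1.0], z.astype(float))))  # intercept + z
        w.append(weight)
        y.append(v(z))
    A, W, y = np.array(rows), np.diag(w), np.array(y)
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[1:]                                # drop the intercept phi_0

phi_perm = permutation_shap()
phi_kernel = kernel_shap()
```

With sampled permutations or coalitions, the two approaches can diverge, which is where the comparison in the post gets interesting.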