Publications by Arthur Charpentier

What is the interpretation of the diagonal for a ROC curve

25.03.2019

Last Friday, we discussed the use of ROC curves to describe the goodness of a classifier. I said that I would post a brief paragraph on the interpretation of the diagonal. If you look around, some say that it describes the “strategy of randomly guessing a class”, that it is obtained with “a diagnostic test that is no better than chance lev...

3354 sym R (3504 sym/18 pcs) 42 img

Estimates on training vs. validation samples

23.05.2019

Before moving to cross-validation, it was natural to say “I will burn 50% (say) of my data to train a model, and then use the remaining to fit the model”. For instance, we can use training data for variable selection (e.g. using some stepwise procedure in a logistic regression), and then, once variables have been selected, fit the model on the...

2569 sym R (1113 sym/4 pcs) 32 img

Pareto Models for Top Incomes

03.06.2019

With Emmanuel Flachaire, we uploaded to HAL a paper on Pareto Models for Top Incomes. Top incomes are often related to the Pareto distribution. To date, economists have mostly used the Pareto Type I distribution to model the upper tail of income and wealth distributions. It is a parametric distribution, with an attractive property, that can be easily lin...

1644 sym

On my way to Manizales (Colombia)

16.06.2019

Next week, I will be in Manizales, Colombia, for the Third International Congress on Actuarial Science and Quantitative Finance. I will be giving a lecture on Wednesday with Jed Frees and Emiliano Valdez. I will give my course on Algorithms for Predictive Modeling on Thursday morning (after Jed and Emil’s lectures). Unfortunately, my computer...

1008 sym 4 img

Optimal transport on large networks

04.07.2019

With Alfred Galichon and Lucas Vernet, we recently uploaded a paper entitled Optimal transport on large networks to arXiv. This article presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our setting, the market is described by a network that maps the cost o...

1921 sym 4 img

Insurance data science : use and value of unusual data #1

05.08.2019

Next week, I will be at the Summer School of the Swiss Association of Actuaries, in Lausanne, with Jean-Philippe Boucher (UQAM) and Ewen Gallic (AMSE). I will give an introductory talk on Monday morning, and the slides are now available. There will be some hands-on applications in R. I will share some code in the slides.

746 sym 2 img

Insurance data science : Pictures

13.08.2019

At the Summer School of the Swiss Association of Actuaries, in Lausanne, following the part of Jean-Philippe Boucher (UQAM) on telematics data, I will start talking about pictures this Wednesday. Slides are available online. Ewen Gallic (AMSE) will present a tutorial on satellite pictures, and a simple classification problem, related to Alzheimer ...

990 sym 18 img

Insurance data science : Text

14.08.2019

At the Summer School of the Swiss Association of Actuaries, in Lausanne, I will start talking about text-based data and NLP this Thursday. Slides are available online. Ewen Gallic (AMSE) will present a tutorial on tweets. I can upload a few additional slides on LSTM (recurrent neural nets).

703 sym 4 img

Insurance data science : Networks

15.08.2019

At the Summer School of the Swiss Association of Actuaries, in Lausanne, I will start talking about networks and insurance this Friday. Slides are available online.

576 sym 2 img

On leverage

03.10.2019

Last week, in our STT5100 (applied linear models) class, I introduced the hat matrix and the notion of leverage. In a classical regression model, \(\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}\) (in matrix form), the ordinary least squares estimator of the parameter \(\boldsymbol{\beta}\) is \(\widehat{\boldsymbol{\beta}}=(\boldsymbol{X}^\to...

6958 sym R (412 sym/4 pcs) 10 img