Publications by Arthur Charpentier

R Crash Course, Data Science for Actuaries, Year 2

01.03.2016

This Monday, we will start the second year of the Actuary: Data Science (ADS) program, supported by the (French) Institute of Actuaries. I will be there on Monday morning for the opening, and we will start the R & Datamining course. The slides are now online. In order to get nice slides, I have been using slidify.

726 sym 2 img

Where People Live

03.03.2016

There was an interesting map on Reddit this morning, with a visualisation of the latitude and longitude of where people live on Earth. So I tried to reproduce it. To compute the density, I used a kernel-based approach > library(maps) > data("world.cities") > X=world.cities[,c("lat","pop")] > liss=function(x,h){ + w=dnorm(x-X[,"lat"],0,h) + sum(X[...

754 sym R (719 sym/2 pcs) 6 img
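
The kernel-based approach mentioned in the excerpt above can be sketched as follows. This is only a minimal illustration, not the code of the post: the `cities` data frame is a hypothetical stand-in for `world.cities` (so the sketch runs without the maps package), and both the name `weighted_density` and the choice of weighting each city by its population are assumptions.

```r
# Hypothetical stand-in for world.cities (maps package):
# a few made-up cities with a latitude and a population.
cities <- data.frame(
  lat = c(48.9, 40.7, 35.7, -23.5, 19.4, 28.6, -33.9, 55.8),
  pop = c(2.1e6, 8.4e6, 9.3e6, 1.2e7, 8.9e6, 1.1e7, 4.6e6, 1.2e7)
)

# Gaussian kernel estimate of the density at latitude x, with bandwidth h.
# Assumption: each city is weighted by its population.
weighted_density <- function(x, h, data = cities) {
  w <- dnorm(x - data$lat, mean = 0, sd = h)
  sum(data$pop * w) / sum(data$pop)
}

# Evaluate the estimate on a grid of latitudes, one point per degree
lat_grid <- seq(-90, 90, by = 1)
dens <- sapply(lat_grid, weighted_density, h = 5)

plot(lat_grid, dens, type = "l",
     xlab = "latitude", ylab = "population-weighted density")
```

A larger bandwidth `h` gives a smoother curve; the same function can be applied to longitudes by swapping the column.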

Forecasts with ARIMA Models

16.03.2016

In our time series class this morning, I was discussing forecasts with ARIMA models. Consider a simple stationary AR(1) simulated time series > n=95 > set.seed(1) > E=rnorm(n) > X=rep(0,n) > phi=.85 > for(t in 2:n) X[t]=phi*X[t-1]+E[t] > plot(X,type="l") If we fit an AR(1) model, > model=arima(X,order=c(1,0,0), + include.mean = F...

1732 sym R (1268 sym/6 pcs) 10 img
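
To make the forecasting step in the excerpt above concrete, here is a minimal sketch, not necessarily what the post does. It reuses the simulation from the excerpt, and it assumes that the truncated `arima()` call fits the model without a mean (`include.mean = FALSE`). For a zero-mean AR(1), the forecast h steps ahead should be the last observation multiplied by the estimated coefficient to the power h, which can be checked against `predict()`.

```r
# Simulate the AR(1) series as in the excerpt
n <- 95
set.seed(1)
E <- rnorm(n)
X <- rep(0, n)
phi <- 0.85
for (t in 2:n) X[t] <- phi * X[t - 1] + E[t]

# Fit an AR(1) model (assumption: no mean term)
model <- arima(X, order = c(1, 0, 0), include.mean = FALSE)
phi_hat <- coef(model)[["ar1"]]

# Forecasts for horizons 1 to 5, from predict() and by hand
h <- 1:5
fc <- predict(model, n.ahead = 5)
manual <- phi_hat^h * X[n]

cbind(horizon = h, predict = as.numeric(fc$pred), manual = manual)
```

The two columns should agree, and since the estimated coefficient is below one in absolute value, the forecasts decay geometrically towards zero as the horizon grows.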

Classification on the German Credit Database

18.03.2016

In our data science course, this morning, we’ve used random forests to improve prediction on the German Credit Dataset. The dataset is > url="http://freakonometrics.free.fr/german_credit.csv" > credit=read.csv(url, header = TRUE, sep = ",") Almost all variables are treated as numeric, but actually, most of them are factors, > str(credit) 'data.fr...

1816 sym R (3269 sym/13 pcs) 12 img

Where People Live, part 2

04.04.2016

Following my previous post, I wanted to use another dataset to visualize where people live on Earth. The dataset comes from sedac.ciesin.columbia.edu. Once you register, you can download the database > base=read.table("glp00ag15.asc",skip=6) The database is a ‘big’ 1440×572 matrix; in each cell (latitude and longitude) we have the populat...

1132 sym R (1109 sym/7 pcs) 8 img

Computational Actuarial Science, with R, in Barcelona

05.04.2016

This Wednesday, I will give a graduate crash course on computational actuarial science, with R, which will be the second part of the lecture of Tuesday. Slides are now available.

590 sym

How long could it take to run a regression

06.04.2016

This afternoon, while I was talking with Montserrat (aka @mguillen_estany), we were wondering how long it might take to run a regression model. More specifically, how long it might take if we use a Bayesian approach. My guess was that the time should probably be linear in n, the number of observations. But I thought it would be good to check. ...

1666 sym R (2101 sym/12 pcs) 6 img
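
One simple way to check how computation time grows with the number of observations is to time the fit for several sample sizes. The sketch below is only an illustration under loud assumptions: it times an ordinary least squares fit with `lm()` on simulated data, not the Bayesian approach discussed in the post, and the sample sizes and the linear model are arbitrary choices.

```r
set.seed(1)

# Sample sizes to compare (arbitrary choices for illustration)
sizes <- c(1e3, 1e4, 1e5)

# Time an ordinary least squares fit for each sample size
# (swapped-in technique: lm(), not a Bayesian regression)
elapsed <- sapply(sizes, function(n) {
  x <- rnorm(n)
  y <- 1 + 2 * x + rnorm(n)
  system.time(lm(y ~ x))[["elapsed"]]
})

data.frame(n = sizes, seconds = elapsed)
```

Timings for small samples are noisy, so it may be worth repeating each fit several times and averaging before drawing any conclusion about linearity.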

Non-Uniform Population Density in some European Countries

17.04.2016

A few months ago, I mentioned that France was a country with strong inequalities, especially when you look at higher education and research teams. Paris has almost 50% of the CNRS researchers, while only 3% of the population lives there. CNRS, “répartition des chercheurs en SHS” (distribution of researchers in the humanities and social sciences) http://t.co/39dcJJBwrF, Paris 47.52% IdF 66.85% (pop 3.39% ...

2468 sym R (2753 sym/15 pcs) 8 img

Rupture Detection

13.12.2016

There are some graphs that you cannot forget. One graph that I found puzzling was mentioned on Andrew Gelman’s blog, a few years back, and was related to rupture detection. What I remember from this graph is that if you want to get a rupture, you can easily find one… Recently, I had to review a paper, and Imbens & Lemieux (2008) was mentione...

4598 sym 18 img

What is a Linear Trend, by the way?

01.01.2017

I had a very strange discussion on Twitter (yes, another one), about regression curves. I think it started with a tweet based on some xkcd picture (just for fun, because it was New Year’s Day) “don’t trust linear regressions” https://t.co/exUCvyRd1G pic.twitter.com/O6rBJfkULa (Arthur Charpentier, @freakonometrics, 1 January 2017) Ther...

3933 sym 14 img