Publications by Arthur Charpentier

R, Twitter and URLs

26.08.2013

Yesterday evening, I wanted to play with Twitter, and see which websites I was using as references in my tweets, to get a Top 4 list. The first problem I ran into is that installing twitteR on Ubuntu is not that simple! You have to install RCurl properly… But before you install the package in R, it is necessary to run the following line in a te...

2258 sym R (1888 sym/12 pcs) 2 img

Linear regression from a contingency table

07.09.2013

This morning, Benoit sent me an email about an exercise on linear regression that he found in an econometrics textbook. Consider the following dataset. Here, variable X denotes the income, and Y the expenses. The goal was to fit a linear regression (actually, in the email, it was mentioned that we should try to fit a heteroscedastic model, but l...

1985 sym R (2500 sym/8 pcs) 2 img

Non-observable vs. observable heterogeneity factor

11.09.2013

This morning, in the ACT2040 class (on non-life insurance), we discussed the difference between observable and non-observable heterogeneity in ratemaking (from an economic perspective). To illustrate that point (we will spend more time, later on, discussing observable and non-observable risk factors), we looked at the following simple exampl...

3749 sym R (2862 sym/14 pcs) 38 img

Monty Hall (oh no, not again)

13.09.2013

Quite frequently, someone on the internet discovers the Monty Hall paradox, and becomes so enthusiastic that it becomes urgent to publish an article (or a post) about it. The latest example can be found at http://www.bbc.co.uk/news/magazine-24045598. I won’t blame them, I did the same a few years ago (see http://freakonometrics.hypotheses.org/776, ...

4547 sym R (1424 sym/12 pcs) 22 img

Logistic regression and categorical covariates

26.09.2013

A short post to get back, for my non-life insurance course, to the interpretation of the output of a regression when there is a categorical covariate. Consider the following dataset: > db = read.table("http://freakonometrics.free.fr/db.txt",header=TRUE,sep=";") > tail(db) Y X1 X2 X3 995 1 4.801836 20.82947 A 996 1 9.867854...

2233 sym R (3714 sym/7 pcs) 62 img
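To make the interpretation question raised in the entry above concrete, here is a minimal R sketch. The data are simulated (the levels "A", "B", "C", the effect sizes and the seed are assumptions made for this example, not values from the post). It shows that, with R's default treatment contrasts, the coefficients of a categorical covariate are differences relative to a reference level.

```r
# Simulated data (hypothetical; only the column names mimic the excerpt)
set.seed(1)
n <- 1000
X3 <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
effect <- c(A = 0, B = 0.5, C = -1)
Y <- rbinom(n, size = 1, prob = plogis(-0.2 + effect[as.character(X3)]))

# With the default treatment contrasts, the first level ("A") is the
# reference: the intercept is the log-odds for A, and X3B, X3C are
# differences in log-odds relative to A
fit_A <- glm(Y ~ X3, family = binomial)
coef(fit_A)

# Changing the reference level changes the coefficients,
# but not the fitted probabilities
fit_C <- glm(Y ~ relevel(X3, ref = "C"), family = binomial)
coef(fit_C)
all.equal(unname(fitted(fit_A)), unname(fitted(fit_C)), tolerance = 1e-6)
```

The point of the comparison is that a "non-significant" level only means "not different from the reference", which depends on the reference chosen.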

Nice tutorials to discover R

28.09.2013

A series of tutorials, in R, by Anthony Damico. As claimed on http://twotorials.com/, “how to do stuff in r. two minutes or less, for those of us who prefer to learn by watching and listening”. So far: 000 what is r? the lingua statistica, s’il vous plaît 001 how to download and install r 002 simple shortcuts for the windows r console 0...

7989 sym

ROC curves and classification

30.09.2013

To get back to a question asked after the last course (still on non-life insurance), I will spend some time discussing ROC curve construction and interpretation. Consider the dataset we used last week: > db = read.table("http://freakonometrics.free.fr/db.txt",header=TRUE,sep=";") > attach(db) The first step is to get a model. For inst...

3064 sym R (1163 sym/10 pcs) 40 img 1 tbl
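The construction mentioned in the entry above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data are simulated rather than read from the post's file, the model is a simple logistic regression, and `roc_point` is a hypothetical helper name introduced here.

```r
# Simulated data (hypothetical; the post uses its own db.txt file)
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * x))

# Step 1: fit a model, here a logistic regression
fit <- glm(y ~ x, family = binomial)
score <- predict(fit, type = "response")

# Step 2: for each threshold s, classify as positive when score > s,
# then record the false positive and true positive rates
roc_point <- function(s) {
  pred <- score > s
  c(fpr = sum(pred & y == 0) / sum(y == 0),
    tpr = sum(pred & y == 1) / sum(y == 1))
}
thresholds <- seq(0, 1, by = 0.01)
roc <- t(sapply(thresholds, roc_point))

# Step 3: plot the curve, with the diagonal as the no-information reference
plot(roc[, "fpr"], roc[, "tpr"], type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)
```

Each threshold gives one point of the curve; sweeping the threshold from 0 to 1 moves from the top-right corner (everything classified as positive) to the bottom-left corner (nothing classified as positive).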

Regression on variables, or on categories?

30.09.2013

I admit it, the title sounds weird. The problem I want to address this evening is related to the use of the stepwise procedure on a regression model, and to the use of categorical variables (and possible misinterpretations). Consider the following dataset: > db = read.table("http://freakonometrics.free.fr/db2.txt",header=TRUE,sep=";") First...

2174 sym R (4209 sym/8 pcs)

Some heuristics about local regression and kernel smoothing

08.10.2013

In a standard linear model, we assume that $\mathbb{E}(Y\vert X=x)=\beta_0+\beta_1 x$. Alternatives can be considered when the linear assumption is too strong. Polynomial regression: a natural extension might be to assume some polynomial function, $\mathbb{E}(Y\vert X=x)=\beta_0+\beta_1 x+\beta_2 x^2+\cdots+\beta_k x^k$. Again, in the standard linear model approach (with a conditional normal distribution using the GLM terminology), parameters can be obtaine...

4846 sym R (2431 sym/18 pcs) 88 img
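As a rough illustration of the polynomial extension mentioned in the entry above, the following R sketch fits a linear and a cubic model by least squares. The simulated data, the seed and the degree 3 are assumptions made for the example; they do not come from the original post.

```r
# Simulated data (hypothetical, not the dataset from the post)
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- sin(x / 2) + rnorm(n, sd = 0.3)

# Standard linear model: E(Y | X = x) = b0 + b1 * x
fit_lin <- lm(y ~ x)

# Polynomial extension of degree 3, fitted by ordinary least squares
fit_poly <- lm(y ~ poly(x, degree = 3, raw = TRUE))

# Compare residual sums of squares; the models are nested, so the
# cubic fit cannot have a larger residual sum of squares
rss_lin <- sum(residuals(fit_lin)^2)
rss_poly <- sum(residuals(fit_poly)^2)
c(linear = rss_lin, cubic = rss_poly)
```

A lower residual sum of squares alone does not justify the higher degree; it only shows that the polynomial model is at least as flexible as the linear one on the same data.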

Some heuristics about spline smoothing

08.10.2013

Let us continue our discussion on smoothing techniques in regression. Assume that $\mathbb{E}(Y\vert X=x)=h(x)$, where $h$ is some unknown function, but assumed to be sufficiently smooth. For instance, assume that $h$ is continuous, that $h'$ exists and is continuous, that $h''$ exists and is also continuous, etc. If $h$ is smooth enough, Taylor’s expansion can be used. Hence, for whi...

2776 sym R (1933 sym/17 pcs) 72 img