Publications by Arthur Charpentier
Removing Uncited References in a Tex File (with R)
Last week, with @3wen, we were working on the revised version of our work on smoothing densities of spatial processes (with edge correction). Usually, once you have revised a paper, some references have been added and others dropped. You then need to spend some time checking that all references are actually cited in the paper. For instance, cons...
2544 sym R (4611 sym/13 pcs) 4 img
Kernel Density Estimation with Ripley’s Circumferential Correction
The revised version of the paper Kernel Density Estimation with Ripley’s Circumferential Correction, with Ewen Gallic, is now online, on hal.archives-ouvertes.fr/. In this paper, we investigate (and extend) Ripley’s circumference method to correct the bias of density estimates near the edges (or frontiers) of regions. The idea of the method was theoret...
1316 sym 2 img
Extracting datasets from excel files in a zipped folder
The title of the post is a bit long, but that’s the problem I was facing this morning: importing datasets from files, online. I mean, it was not a “problem” (since I can always download and manually extract the files), more a challenge (I should be able to do it in R, directly). The files are located on ressources-actuarielles.net, in a z...
1365 sym R (1592 sym/4 pcs) 4 img
Shapefiles from Isodensity Curves
Recently, with @3wen, we wanted to play with isodensity curves. The problem is that it is difficult to get – numerically – the equation of the contour (even if we can easily plot it). Consider the following surface (just for fun, in order to illustrate the idea) > f=function(x,y) x*y+(1-x)*(1-y) > u=v=seq(0,1,length=21) > v=seq(0,1,length=11)...
2600 sym R (1939 sym/11 pcs) 20 img
Excel (and French people) are such a pain in the…
A few days ago, I published a post entitled extracting datasets from excel files in a zipped folder, because I wanted to use datasets that were online, in some (zipped) Excel format. The first difficult part was the folder with a non-standard character (the French é). Because next week I should be using those datasets in a crash course in Gabon (...
2237 sym R (2655 sym/8 pcs) 2 img
Reinterpreting Lee-Carter Mortality Model
Last week, while I was giving my crash course on R for insurance, we discussed possible extensions of the Lee & Carter (1992) model. If we look at the seminal paper, the model is defined as follows: [equation]. Hence, it means that [equation]. This would be a (non)linear model on the logarithm of the mortality rate. A non-equivalent, but alternative expression ...
2263 sym R (1635 sym/7 pcs) 20 img
Confidence vs. Credibility Intervals
Tomorrow, for the final lecture of the Mathematical Statistics course, I will try to illustrate – using Monte Carlo simulations – the difference between classical statistics and the Bayesian approach. The (simple) way I see it is the following: for frequentists, a probability is a measure of the frequency of repeated events, so the int...
3645 sym R (1333 sym/8 pcs) 24 img
Subjective Ways of Cutting a Continuous Variable
You have probably seen @coulmont’s maps. If you haven’t, you should probably go and spend some time on his blog (but please, come back afterwards, I still have my story to tell you). Consider for instance the maps we obtained for a post published in Monkey Cage, a few months ago. The code was discussed in a blog post (I spent some time on the e...
2293 sym R (407 sym/7 pcs) 12 img
Names in the U.S., from James Smith to Jose Rodriguez
Two weeks ago, @mona published an interesting post on her blog, about a difficult question, What’s The Most Common Name In America? There were stats about first names in the U.S., and last names, too. That information is – somehow – easy to get. But usually, it is more complicated to get the first and the last name together. For confide...
4302 sym R (4864 sym/21 pcs) 10 img
An automatic code to extract tweets (and to produce the “Somewhere else” review)
A few weeks ago, I asked in a post the (simple) question “dear reader, who are you?” just to know more about the readers of my blog. I found that extremely interesting (even if – to be honest – I was expecting more answers to start a more serious sociological study of the readers of my blog). And an interesting point was that a lot of reade...
4549 sym R (3286 sym/5 pcs)