Publications by andrew
Are Green Number Runners More Likely to Bail?
Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify to this:...
7985 sym R (6131 sym/6 pcs) 4 img
Please send all comments to /dev/ripley
Trey Causey asks, "Has R-help gotten meaner over time?": I began by using Scrapy to download all the e-mails sent to R-help between April 1997 (the earliest available archive) and December 2012. . . . We each read 500 messages and coded them in the following categories: -2: Negative and unhelpful; -1: Negative but helpful; 0: No obvious valence or r...
4888 sym 4 img
Priors
Nick Firoozye writes: While I am absolutely sympathetic to the Bayesian agenda I am often troubled by the requirement of having priors. We must have priors on the parameters of an infinite number of models we have never seen before and I find this troubling. There is a similarly troubling problem in economics of utility theory. Utility is on consum...
9647 sym
Optimising a Noisy Objective Function
I am busy with a project where I need to calibrate the Heston Model to some Asian options data. The model has been implemented as a function which executes a Monte Carlo (MC) simulation. As a result, the objective function is rather noisy. There are a number of algorithms for dealing with this sort of problem, and here I simply give a brief ove...
7533 sym R (10599 sym/10 pcs) 4 img
Comrades Marathon Inference Trees
Following up on my previous posts regarding the results of the Comrades Marathon, I was planning on putting together a set of models which would predict likelihood to finish and probable finishing time. Along the way I got distracted by something else that is just as interesting and which produces results which readily yield to qualitative inter...
5230 sym R (5877 sym/6 pcs) 4 img
A Chart of Recent Comrades Marathon Winners
Continuing on my quest to document the Comrades Marathon results, today I have put together a chart showing the winners of both the men's and ladies' races since 1980. Click on the image below to see a larger version. The analysis started off with the same data set that I was working with before, from which I extracted only the records for the winn...
1367 sym R (1705 sym/3 pcs) 2 img
Uncertainty in parameter estimates using multilevel models
David Hsu writes: I have a (perhaps) simple question about uncertainty in parameter estimates using multilevel models: what is an appropriate threshold for measuring parameter uncertainty in a multilevel model? The reason why I ask is that I set out to do a crossed two-way model with two varying intercepts, similar to your flight simulator exam...
2905 sym
Finding Correlations in Data with Uncertainty
A week or so ago a colleague of mine asked if I knew how to calculate correlations for data with uncertainties. Now, if we are going to be honest, then all data should have some level of experimental or measurement error. However, I suspect that in the majority of cases these uncertainties are ignored when considering correlations. To what degree...
4312 sym R (4754 sym/11 pcs) 6 img
Finding Correlations in Data with Uncertainty: Classical Solution
Following up on my previous post as a result of an excellent suggestion from Andrej Spiess. The data are indeed very heteroscedastic! Andrej suggested that an alternative way to attack this problem would be to use weighted correlation with weights being the inverse of the measurement variance. Let’s look at the synthetic data first. > library(...
1095 sym R (373 sym/2 pcs)
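The weighted correlation mentioned in the entry above, with weights equal to the inverse of the measurement variance, can be sketched roughly as follows. The post itself works in R and its code is not shown in this listing, so this is a Python illustration rather than the author's implementation. The function name `weighted_correlation`, the data, and the per-point standard deviations `sigma` are all hypothetical.

```python
import math

def weighted_correlation(x, y, w):
    """Pearson correlation of x and y using non-negative observation weights w."""
    total = sum(w)
    # Weighted means of each variable.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances around those means.
    cov_xy = sum(wi * (xi - mean_x) * (yi - mean_y)
                 for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical measurements: the last point is far noisier than the rest.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 7.0]
sigma = [0.1, 0.1, 0.2, 0.2, 2.0]           # assumed measurement std. deviations
weights = [1.0 / s ** 2 for s in sigma]     # inverse-variance weights

r_weighted = weighted_correlation(x, y, weights)
r_unweighted = weighted_correlation(x, y, [1.0] * len(x))
print(r_weighted, r_unweighted)
```

With inverse-variance weights, the noisy final point contributes very little to the estimate, which is the motivation for weighting heteroscedastic data in the first place.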
Fitting a Model by Maximum Likelihood
Maximum-Likelihood Estimation (MLE) is a statistical technique for estimating model parameters. It basically sets out to answer the question: what model parameters are most likely to characterise a given set of data? First you need to select a model for the data. And the model must have one or more (unknown) parameters. As the name implies, MLE p...
5928 sym R (4871 sym/18 pcs) 2 img
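To make the maximum-likelihood idea in the entry above concrete, here is a minimal Python sketch. It assumes a normal model, which is my choice for illustration; the model used in the post is not visible in this excerpt, and the post itself is written in R. The data and the function names `normal_log_likelihood` and `fit_normal_mle` are hypothetical.

```python
import math

def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. normal data for a given mean and std. deviation."""
    n = len(data)
    squared_error = sum((value - mu) ** 2 for value in data)
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2.0 * math.pi)
            - squared_error / (2.0 * sigma ** 2))

def fit_normal_mle(data):
    """Closed-form MLE for a normal model: sample mean and 1/n std. deviation."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((value - mu) ** 2 for value in data) / n)
    return mu, sigma

# Hypothetical observations.
data = [4.9, 5.1, 5.3, 4.7, 5.0, 5.4, 4.6]

mu_hat, sigma_hat = fit_normal_mle(data)
best = normal_log_likelihood(data, mu_hat, sigma_hat)

# Any other parameter choice gives a log-likelihood no larger than the MLE's.
shifted = normal_log_likelihood(data, mu_hat + 0.1, sigma_hat)
widened = normal_log_likelihood(data, mu_hat, sigma_hat * 1.2)
print(mu_hat, sigma_hat, best, shifted, widened)
```

For models without a closed-form solution, the same log-likelihood function would instead be handed to a numerical optimiser, which searches for the parameters that maximise it.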