Publications by Andrew Collier
Escalating Life Expectancy
I’ve added mortality data to the lifespan package. A result that immediately emerges from these data is that average life expectancy is steadily climbing.
> library(lifespan)
> ggplot(deaths, aes(x = year, y = avgage)) +
+ geom_boxplot(aes(group = year, fill = sex)) +
+ facet_wrap(~ sex) +
+ labs(x = "", y = "Average Age at Death") +
+ ...
968 sym R (247 sym/1 pcs) 4 img
Life Expectancy by Country
I was rather inspired by this plot on Wikipedia’s List of Countries by Life Expectancy. Shouldn’t be too hard to reproduce with a bit of scraping. Here are the results (click on the static image to view the interactive plot): The bubble plot above compares female and male life expectancies for a number of countries. The diagonal line corres...
1277 sym 4 img
Mortality by Year and Age
Taking another look at the data from the lifespan package. The plot below shows the evolution of mortality in the US as a function of year and age. Also, following up on a suggestion from @robjohnnoble, population data have been included in the package.
> library(lifespan)
> tail(population)
    year  count
112 2011 310.50
113 2012 312.86
114 2013 3...
729 sym R (151 sym/1 pcs) 2 img
Calculating Pi using Buffon’s Needle
I put together this example to illustrate some general R programming principles for my Data Science class at iXperience. The idea is to use Buffon’s Needle to generate a stochastic estimate for pi.
> #' Exploit symmetry to limit range of centre position and angle.
> #'
> #' @param l needle length.
> #' @param t line spacing.
> #'
> buffon <- ...
858 sym R (649 sym/1 pcs) 2 img
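The idea behind the Buffon’s Needle estimate can be sketched as follows. This is a minimal illustration, not the code from the post: the function name `estimate_pi` and its defaults are hypothetical, and it assumes the needle is no longer than the line spacing (l <= t). Note that sampling the angle uses R's built-in `pi`, so this demonstrates the method rather than an independent derivation of the constant.

```r
# Sketch of a Buffon's Needle estimate of pi.
# `estimate_pi` is a hypothetical name; assumes needle length l <= line spacing t.
estimate_pi <- function(n, l = 1, t = 2) {
  stopifnot(l <= t)
  # By symmetry, the distance from the needle's centre to the nearest line
  # lies in [0, t/2] and the acute angle to the lines lies in [0, pi/2].
  x <- runif(n, min = 0, max = t / 2)
  theta <- runif(n, min = 0, max = pi / 2)
  # The needle crosses a line when its half-length projection reaches the line.
  crosses <- x <= (l / 2) * sin(theta)
  # P(cross) = 2 * l / (t * pi), so pi is approximately 2 * l / (t * P(cross)).
  2 * l / (t * mean(crosses))
}

set.seed(42)
pi_hat <- estimate_pi(1e6)
pi_hat
```

With a million simulated needles the estimate should typically land within a few hundredths of pi; the error shrinks only as the square root of the number of throws.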
Building a Life Table
After writing my previous post, Mortality by Year and Age, I’ve become progressively more interested in the mortality data. Perhaps those actuaries are onto something? I found this report, which has a wealth of pertinent information. On p. 13 the report gives details on constructing a Life Table, which is one of the fundamental tools in Actuari...
2765 sym R (748 sym/4 pcs) 2 img
Sportsbook Betting (Part 1): Odds
This series of articles was written as support material for Statistics exercises in a course that I’m teaching for iXperience. In the series I’ll be using illustrative examples for wagering on a variety of Sportsbook events including Horse Racing, Rugby and Tennis. The same principles can be applied across essentially all betting markets. Odd...
7948 sym R (1002 sym/5 pcs) 4 img
Web Scraping and “invalid multibyte string”
A couple of my collaborators have had trouble using read_html() from the xml2 package to access this Wikipedia page. Specifically they have been getting errors like this: Error in utils::type.convert(out[, i], as.is = TRUE, dec = dec) : invalid multibyte string at '<e2><94>' Since I couldn’t reproduce these errors on my machine it appea...
1425 sym R (600 sym/5 pcs)
feedeR: Reading RSS and Atom Feeds from R
I’m working on a project in which I need to systematically parse a number of RSS and Atom feeds from within R. I was somewhat surprised to find that no package currently exists on CRAN to handle this task. So this presented the opportunity for a bit of DIY. You can find the fruits of my morning’s labour here.
Installing and Loading
The packag...
1916 sym R (2034 sym/7 pcs)
Sportsbook Betting (Part 2): Bookmakers’ Odds
In the first instalment of this series we gained an understanding of the various types of odds used in Sportsbook betting and the link between those odds and implied probabilities. We noted that the implied probabilities for all possible outcomes in an event may sum to more than 100%. At first sight this seems a bit odd. It certainly appears to ...
11714 sym R (7452 sym/15 pcs) 8 img
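To make the point about implied probabilities concrete, here is a small sketch. The decimal odds are invented for illustration and are not taken from the post; "overround" is the usual name for the excess above 100%, and the proportional rescaling at the end is just one simple convention for recovering probabilities that sum to one.

```r
# Hypothetical decimal odds for a three-outcome event (illustrative values only).
odds <- c(home = 2.10, draw = 3.40, away = 3.60)

# Implied probability is the reciprocal of the decimal odds.
implied <- 1 / odds

# The excess of the total over 1 (100%) is commonly called the overround.
overround <- sum(implied) - 1

# One simple way to remove it: rescale so the probabilities sum to 1.
fair <- implied / sum(implied)

round(implied, 3)
round(overround, 3)
round(fair, 3)
```

For these made-up odds the implied probabilities total a little under 105%, which is the bookmaker's built-in margin.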
ubeR: A Package for the Uber API
Uber exposes an extensive API for interacting with their service. ubeR is an R package for working with that API which Arthur Wu and I put together during a Hackathon at iXperience.
Installation
The package is currently hosted on GitHub. Installation is simple using the devtools package.
> devtools::install_github("DataWookie/ubeR")
> library(ub...
3219 sym R (4822 sym/14 pcs) 6 img