Publications by Tony Hirst

On the Public Understanding of – and Public Engagement With – Statistics: Reflections on the OU Statistics Group Conference on “Visualisation and Presentation in Statistics”

24.05.2011

Last week I attended the OU Statistics conference on Visualisation and Presentation in Statistics (VIPS) (notes: here and here) One of the things that struck me from conversations and some of the presentations was that statistics – and in particular public engagement around statistics – appears to be lagging science efforts in this area. Whe...

6349 sym 16 img

Getting My Eye In Around F1 Quali Data – Parallel Coordinate Plots, Sort of…

30.07.2011

Looking over the sector times from the qualifying session for tomorrow’s Hungarian Grand Prix, I noticed that Vettel was only fastest in one of the sectors. Whilst looking for an easy way of shaping an R data frame so that I could plot categorical values sector1, sector2, sector3 on the x-axis, and then a line for each driver showing their time...

3867 sym 34 img

Merging Two Different Datasets Containing a Common Column With R and R-Studio

02.08.2011

Another way for the database challenged (such as myself!) for merging two datasets that share at least one common column… This recipe using the cross-platform stats analysis package, R. I use R via the R-Studio client, which provides an IDE wrapper around the R environment. So for example, here’s how to merge a couple of files sharing elemen...

3014 sym 22 img

Data Driven Story Discovery: Working Up a Multi-Layered Chart

03.08.2011

How many different dimensions (or “columns” in a dataset where each row represents a different sample and each column a different measurement taken as part of that sample) can you plot on a chart? Two are obvious: X and Y values, which are ideal for representing continuous numerical variables. If you’re plotting points, as in a scatterplot,...

6501 sym 28 img

The Visual Difference – R and Anscombe’s Quartet

30.08.2011

I spent a chunk of today trying to get my thoughts in order for a keynote presentation at next week’s The Difference that Makes a Difference conference. The theme of my talk will be on how visualisations can be used to discover structure and pattern in data, and as in many or my other recent talks I found the idea of Anscombe’s quartet once a...

7213 sym R (500 sym/3 pcs) 22 img

Using Google Spreadsheets as a Database Source for R

02.09.2011

I couldn’t contain myself (other more pressing things to do, but…), so I just took a quick time out and a coffee to put together a quick and dirty R function that will let me run queries over Google spreadsheet data sources and essentially treat them as database tables (e.g. Using Google Spreadsheets as a Database with the Google Visualisatio...

2942 sym R (195 sym/1 pcs) 18 img

Google Spreadsheets API: Listing Individual Spreadsheet Sheets in R

07.09.2011

In Using Google Spreadsheets as a Database Source for R, I described a simple Google function for pulling data into R from a Google Visualization/Chart tools API query language query applied to a Google spreadsheet, given the spreadsheet key and worksheet ID. But how do you get a list of sheets in spreadsheet, without opening up the spreadsheet a...

3434 sym 20 img

Power Tools for Aspiring Data Journalists: R

31.10.2011

Picking up on Paul Bradshaw’s post A quick exercise for aspiring data journalists which hints at how you can use Google Spreadsheets to grab – and explore – a mortality dataset highlighted by Ben Goldacre in DIY statistical analysis: experience the thrill of touching real data, I thought I’d describe a quick way of analysing the data usin...

8597 sym 24 img

How Might Data Journalists Show Their Working? Sweave

01.11.2011

If part of the role of data journalism is to make transparent the justification behind claims that are, or aren’t, backed up by data, there’s good reason to suppose that the journalists should be able to back up their own data-based claims with evidence about how they made use of the data. Posting links to raw data helps to a certain extent �...

2869 sym 16 img

Data Referenced Journalism and the Media – Still a Long Way to Go Yet?

04.11.2011

Reading our local weekly press this evening (the Isle of Wight County Press), I noticed a page 5 headline declaring “Alarm over death rates at St Mary’s”, St Mary’s being the local general hospital. It seems a Department of Health report on hospital mortality rates came out earlier this week, and the Island’s hospital, it seems, has not...

7676 sym 22 img