Publications by Tony Hirst

Getting Started With Twitter Analysis in R

09.11.2011

Earlier today, I saw, via the aggregating R-Bloggers service, a post on Using Text Mining to Find Out What @RDataMining Tweets are About. The post provides a walkthrough of how to grab tweets into an R session using the twitteR library, and then do some text mining on them. I’ve been meaning to have a look at pulling Twitter bits into R for ...

1694 sym R (1521 sym/4 pcs) 20 img

Accessing and Visualising Sentencing Data for Local Courts

29.11.2011

A recent provisional data release from the Ministry of Justice contains sentencing data from English(?) courts, at the offence level, for the period July 2010-June 2011: “Published for the first time every sentence handed down at each court in the country between July 2010 and June 2011, along with the age and ethnicity of each offender.” Cri...

6601 sym 36 img

More Dabblings With Local Sentencing Data

01.12.2011

In Accessing and Visualising Sentencing Data for Local Courts I posted a couple of quick ways in to playing with Ministry of Justice sentencing data for the period July 2010-June 2011 at the local court level. At the end of the post, I wondered about how to wrangle the data in R so that I could look at percentage-wise comparisons between differen...

2889 sym 20 img

Rescuing Twapperkeeper Archives Before They Vanish

10.12.2011

A couple of years or so ago, various JISC folk picked up on the idea that there might be value in them thar tweets and started encouraging the use of Twapperkeeper for archiving hashtagged tweets around events, supporting the development of that service in exchange for an open source version of the code. Since then, Twapperkeeper has been sold on...

2776 sym R (1285 sym/1 pcs) 16 img

Rescuing Twapperkeeper Archives Before They Vanish, Redux

11.12.2011

In Rescuing Twapperkeeper Archives Before They Vanish, I described a routine for grabbing Twapperkeeper archives, parsing them, and saving them to a local desktop file using the R programming language (downloading RStudio is the easiest way I know of getting R…). Following a post from @briankelly (Responding to the Forthcoming Demise of Twapper...

2211 sym R (2755 sym/1 pcs) 16 img

A Tool Chain for Plotting Twitter Archive Retweet Graphs – Py, R, Gephi

21.12.2011

Another set of stepping stones that provide a clunky route to a solution that @mhawksey has been working on a far more elegant expression of (eg Free the tweets! Export TwapperKeeper archives using Google Spreadsheet and Twitter: How to archive event hashtags and create an interactive visualization of the conversation)… The recipe is as follows...

2729 sym R (1556 sym/3 pcs) 20 img

Over on F1DataJunkie, 2011 Season Review Doodles…

30.12.2011

Things have been a little quiet here, post-wise, of late, in part because of the holiday season… but I have been posting notes on a couple of charts in progress over on the F1DataJunkie blog. Here are links to the posts in chronological order – they capture the evolution of the chart design(s) to date: F1 2011 Progress Throughout the Year F1...

1909 sym 16 img

Amateur Mapmaking: Getting Started With Shapefiles

13.01.2012

One of the great things about (software) code is that people build on it and out from it… Which means that as well as producing ever more complex bits of software, tools also get produced over time that make it easier to do things that were once hard to do, or required expensive commercial software tools. Producing maps is a fine example of thi...

3393 sym R (3468 sym/2 pcs) 22 img

A Quick View Over a MASHe Google Spreadsheet Twitter Archive of UKGC12 Tweets

20.01.2012

Following on from A Tool Chain for Plotting Twitter Archive Retweet Graphs – Py, R, Gephi, here’s a quick summary view over #UKGC12 tweets saved in a Google Spreadsheet archive, as developed by Martin Hawksey, generated from an R script (R code available here; #ukgc12 tweet archive here)… (I did mean to tidy these up, add in titles etc et...

1507 sym 22 img

Social Media Interest Maps of Newsnight and BBCQT Twitterers

26.01.2012

I grabbed independent samples of 1500 recent users of the #newsnight and #bbcqt hashtags within a minute or two of each other about half an hour ago. Here’s who’s followed by 25 or more of the recent hashtaggers in each case. Can you distinguish the programmes each audience interest projection map relates to? Here’s the first one – are th...

4728 sym R (1284 sym/3 pcs) 28 img