Publications by Adventures in Data

A new blogging workflow

23.11.2015

Just discovered the Jekyll and GitHub Pages combination. Perfect for a very simple static blog site. I also found a pretty neat integration with RMarkdown which suits me perfectly, as I am mainly using R and will be posting bits and pieces of that. A summary of how it works… Install Jekyll in a local directory and build the site on your disk. ...

1604 sym R (26 sym/1 pcs)

Geographic clustering of UK cities

23.11.2015

I know I am probably late to this party but I recently found out about DBSCAN or “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise”[^1]. In a nutshell, the algorithm visits successive data points and asks whether neighbouring points are density-reachable. In other words, is it possible to connect two poi...

2259 sym R (566 sym/4 pcs) 2 img
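
The density-reachability idea in the DBSCAN excerpt above can be illustrated with a small sketch. This is not the code from the post (which uses R); it is a minimal pure-Python version written for illustration. The function names `region_query` and `dbscan`, the parameter names `eps` and `min_pts`, and the toy coordinates are all assumptions, and plain Euclidean distance stands in for whatever geographic distance a real analysis of city locations would need.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            # Not dense enough to start a cluster; may still become
            # a border point of a cluster found later.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                # j is itself a core point, so everything near it is
                # density-reachable from the starting point too.
                queue.extend(j_neighbours)
    return labels


# Hypothetical toy data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # prints [0, 0, 0, 1, 1, 1, -1]
```

The isolated point is labelled as noise because fewer than `min_pts` points lie within `eps` of it, and no core point reaches it.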

Extracting data from Salesforce

06.12.2015

I have been asked to access and present some of our own internal data, stored on a CRM system called Salesforce. Luckily for me, someone had already written a set of R bindings for it (phew). As always, thanks a million to the authors of RForcecom, details of which can be found here. A while ago, I was struggling to extract data from objects because...

1316 sym R (441 sym/4 pcs)

Downloading your twitter feed in R

06.12.2015

The twitteR library is one of the most comprehensive R bindings for an API that I have ever seen. Thanks a million to Jeff Gentry for authoring and maintaining it. It took me a while to get to know it and I thought I’d share a little trick here. I recently wanted to trace the generation of my personal data from a morning in my life (possibly mo...

1391 sym R (562 sym/4 pcs)

Seasonal mortality trend decomposition

10.12.2015

I recently wrote a blog post on trends and seasonal variation in fruit and veg wholesale prices provided by DEFRA. It used a beautiful technique called ‘STL’ or seasonal-trend decomposition via loess[^1]. Just now I spotted a dataset from the Office for National Statistics on winter mortality. ONS highlight that: last winter was a particularl...

2312 sym R (2043 sym/8 pcs) 4 img

Language categorisation of Star Wars character names

05.01.2016

EDIT: Thanks to timvink for bringing to my attention library(ggrepel), which can be installed via devtools::install_github("slowkow/ggrepel"). It fixes the overlap of labels in the first figure below. Great! This is something I have been looking for! Last year in December, in time for the big release, a colleague at The Data Lab and I were ha...

3921 sym R (3536 sym/8 pcs) 4 img

Language categorisation of Star Wars character names

05.01.2016

Last year in December, in time for the big release, a colleague at The Data Lab and I were having some fun with some Star Wars character names that we had scraped from Wikipedia. Luckily for us, a national outlet, The Scotsman, picked up on this and put out an article on their website. We had the idea one lunchtime to attempt to cluster the Sta...

3659 sym R (3418 sym/5 pcs) 4 img