Publications by R on datawookie
Durban Twitter Analysis
I was invited to give a talk at Digifest (Durban University of Technology) on 10 November 2017. Looking at the other speakers and talks on the programme I realised that my normal range of topics would not be suitable. I needed to do something more in line with their mission to “celebrate the creative spirit through multimedia projects from disc...
5057 sym 12 img
Analysis of Feedback from satRday [Cape Town] 2017
We recently announced the second satRday (Cape Town) conference scheduled to take place on 17 March 2018. Obviously we want this to be bigger and better than this year’s event, so we are paying careful attention to the feedback that we received from the first event. This is a quick analysis of the feedback. We sold 192 tickets and gave out 11 c...
5651 sym 36 img 1 tbl
Variable Names: Camel Case to Underscore Delimited
A project I’m working on has a bunch of different data sources. Some of them have column names in Camel Case. Others are underscore delimited. My OCD rebels at this disarray and demands either one or the other. If it were just a few columns and I was only going to have to do this once, then I’d probably just quickly do it by hand. But there a...
1270 sym R (92 sym/1 pcs)
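The excerpt above is cut off before it shows the approach taken in the post, so what follows is only a minimal sketch of one way to do the conversion in base R. The helper name camel_to_underscore and the example column names are hypothetical and do not come from the original post.

```r
# Hypothetical helper (not from the original post): convert Camel Case
# names to underscore-delimited lower case.
camel_to_underscore <- function(x) {
  # Insert an underscore between a lowercase letter or digit and the
  # uppercase letter that immediately follows it.
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x, perl = TRUE)
  # Lower-case everything so that all names share one convention.
  tolower(x)
}

# Made-up column names for illustration only.
columns <- c("customerId", "OrderDate", "total_amount")
camel_to_underscore(columns)
```

Names that are already underscore delimited pass through unchanged, so the same function can be applied to every data source. Applied to a data frame, this would look like names(df) <- camel_to_underscore(names(df)).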
Installing rJava on Ubuntu
Installing the rJava package on Ubuntu is not quite as simple as installing most other R packages. Here are some quick notes on how to do it. Install the Java Runtime Environment (JRE). sudo apt-get install -y default-jre Install the Java Development Kit (JDK). sudo apt-get install -y default-jdk Update where R expects to find various Java files. sudo R CMD javarec...
831 sym R (118 sym/4 pcs)
Tips for Lightning Talks
It seems a little counter-intuitive, but a 5-minute lightning talk is far more difficult to prepare (and present!) than a standard talk of 20 minutes or longer. The principal challenge is fitting everything that you want to say into the allotted time, while still maintaining an engaging narrative. At the recent satRday conference in Cape Town (17 Mar...
2486 sym
Classification: Get the Balance Right
For classification problems, the positive class (which is what you’re normally trying to predict) is often sparsely represented in the data. Unless you do something to address this imbalance, your classifier is likely to be rather underwhelming. Achieving a reasonable balance in the proportions of the target classes is seldom emphasised. Per...
4141 sym R (6904 sym/16 pcs) 2 img
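The excerpt above does not show which balancing technique the post uses, so the following is only a sketch of one common option: downsampling the majority class in base R. The data frame, its column names and the class proportions are all made up for illustration.

```r
set.seed(13)

# Made-up data: 950 negative and 50 positive observations.
data <- data.frame(
  x = rnorm(1000),
  class = factor(c(rep("no", 950), rep("yes", 50)))
)

# Group the row indices by class and find the size of the smallest class.
index_by_class <- split(seq_len(nrow(data)), data$class)
n_minority <- min(lengths(index_by_class))

# Draw the same number of rows from every class, without replacement.
balanced_index <- unlist(
  lapply(index_by_class, function(idx) idx[sample.int(length(idx), n_minority)]),
  use.names = FALSE
)
balanced <- data[balanced_index, ]

# Both classes now have the same number of rows.
table(balanced$class)
```

Downsampling throws away majority-class rows, so it is only reasonable when enough data remain afterwards. It should also be applied to the training data only, so that the test data keep the original class proportions.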
Travelling Salesman with ggmap
I’ve been testing out some ideas around the Travelling Salesman Problem using TSP and ggmap. For illustration I’ll find the optimal route between the following addresses: ADDRESSES = c( "115 St Andrew's Drive, Durban North, KwaZulu-Natal, South Africa", "1 Evans Road, Glenwood, Berea, KwaZulu-Natal, South Africa", "7 Radar Drive, Durban...
1356 sym R (2597 sym/5 pcs) 2 img
eRum (2018) Top Twenty
My Top 20 highlights about eRum (2018) in Budapest. In no particular order: Returning to my favourite European city after so many years. Discovering the cheap and efficient bus 100E, which shuttles back and forth between the airport and city. I have previously only made this trip by car. Partial support from Toptal to attend the conference. Than...
2339 sym 2 img
Updating R on Ubuntu
Today I finally got around to updating my R to 3.5 (or, more specifically, 3.5.1). The complete instructions for doing the update on Ubuntu are available here. I’ve paraphrased them below. Authentication Key To ensure the integrity of files, add the appropriate public key to your system. You may have already done this, in which case you can ski...
1381 sym R (508 sym/5 pcs)
Diagnosing RStudio Startup Issues
Yesterday I tried to start RStudio and something weird happened: the window launched but it was blank and unresponsive. I tried dpkg --remove and then re-installed. Same problem. I tried dpkg --remove followed by dpkg --purge and then re-installed. Same problem. I renamed my .R folder. Still the same problem. A sense of desperation was beginning ...
1360 sym R (25 sym/1 pcs)