Publications by Steph

Giving back with code

20.07.2016

From code in answers on Stack Overflow to R packages or full programs, there’s a lot of code being written and given away. This post examines some of the reasons why the people writing all that code do it, why you should consider giving back with code, and how you can get started. Finally, I cap it all off with perspectives from some of ...

9914 sym 12 img

HIBPwned updated on CRAN

15.09.2016

Haveibeenpwned.com is a fantastic service that helps people find out if they’ve been involved in a data breach. HIBPwned is an R wrapper for that service. Recently, due to abuse of the system, Troy Hunt had to add a limit of one request per 1.5s. The new version published on CRAN last night adds a delay into each call so that we can cont...

918 sym

GirlswithDeepPockets.com

02.10.2016

Ok, this post is about one of my latest crazy/harebrained/whacky ideas. I’m fed up of having to carry my Galaxy Note 3 in my hand. I can’t stand handbags and most women’s clothing items don’t have pockets or the pockets are insufficient. Given how easy it is to build a website these days, I thought I’d become a sofa warrior for t...

1935 sym

Slack all the things!

21.10.2016

Slack all the things! OK, if you haven’t heard of it before, Slack is kinda like IRC, kinda like Dropbox, kinda like a lot of things – it’s a neat place to bring together communications between your team or community, and the integrations allow you to pipe in external feeds like Twitter activity or RSS. It’s a great way of collabora...

2177 sym 4 img

CRISP-DM and why you should know about it

13.01.2017

The Cross Industry Standard Process for Data Mining (CRISP-DM) was a concept developed 20 years ago now. I’ve read about it in various data mining and related books and it’s come in very handy over the years. In this post, I’ll outline what the model is and why you should know about it, even if it has that terribly out of vogue phras...

6467 sym 4 img

Is my time series additive or multiplicative?

20.02.2017

Time series data is an important area of analysis, especially if you do a lot of web analytics. To be able to analyse time series effectively, it helps to understand the interaction between general seasonality in activity and the underlying trend. The interactions between trend and seasonality are typically classified as either additive or multip...

6371 sym R (1903 sym/8 pcs) 4 img 7 tbl

Quick tip: knitr Python Windows setup checklist

22.02.2017

One of the nifty things about using R is that you can use it for many different purposes and even other languages! If you want to use Python in your knitr docs or the newish RStudio R notebook functionality, you might encounter some fiddliness getting all the moving parts running on Windows. This is a quick knitr Python Windows setup checklist to...

1484 sym 2 img

Announcing community R workshops

27.02.2017

A big part of why I’ve launched Locke Data is so that I can give back more to my communities. I want to give more time and more support to others. One of the first steps is doing some activities that give financial support to community groups without damaging my startup cashflow! Community R workshops that fund local user groups are the first ac...

2652 sym

R Quick tip: Microsoft Cognitive Services’ Text Analytics API

01.03.2017

Today in class, I taught some fundamentals of API consumption in R. As it was aligned to some Microsoft content, we first used HaveIBeenPwned.com’s API and then played with Microsoft Cognitive Services’ Text Analytics API. This brief post overviews what you need to get started, and how you can chain consecutive calls to these APIs in order to...

2287 sym R (1382 sym/7 pcs) 2 img 1 tbl

Community workshops

14.04.2017

Following on from when we announced the availability of our community workshops, we’ve got three in the next three months that folks can attend. May 19th – Data science project in a day: We’ll be in Kiev, Ukraine, doing a whole data science project in a day. This is intended to give people a little bit of code, process, and critical thinking...

2034 sym