Publications by Steph

R Quick Tip: Table parameters for rmarkdown reports

19.04.2017

The recent(ish) advent of parameters in rmarkdown reports is pretty nifty, and there’s a little bit of behaviour that comes in handy but doesn’t come across in the documentation: you can use table parameters for rmarkdown reports. Previously, if you wanted to produce multiple reports based off a dataset, you would make the dataset available...

Logistic regressions (in R)

21.04.2017

Logistic regressions are a great tool for predicting outcomes that are categorical. They use a transformation function based on probability (the logit, or log-odds) to perform a linear regression. This makes them easy to interpret and implement in other systems. Logistic regressions can be used to perform a classification for things like determining whether someon...
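
As a minimal sketch of the idea in this excerpt (not the code from the post itself), the example below fits a logistic regression with base R's glm(). The choice of the built-in mtcars data, the am ~ wt formula, and the 0.5 cut-off are all assumptions made purely for illustration.

```r
# Fit a logistic regression on the built-in mtcars data.
# am (0 = automatic, 1 = manual) is the binary outcome, wt the predictor.
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Coefficients are on the log-odds scale; exponentiate to get odds ratios.
odds_ratios <- exp(coef(fit))

# Predicted probabilities for each car, then an assumed 0.5 cut-off
# to turn probabilities into a 0/1 classification.
prob_manual <- predict(fit, type = "response")
pred_class <- ifelse(prob_manual > 0.5, 1, 0)

# Share of cars classified correctly on the training data.
accuracy <- mean(pred_class == mtcars$am)
print(odds_ratios)
print(accuracy)
```

Because the model is linear on the log-odds scale, the fitted coefficients are all that another system needs in order to reproduce the predictions.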

Building an R training environment

24.04.2017

I recently delivered a day of training at SQLBits and I really upped my game in terms of infrastructure for it. The resultant solution was super smooth and mitigated all the install issues and preparation for attendees. This meant we got to spend the whole day doing R, instead of troubleshooting. I’m so happy with the solution for an onl...

R Quick Tip: Upload multiple files in shiny and consolidate into a dataset

28.04.2017

In shiny, you can use fileInput() with the parameter multiple = TRUE to upload multiple files at once. But how do you process those multiple files in shiny and consolidate them into a single dataset? The bit we need from shiny is the input$param$fileinputpath value. We can use lapply() with data.table’s fread() to read multipl...
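
The approach described here can be sketched roughly as follows, assuming the data.table package is installed. The helper name consolidate_uploads() and the fileInput id "files" are hypothetical, and the temporary CSV files only stand in for real uploads so that the sketch runs outside shiny. In a shiny server function, the upload paths are found in the datapath column of the data frame that fileInput returns.

```r
library(data.table)

# Hypothetical helper: read every uploaded file and stack the results.
# In a shiny server function, `paths` would typically be input$files$datapath,
# where "files" is an assumed fileInput id.
consolidate_uploads <- function(paths) {
  rbindlist(lapply(paths, fread), use.names = TRUE, fill = TRUE)
}

# Stand-in for two uploaded files so the sketch runs without shiny.
tmp <- c(tempfile(fileext = ".csv"), tempfile(fileext = ".csv"))
fwrite(data.table(id = 1:2, value = c(10, 20)), tmp[1])
fwrite(data.table(id = 3:4, value = c(30, 40)), tmp[2])

combined <- consolidate_uploads(tmp)
print(combined)
```

Using fill = TRUE lets files with slightly different columns still be combined, with missing values filled in as NA.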

Getting started with data science – recommended resources

02.05.2017

An oft-asked question is: what resources can I recommend for getting started with data science? Here are my recommendations, and if you have others, please put them in the comments! NB: links in this post may be affiliate links – it doesn’t change the prices you get but might earn me a little money. Books: Data Science for Business, Data Scien...

The making of datasauRus

02.05.2017

Around 8:30pm I saw this tweet and duly retweeted it: https://t.co/WuyU9D6npK (Richie Cotton, @richierocks, May 1, 2017). It turns out awesome folks, George and Justin, had made a process whereby they can generate different distributions of points that retain the same summary statistics. They used this process for making some friends for Dino the...

Error installing latest R version (3.4.0) on Windows

03.05.2017

If you’re getting the following error when you’ve installed R 3.4.0 on Windows, you’re not alone: "Error in if (file.exists(dest) && file.mtime(dest) > file.mtime(lib) && : missing value where TRUE/FALSE needed". The R team have released a patched version but right now it’s a little difficult to find out about. If you need/want the p...

Minor update to HIBPwned

05.05.2017

A new version of HIBPwned has been accepted onto CRAN. This occurred yesterday so it could still be filtering into some mirrors. HIBPwned is an R wrapper for the useful website HaveIBeenPwned and if you don’t already utilise the package or the site – you should. HaveIBeenPwned tells you when your details are included in data breaches. This is...

R Quick Tip: parameter re-use within rmarkdown YAML

08.05.2017

Ever wondered how to make an rmarkdown title dynamic? Maybe you wanted to use a parameter in multiple locations, or to pass through a publication date? Advanced use of YAML headers can help! Normally, when we write rmarkdown, we might use something like the basic YAML header that the rmarkdown template gives us. --- title: "My report" date...

datasauRus now on CRAN

09.05.2017

datasauRus is a package storing the datasets from the paper Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. It’s a useful package for: having a dinosaur dataset; showing a dinosaur-related variant of Anscombe’s Quartet. You can now get datasauRus on CRAN, though i...
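
To illustrate the "same stats, different graphs" idea without installing anything, the sketch below uses Anscombe's Quartet as shipped with base R (the anscombe data frame) rather than the datasauRus datasets themselves. It is a stand-in for the general idea, not code from the package or the post.

```r
# anscombe ships with base R (datasets package): columns x1..x4 and y1..y4.
data(anscombe)

# Mean of each x and each y column.
x_means <- sapply(anscombe[, paste0("x", 1:4)], mean)
y_means <- sapply(anscombe[, paste0("y", 1:4)], mean)

# Correlation of each x/y pair.
xy_cor <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})

# One row per set: the summary statistics are near-identical across sets.
summary_table <- data.frame(
  set = 1:4,
  mean_x = unname(x_means),
  mean_y = unname(y_means),
  cor_xy = xy_cor
)
print(round(summary_table, 2))
```

Plotting each x/y pair (for example with plot(anscombe$x1, anscombe$y1)) shows how different the four sets look despite the matching summaries.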
