Publications by Gary Hutson

Build and improve a Machine Learning Classification model with TidyModels and R

25.05.2021

This set of tutorials arose from my desire to use as many machine learning packages as possible. My favourites remain tensorflow, caret, scikit-learn and now TidyModels. Why TidyModels? Instead of replacing the modelling package, tidymodels replaces the interface. Better said, tidymodels provides a single set of functions and argum...

3387 sym 8 img

DTPlyr – easier data.table for DPLYR users

08.06.2021

Do you program in R and normally use DPLYR for data wrangling, manipulation or whatever you call it? Have you heard all the hype about data.table and how this package can significantly improve the run time of your R scripts? Have you been meaning to get round to learning data.table and never managed it? The answer to these q...

2929 sym R (693 sym/5 pcs) 8 img

OddsPlotty has landed on CRAN

22.06.2021

I am so excited that my package OddsPlotty has landed on CRAN. I worked on this package when I was doing lots of comparisons of logistic regression models and wanted a way to visualise the odds ratios on a graph, i.e. an odds plot. This package allows for the generation of odds plots and their associated tibble outputs. It can be us...

2506 sym 8 img

Foghorn package – find out pending CRAN packages in the pipeline

12.07.2021

I have just pushed my fourth package to CRAN. I will do a separate post on this, but the FeatureTerminatoR package has been built to perform automated feature selection, utilising methods such as recursive partitioning and multicollinearity purging; other types will be built into the second version. Installing the package: The package cu...

2380 sym R (240 sym/2 pcs) 4 img

FeatureTerminatoR – a package to remove unimportant variables from statistical and machine learning models automatically

15.07.2021

The motivation for this package is simple: while there are many packages that do similar things, few of them perform automated removal of features from your models. This was the motivation, plus having them all in one location so that you can easily find them; otherwise you would be looking through Caret, Tidymodels and mlr3 documentation al...

5663 sym 4 img

ConfusionTableR has made it to CRAN

21.07.2021

In my dusty GitHub repository, sitting there, was a gem of a tool for tidying the outputs of a machine learning classification model into a record and row-level view for storage in databases. It has taken me time to get this to CRAN, as the dreaded 'closure is not subsettable' error plagued me on the CRAN checks in devtools for many months. I have now had...

2724 sym 4 img

Tracking and getting download statistics for your R packages

08.09.2021

I had the privilege of tapping into the R package funding stream to fund my first, and not last, CRAN package entitled NHSDataDictionaRy. The motivation for the package was to provide a consistent way to scrape the live NHS Data Dictionary website. The aim was to allow the lookups to always be up to date and to allow R users in the NHS to quickly...

3031 sym R (1399 sym/3 pcs) 2 img

Common mistakes we Data Scientists make

17.09.2021

DISCLAIMER: I am a data scientist and have made all these mistakes, but I have had the privilege of sitting on the managerial, project lead and developer sides of the fence. Here are some tips for getting your stakeholders (i.e. anyone involved in the project team or with an interest in the success of the project) on board and delivering a succes...

10234 sym

Roll up, roll up the NHS-R Community Conference 2021 is coming to town

23.09.2021

The conference will be held virtually and will kick off: Monday 8th November – Wednesday 10th November. Featuring: Main Conference Events. This will include more workshops, lightning talks and plenary sessions. And the week before we will have lots of hands-on workshops: Monday 1st November – Friday 5th November. Series of workshops for all...

2967 sym 2 img

Crash Course in R Model Deployment with Docker and friends

01.10.2021

I have put together a complete guide to model training, Dockerfile creation and then consuming your API in R. This arose as part of a workshop the NHS-R Community are doing around R in Production: show and tell, but instead of just making it local, I thought I would open up my part of the tutorial to everyone. The process follows this gen...

2414 sym 2 img