Publications by Gavin L. Simpson

Introducing gratia

23.10.2018

I use generalized additive models (GAMs) in my research work. I use them a lot! Simon Wood’s mgcv package is an excellent set of software for specifying, fitting, and visualizing GAMs for very large data sets. Despite recently dabbling with brms, mgcv is still my go-to GAM package. The only down-side to mgcv is that it is not very tidy-aware an...

9360 sym R (2369 sym/13 pcs) 10 img

Confidence intervals for GLMs

10.12.2018

You’ve estimated a GLM or a related model (GLMM, GAM, etc.) for your latest paper and, like a good researcher, you want to visualise the model and show the uncertainty in it. In general this is done using confidence intervals, typically with 95% coverage. If you remember a little bit of theory from your stats classes, you may recall that such a...

10694 sym R (4703 sym/19 pcs) 6 img

Tibbles, checking examples, & character encodings

22.01.2019

Recently I’ve been preparing my gratia package for submission to CRAN. During my pre-flight testing I noticed an issue under Windows checking the examples in the package against the reference output I generated on Linux. In the latest release of the tibble package, the way tibbles are printed has changed subtly and in a way that leads to cross-...

4687 sym R (3328 sym/7 pcs)

radian: a modern console for R

18.06.2019

Whenever I’m developing R code or writing data wrangling or analysis scripts for research projects that I work on I use Emacs and its add-on package Emacs Speaks Statistics (ESS). I’ve done so for nigh on a couple of decades now, ever since I switched full time to running Linux as my daily OS. For years this has served me well, though I would...

6398 sym R (614 sym/6 pcs) 8 img

Pivoting tidily

25.10.2019

One of the fun bits of my job is that I have actual time dedicated to helping colleagues and grad students with statistical or computational problems. Recently I’ve been helping one of our Lab Instructors with some data from their Plant Physiology Lab course. Whilst I was writing some R code to import the raw data for the lab from an Excel...

12035 sym R (5854 sym/14 pcs) 10 img

Rendering your README with GitHub Actions

30.04.2020

There’s one thing that has bugged me for a while about developing R packages. We have all these nice, modern tools for tracking our code, producing web sites from the roxygen documentation, and so on. Yet for every code commit I make to the master branch of a package repo, there are often two or more additional steps I need to take to ke...

12165 sym R (3143 sym/11 pcs)

gratia 0.4.1 released

31.05.2020

After a slight snafu related to the 1.0.0 release of dplyr, a new version of gratia is out and available on CRAN. This release brings a number of new features, including differences of smooths, partial residuals on partial plots of univariate smooths, and a number of utility functions, while under the hood gratia works for a wider range of models...

3937 sym R (2525 sym/8 pcs) 6 img

Extrapolating with B splines and GAMs

03.06.2020

An issue that often crops up when modelling with generalized additive models (GAMs), especially with time series or spatial data, is how to extrapolate beyond the range of the data used to train the model. The issue arises because GAMs use splines to learn from the data using basis functions. The splines themselves are built from basis functions ...

16608 sym R (10906 sym/32 pcs) 14 img

Two new versions of gratia released

30.01.2021

While the Covid-19 pandemic and teaching a new course in the fall put paid to most of my development time last year, some time off work this January allowed me time to work on gratia again. I released 0.5.0 to CRAN in part to fix an issue with tests not running on the new M1 chips from Apple because I wasn’t using vdiffr conditionally...

8333 sym R (4885 sym/15 pcs) 6 img

Getting data from the Canada Covid-19 Tracker using R

31.01.2021

Last semester (Fall 2020) I taught a new course in healthcare data science for the Johnson Shoyama Graduate School in Public Policy. One of the final topics of the course was querying application programming interfaces (APIs) from within R. The example we used was querying data on the Covid-19 pandemic from the Covid-19 Tracker Canada, which has ...

5816 sym R (4408 sym/16 pcs) 4 img