Publications by Very statisticious on Very statisticious

Getting started with emmeans

24.03.2019

Package emmeans (formerly known as lsmeans) is enormously useful for folks wanting to do post hoc comparisons among groups after fitting a model. It has a very thorough set of vignettes (see the vignette topics here), is very flexible with a ton of options, and works out of the box with a lot of different model objects (and can be extended to oth...

Custom contrasts in emmeans

14.04.2019

Following up on a previous post, where I demonstrated the basic usage of package emmeans for doing post hoc comparisons, here I’ll demonstrate how to make custom comparisons (aka contrasts). These are comparisons that aren’t encompassed by the built-in functions in the package. Remember that you can explore the available built-in emmeans func...

Embedding subplots in ggplot2 graphics

21.04.2019

The idea of embedded plots for visualizing a large dataset that has an overplotting problem recently came up in some discussions with students. I first learned about embedded graphics from package ggsubplot. You can still see an old post about that package and about embedded graphics in general, with examples. However, ggsubplot is no longer main...

The small multiples plot: how to combine ggplot2 plots with one shared axis

12.05.2019

There are a variety of ways to combine ggplot2 plots with a single shared axis. However, things can get tricky if you want a lot of control over all plot elements. I demonstrate three different approaches for this:
1. Using facets, which is built in to ggplot2 but doesn’t allow much control over the non-shared axes.
2. Using package cowplot, wh...

Many similar models – Part 1: How to make a function for model fitting

23.06.2019

I worked with several students over the last few months who were fitting many linear models, all with the same basic structure but different response variables. They were struggling to find an efficient way to do this in R while still taking the time to check model assumptions. A first step when working towards a more automated process for fittin...

Many similar models – Part 2: Automate model fitting with purrr::map() loops

21.07.2019

When we have many similar models to fit, automating at least some portions of the task can be a real time saver. In my last post I demonstrated how to make a function for model fitting. Once you have made such a function it’s possible to loop through variable names and fit a model for each one. In this post I am specifically focusing on having ...

More exploratory plots with ggplot2 and purrr: Adding conditional elements

26.09.2019

This summer I was asked to collaborate on an analysis project with many response variables. As usual, I planned on automating my initial graphical data exploration through the use of functions and purrr::map() as I’ve written about previously. However, this particular project was a follow-up to a previous analysis. In the original analysis, dif...

Expanding binomial counts to binary 0/1 with purrr::pmap()

03.10.2019

Data on successes and failures can be summarized and analyzed as counted proportions via the binomial distribution or as long format 0/1 binary data. I most often see summarized data when there are multiple trials done within a study unit; for example, when tallying up the number of dead trees out of the total number of trees in a plot. If these ...

Making a background color gradient in ggplot2

13.10.2019

I was recently making some arrangements for the 2020 eclipse in South America, which of course got me thinking of the day we were lucky enough to have a path of totality come to us. We have a weather station that records local temperature every 5 minutes, so after the eclipse I was able to plot the temperature change over the eclipse as we exper...

An example of base::split() for looping through groups

26.11.2019

I recently had a question from a client about the simplest way to subset a data.frame and apply a function to each subset. “Simplest” could mean many things, of course, since what is simple for one person could appear very difficult to another. In this specific case I suggested using base::split() as a possible option since it is one I find f...
