Publications by mrtnj
Using R: reshape2 to tidyr
Tidy data: it’s one of those terms that tend to confuse people, and certainly confused me. It’s Codd’s third normal form, but you can’t go around telling that to people and expect to be understood. One form is “long”, the other is “wide”. One form is “melted”, another “cast”. One form is “gathered”, the other “spr...
3702 sym R (995 sym/5 pcs) 14 img
Using R: the best thing I’ve changed about my code in years
Hopefully, one’s coding habits are constantly improving. If you feel any doubt about yourself, I suggest looking back at something you wrote in 2011. One thing I’ve changed recently that made my life so much better is a simple, silly thing: meaningful names for index and counter variables. Take a look at these pieces of fake code, that both loop o...
1574 sym R (385 sym/1 pcs)
Showing a difference in means between two groups
Visualising a difference in means between two groups isn’t as straightforward as it should be. After all, it’s probably the most common quantitative analysis in science. There are two obvious options: we can either plot the data from the two groups separately, or we can show the estimate of the difference with an interval around it. A swarm of do...
1868 sym R (1397 sym/3 pcs) 4 img
Using R: plotting the genome on a line
Imagine you want to make a Manhattan-style plot or anything else where you want a series of intervals laid out on one axis, one after another. If it’s actually a Manhattan plot you may have a friendly R package that does it for you, but here is how to cobble the plot together ourselves with ggplot2. We start by making some fake data. Here, we ha...
2510 sym R (1772 sym/6 pcs) 2 img
What single step does with relationship
We had a journal club about the single step GBLUP method for genomic evaluation a few weeks ago. In this post, we’ll make a few graphs of how the single step method models relatedness between individuals. Imagine you want to use genomic selection in a breeding program that already has a bunch of historical pedigree and trait information. You co...
3884 sym 8 img
‘Simulating genetic data with R: an example with deleterious variants (and a pun)’
A few weeks ago, I gave a talk at the Edinburgh R users group EdinbR on the RAGE paper. Since this is an R meetup, the talk concentrated on the mechanics of genetic data simulation, with the paper as a case study. I showed off some of what Chris Gaynor’s AlphaSimR can do, and how we built on that to make the specifics of this simulation stud...
1958 sym R (2604 sym/1 pcs) 8 img
Using R: When weird errors occur in packages that used to work, check that you’re not feeding them a tibble
There are some things that are great about the tidyverse family of R packages and the style they encourage. There are also a few gotchas. Here’s a reminder to myself about this phenomenon: tidyverse-style data frames (“tibbles”) do not simplify to vectors upon extracting a single column with hard bracket indexing. Because some packages rely...
2101 sym R (1370 sym/2 pcs)
Using R: Animal model with simulated data
Last week’s post just happened to use MCMCglmm as an example of an R package that can get confused by tibble-style data frames. To make that example, I simulated some pedigree and trait data. Just for fun, let’s look at the simulation code, and use MCMCglmm and AnimalINLA to get heritability estimates. First, here is some AlphaSimR code that ...
2506 sym R (3710 sym/3 pcs) 2 img
Exploratory analysis of a banana
This post is just me amusing myself by exploring a tiny data set I have lying around. The dataset and the code are on GitHub. In 2014 (I think), I was teaching the introductory cell biology labs (pictures in the linked post) in Linköping. We were doing a series of simple preparations to look at cells and organelles: a cheek swab gives you a view ...
6260 sym R (6648 sym/8 pcs) 16 img
#TidyTuesday: horror films, squirrels and commuters
Tidy Tuesday is a fun weekly activity where a lot of R enthusiasts make different visualisations, and possibly models, of the same dataset. You can read more about it at their GitHub page. I participated for three weeks, and here is a recap. I will show excerpts of the code, but you can read the whole thing by clicking through to GitHub. 2019-...
10419 sym R (6763 sym/11 pcs) 24 img