Publications by mrtnj
Toying with models: The Game of Life with selection
Conway’s Game of Life is probably the most famous cellular automaton, consisting of a grid of cells developing according to simple rules. Today, we’re going to add mutation and selection to the game, and let patterns evolve. The fate of a cell depends on the number of cells that live in the neighbouring positions. A cell with fewer than two ...
6188 sym R (2841 sym/3 pcs) 40 img
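The neighbour-counting rule described in the excerpt above can be sketched in base R. This is a minimal sketch of one generation of the standard Game of Life only, not the post's code: the function name `life_step` is hypothetical, cells outside the grid are assumed dead, and the mutation and selection steps the post adds are not shown.

```r
# One generation of the standard Game of Life on a logical matrix.
# Assumption: cells outside the grid count as dead (no wrap-around).
life_step <- function(grid) {
  n_row <- nrow(grid)
  n_col <- ncol(grid)
  # Pad with a border of dead cells so every cell has eight neighbours
  padded <- matrix(0L, n_row + 2, n_col + 2)
  padded[2:(n_row + 1), 2:(n_col + 1)] <- grid
  # Count live neighbours by summing the eight shifted copies of the grid
  neighbours <- matrix(0L, n_row, n_col)
  for (d_row in -1:1) {
    for (d_col in -1:1) {
      if (d_row == 0 && d_col == 0) next
      neighbours <- neighbours +
        padded[(2:(n_row + 1)) + d_row, (2:(n_col + 1)) + d_col, drop = FALSE]
    }
  }
  # A live cell survives with two or three live neighbours;
  # a dead cell comes alive with exactly three.
  (grid & (neighbours == 2 | neighbours == 3)) | ((!grid) & neighbours == 3)
}

# A "blinker": three cells in a row that flip between horizontal and vertical
blinker <- matrix(FALSE, 5, 5)
blinker[3, 2:4] <- TRUE
next_generation <- life_step(blinker)
```

Applying `life_step` twice to the blinker gives back the starting pattern, which is a quick way to check the rule.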
Balancing a centrifuge
I saw this cute little paper on arXiv about balancing a centrifuge: Peil & Hauryliuk (2010) A new spin on spinning your samples: balancing rotors in a non-trivial manner. Let us have a look at the maths of balancing a centrifuge. The way I think most people (including myself) balance their samples is to put them opposite each other, just like ...
4228 sym R (313 sym/2 pcs) 30 img
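For the centrifuge entry above, one way to check a loading pattern is to compute the centre of mass of the tubes. This is a minimal sketch, not the post's or the paper's code: it assumes equal tube masses and equally spaced holes numbered from 1, and the helper name `is_balanced` is hypothetical.

```r
# Is a set of occupied rotor positions balanced?
# Assumptions: all tubes have the same mass, holes are equally spaced
# on a circle and numbered 1..n_holes.
is_balanced <- function(occupied, n_holes, tolerance = 1e-9) {
  angles <- 2 * pi * (occupied - 1) / n_holes
  # Centre of mass of unit masses placed on the unit circle
  centre <- c(mean(cos(angles)), mean(sin(angles)))
  # Balanced when the centre of mass sits on the rotation axis
  sqrt(sum(centre^2)) < tolerance
}

is_balanced(c(1, 4), 6)     # two tubes opposite each other
is_balanced(c(1, 3, 5), 6)  # three tubes spaced evenly
is_balanced(c(1, 2), 6)     # two adjacent tubes
```

The first two calls describe balanced loadings and the third does not; the three-tube case is an example of balancing without pairing tubes opposite each other.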
Using R: tibbles and the t.test function
A participant in the R course I’m teaching showed me a case where a tbl_df (the new flavour of data frame provided by the tibble package; standard in new RStudio versions) interacts badly with the t.test function. I had not seen this happen before. The reason is this: Interacting with legacy code A handful of functions don’t work with ti...
1668 sym R (348 sym/1 pcs) 14 img
It seems dplyr is overtaking correlation heatmaps
(… on my blog, that is.) For a long time, my correlation heatmap with ggplot2 was the most viewed post on this blog. It still leads the overall top list, but by far the most searched and visited post nowadays is this one about dplyr (followed by its sibling about plyr). I fully support this, since data wrangling and reorganization logically ...
1938 sym R (1006 sym/2 pcs) 14 img
Using R: Don’t save your workspace
To everyone learning R: Don’t save your workspace. When you exit an R session, you’re faced with the question of whether or not to save your workspace. You should almost never answer yes. Saving your workspace creates an image of your current variables and functions, and saves them to a file called “.RData”. When you re-open R from that w...
2273 sym 14 img
Using R: a function that adds multiple ggplot2 layers
Another interesting thing that an R course participant identified: Sometimes one wants to make a function that returns multiple layers to be added to a ggplot2 plot. One could think that just adding them and returning would work, but it doesn’t. I think it has to do with how + is evaluated. There are a few workarounds that achieve similar resul...
1813 sym R (1239 sym/5 pcs) 20 img
Mutation, selection, and drift (with Shiny)
Imagine a gene that comes in two variants, where one of them is deleterious to the carrier. This is not so hard to imagine, and it is often the case. Most mutations don’t matter at all. Of those that matter, most are damaging. Next, imagine that the mutation happens over and over again with some mutation rate. This is also not so hard. After al...
4809 sym 30 img
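The setup in the excerpt above (recurrent mutation to a deleterious variant, selection against it, and chance) can be sketched as a small simulation in base R. This is only a sketch under stated assumptions, not the model behind the post's Shiny app: it assumes a haploid Wright-Fisher population, and the function and parameter names are hypothetical.

```r
# Minimal haploid Wright-Fisher sketch (an assumed model for illustration).
# Tracks the frequency of a deleterious variant under recurrent mutation,
# selection against carriers, and binomial sampling (drift).
simulate_frequency <- function(n_generations, pop_size, mutation_rate,
                               selection_coef, start_freq = 0) {
  freq <- numeric(n_generations + 1)
  freq[1] <- start_freq
  for (gen in seq_len(n_generations)) {
    q <- freq[gen]
    # Mutation: a fraction of the normal copies become deleterious
    q_mut <- q + mutation_rate * (1 - q)
    # Selection: carriers have relative fitness 1 - selection_coef
    q_sel <- q_mut * (1 - selection_coef) / (1 - selection_coef * q_mut)
    # Drift: the next generation is a random sample of pop_size individuals
    freq[gen + 1] <- rbinom(1, pop_size, q_sel) / pop_size
  }
  freq
}

set.seed(1)
trajectory <- simulate_frequency(n_generations = 200, pop_size = 1000,
                                 mutation_rate = 1e-3, selection_coef = 0.05)
```

Plotting `trajectory` for a few different values of `pop_size` shows how much more the frequency wanders in small populations.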
Using R: When using do in dplyr, don’t forget the dot
There will be a few posts about switching from plyr/reshape2 for data wrangling to the more contemporary dplyr/tidyr. My most common use of plyr looked something like this: we take a data frame, split it by some column(s), and use an anonymous function to do something useful. The function takes a data frame and returns another data frame, both of...
1866 sym R (530 sym/5 pcs) 14 img
Summer of data science 1: Genomic prediction machines #SoDS17
Genetics is a data science, right? One of my Summer of data science learning points was to play with out-of-the-box prediction tools. So let’s try out a few genomic prediction methods. The code is on GitHub, and the simulated data are on Figshare. Genomic selection is the happy melding of quantitative and molecular genetics. It means using gene...
6307 sym 20 img
Scripting for data analysis (with R)
Course materials (GitHub). This was a PhD course given in the spring of 2017 at Linköping University. The course was organised by the graduate school Forum scientium and was aimed at people who might be interested in using R for data analysis. The materials developed from a part of a previous PhD course from a couple of years ago, an R tutorial g...
3898 sym 14 img