Publications by mrtnj

Showing a difference in mean between two groups, take 2

09.05.2021

A couple of years ago, I wrote about the paradoxical difficulty of visualising a difference in means between two groups, while showing both the data and some uncertainty interval. I still feel like many ills in science come from our inability to interpret a simple comparison of means. Anything with more than two groups or a predictor that isn’...

3785 sym R (1394 sym/3 pcs) 6 img

Convincing myself about the Monty Hall problem

17.05.2021

Like many others, I’ve never felt that the solution to the Monty Hall problem was intuitive, despite the fact that explanations of the correct solution are everywhere. I am not alone. Famously, columnist Marilyn vos Savant got droves of mail from people trying to school her after she had published the correct solution. The problem goes like th...

6990 sym R (1666 sym/1 pcs) 22 img

A plot of genes on chromosomes

25.07.2021

Marta Cifuentes and Wayne Crismani asked on Twitter if there is a web tool similar to the Arabidopsis Chromosome Map Tool that makes figures of genes on chromosomes for humans. This will not really be an answer to the question — not a web tool, not conveniently packaged — but I thought that would be a nice plot to make in R with ggplot2. We w...

4686 sym R (2830 sym/5 pcs) 2 img

Using R: plyr to purrr, part 1

08.08.2021

This is the second post about my journey towards writing more modern Tidyverse-style R code; here is the previous one. We will look at the common case of taking subsets of data out of a data frame, making some complex R object from them, and then extracting summaries from those objects. More nostalgia about plyr: I miss the plyr package. Especiall...

6546 sym R (3760 sym/8 pcs) 2 img

Estimating recent population history from linkage disequilibrium with GONE and SNeP

29.12.2021

In this post, we will look at running two programs that infer population history, understood as changes in effective population size over time, from genotype data. The post will chronicle running them on some simulated data; it will be light on theory, and light on methods evaluation. Linkage disequilibrium, i.e. correlation between alleles at...

13890 sym R (7377 sym/15 pcs) 8 img

Simulating shared segments between relatives

06.02.2022

A few months ago I saw this nice figure from Amy Williams of the number of DNA segments that are expected to be shared between relatives. I thought it would be fun to simulate segment sharing with AlphaSimR. Because DNA comes in chromosomes that don’t break up and recombine that much, the shared DNA between relatives tends to come in long chunk...

7040 sym R (4770 sym/10 pcs) 8 img