Publications by mrtnj

Using R: correlation heatmap, take 2

03.03.2014

Apparently, this turned out to be my most popular post ever. Of course there are lots of things to say about the heatmap (or quilt, tile, guilt plot etc), but what I wrote was literally just a quick celebratory post to commemorate that I’d finally grasped how to combine reshape2 and ggplot2 to quickly make this colourful picture of a correlat...

1518 sym R (181 sym/1 pcs) 16 img

Using R: common errors in table import

06.03.2014

I’ve written before about importing tabular text files into R, and here comes some more. This is because I believe (firmly) that importing data is the major challenge for beginners who want to analyse their data in R. What is the most important thing about using any statistics software? To get your data into it in the first place! Unfortunately...

6094 sym R (203 sym/4 pcs) 14 img

Morning coffee: scripting language

13.03.2014

Several people have asked: what scripting language should biologists learn if they are interested in doing a little larger-scale data analysis and have never programmed before? I’m not an expert, but these are the kinds of things I tend to say: The language is not so important; the same principles apply everywhere. Use what your friends and col...

1457 sym 14 img

Using R: barplot with ggplot2

19.03.2014

Ah, the barplot. Loved by some, hated by some, the first graph you’re likely to make in your favourite office spreadsheet software, but a rather tricky one to pull off in R. Or, that depends. If you just need a barplot that displays the value of each data point as a bar — which is one situation where I like a good barplot — the barplot( ) f...

7055 sym R (1565 sym/9 pcs) 20 img

Using R: quickly calculating summary statistics from a data frame

25.03.2014

A colleague asked: I have a lot of data in a table and I’d like to pull out some summary statistics for different subgroups. Can R do this for me quickly? Yes, there are several pretty convenient ways. I wrote about this in the recent post on the barplot, but as this is an important part of quickly getting something useful out of R, just like i...

4914 sym R (1754 sym/8 pcs) 14 img
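The excerpt above says there are several convenient ways to get summary statistics per subgroup, but it is cut off before naming them. As a minimal sketch, here is one way to do it in base R with `aggregate` and `tapply`. The data frame and its column names (`group`, `value`) are made up for illustration, and these functions are not necessarily the ones the post goes on to use.

```r
# Made-up example data: one measurement in two hypothetical subgroups
data <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 6, 8)
)

# Mean of value within each subgroup, returned as a data frame
group_means <- aggregate(value ~ group, data = data, FUN = mean)

# Standard deviation within each subgroup, returned as a named array
group_sd <- tapply(data$value, data$group, sd)

print(group_means)
print(group_sd)
```

`aggregate` is handy when the result should stay a data frame for further work; `tapply` is shorter when a named vector of numbers is enough.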

Using R: quickly calculating summary statistics (with dplyr)

26.03.2014

I know I’m on about Hadley Wickham’s packages a lot. I’m not the president of his fanclub, but if there is one I’d certainly like to be a member. dplyr is going to be a new and improved ddply: a package that applies functions to, and does other things to, data frames. It is also faster and will work with other ways of storing data, such a...

2921 sym R (1687 sym/4 pcs) 14 img

More fun with %.% and %>%

27.03.2014

The %.% operator in dplyr allows one to put functions together without lots of nested parentheses. The flanking percent signs are R’s way of denoting infix operators; you might have used %in% which corresponds to the match function or %*% which is matrix multiplication. The %.% operator is also called chain, and what it does is rearrange the ca...

2435 sym R (269 sym/3 pcs) 14 img
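To make the idea of infix operators and chaining in the excerpt above concrete, here is a small base R sketch. The operator `%then%` is hypothetical and defined only for this illustration. It is not dplyr's `%.%` or magrittr's `%>%`, and it is simpler than both: it takes a bare function on the right-hand side rather than a call.

```r
# A hypothetical chain operator, for illustration only.
# It applies the function on the right to the value on the left.
`%then%` <- function(lhs, rhs) rhs(lhs)

x <- c(1, 4, 9)

# Nested call, read from the inside out
nested <- sum(sqrt(x))

# Chained call, read from left to right
chained <- x %then% sqrt %then% sum

# Built-in infix operators mentioned in the excerpt
is_member <- 2 %in% c(1, 2, 3)          # membership test built on match()
product <- matrix(1:4, 2) %*% diag(2)   # matrix multiplication

print(identical(nested, chained))
```

User-defined `%name%` operators are evaluated left to right, which is why the chained version applies `sqrt` first and `sum` second.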

Finding the distance from ChIP signals to genes

04.07.2014

I’ve had a couple of months off from blogging. Time for some computer-assisted biology! Robert Griffin asks on Stack Exchange about finding the distance between HP1 binding sites and genes in Drosophila melanogaster. We can get a rough idea with some public chromatin immunoprecipitation data, R and the wonderful BEDTools. Finding some binding...

6052 sym R (1744 sym/5 pcs) 18 img

R in genomics @ SciLifeLab, Solna

24.03.2015

Dear diary, I went to the Stockholm R useR group meetup on R in genomics at the Stockholm node of SciLifeLab. It was nice. If I worked a bit closer, I would attend meetups all the time. I even got to be pretentious with my notebook while waiting for the train. The speakers were: Jakub Orzechowski Westholm on R and genomics in general. He demo...

2746 sym 16 img

Toying with models: The Luria–Delbrück fluctuation test

19.02.2016

I hope that Genetics will continue running expository papers about their old classics, like this one by Philip Meneely about Luria & Delbrück (1943). Luria & Delbrück performed an experiment on bacteriophage resistance in Escherichia coli, growing bacterial cultures, exposing them to a phage, and then plating and counting the survivors, who hav...

3929 sym R (1409 sym/4 pcs) 28 img