Publications by Stephen Turner

RNA-seq Data Analysis Course Materials

20.11.2014

Last week I ran a one-day workshop on RNA-seq data analysis in the UVA Health Sciences Library. I set up an AWS public EC2 image with all the necessary software installed. Participants logged into AWS, launched the image, and we kicked off the morning session with an introduction to the Unix shell (taught by Jessica Bonnie, a biostati...

1860 sym

Importing Illumina BeadArray data into R

08.12.2014

A colleague needed some help getting Illumina BeadArray gene expression data loaded into R for data analysis with limma. Hopefully whoever ran your arrays can export the data as text files formatted as described in the code below. If so, you can import those text files directly using the beadarray package. This way you avoid getting ...

987 sym

Using the microbenchmark package to compare the execution time of R expressions

14.01.2015

I recently learned about the microbenchmark package while browsing through Hadley’s advanced R programming book. I’ve done some quick benchmarking using system.time() in a for loop and taking the average, but the microbenchmark function in the microbenchmark package makes this much easier. Hadley gives the example of taking the square root of...

3932 sym R (2803 sym/11 pcs) 2 img

R + ggplot2 Graph Catalog

03.02.2015

Joanna Zhao’s and Jenny Bryan’s R graph catalog is meant to be a complement to the physical book, Creating More Effective Graphs, but it’s a really nice gallery in its own right. The catalog shows a series of different data visualizations, all made with R and ggplot2. Click on any of the plots and you get the R code necessary to generate th...

1797 sym R (2371 sym/2 pcs) 6 img

Using and Abusing Data Visualization: Anscombe’s Quartet and Cheating Bonferroni

26.02.2015

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were constructed in 1973 by the statistician Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of ou...

4269 sym R (2434 sym/6 pcs) 6 img 1 tbl

R User Group Recap: Heatmaps and Using the caret Package

10.04.2015

At our most recent R user group meeting we were delighted to have presentations from Mark Lawson and Steve Hoang, both bioinformaticians at Hemoshear. All of the code used in both demos is in our Meetup’s GitHub repo. Making heatmaps in R: Steve started with an overview of making heatmaps in R. Using the iris dataset, Steve demonstrated making hea...

2090 sym R (628 sym/2 pcs) 4 img 1 tbl

R: single plot with two different y-axes

21.04.2015

I forgot where I originally found the code to do this, but I recently had to dig it out again to remind myself how to draw two different y axes on the same plot to show the values of two different features of the data. This is somewhat distinct from the typical use case of aesthetic mappings in ggplot2 where I want to have different lines/points/...

1982 sym R (562 sym/3 pcs) 4 img 1 tbl

Compiling RMarkdown from a Helper R Script

06.08.2015

The problem: I was looking for a way to compile an RMarkdown document and have the filename of the resulting PDF or HTML document contain the name of the input data that it processed. That is, if I compiled the analysis.Rmd file, where in that file it did some analysis and reporting on data001.txt, I’d want the resulting filename to look somethin...

2366 sym Python (1297 sym/2 pcs)

Software from CSHL Genome Informatics 2015

02.11.2015

I just returned from the Genome Informatics meeting at Cold Spring Harbor. This was, hands down, the best scientific conference I’ve been to in years. The quality of the talks and posters was excellent, and it was great meeting in person many of the scientists and developers whose tools and software I use on a daily basis. To get a sense of w...

4419 sym