Publications by Karl Broman
The stupidest R code ever
Let’s start this blog off right, with the stupidest R mistake I’ve ever made (I think). In the R package that I write, R/qtl, one of the main file formats is a comma-delimited file, where the blank cells in the second row are important, as they distinguish the initial phenotype columns from the genetic marker columns. I’d gotten some report...
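To make the role of the second row concrete, here is a minimal sketch in base R of how leading blank cells could be used to separate phenotype columns from marker columns. This is not R/qtl's actual reader, and it is not the mistaken code the post goes on to describe; the file contents and the names `csv_lines`, `n_pheno`, `pheno_cols`, and `marker_cols` are invented for illustration.

```r
# Hypothetical three-line file: a header row, a second row whose leading
# blank cells mark the phenotype columns, and one row of data.
csv_lines <- c("weight,sex,m1,m2",
               ",,1,1",
               "10.2,f,A,H")

# Split the header and the second row into cells; leading empty cells
# come back from strsplit() as "".
header     <- strsplit(csv_lines[1], ",", fixed = TRUE)[[1]]
second_row <- strsplit(csv_lines[2], ",", fixed = TRUE)[[1]]

# Count the run of blank cells at the start of the second row.
n_pheno <- sum(cumprod(second_row == ""))

# Columns up to n_pheno are phenotypes; the rest are markers.
pheno_cols  <- header[seq_along(header) <= n_pheno]
marker_cols <- header[seq_along(header) >  n_pheno]
```

Using a logical comparison against `seq_along(header)` rather than a negative index keeps the split correct even if there happen to be no phenotype columns.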
useR! Conference 2011 highlights
I was at the useR! Conference at The University of Warwick in Coventry, UK, last week. My goal in going was to learn the latest things regarding (simple) dynamic graphics, (simple) web-based apps, parallel computing, and memory management (dealing with big data sets). I got just what I was hoping for and more. There are a lot of useful tools avai...
Quick labels within figures
One of the coolest R packages I heard about at the useR! Conference: Toby Dylan Hocking’s directlabels package for putting labels directly next to the relevant curves or point clouds in a figure. I think I first learned about this idea from Andrew Gelman: that a separate legend requires a lot of back-and-forth glances, so it’s better to put t...
Gamified
Barry Rowlingson gave an interesting talk at useR! 2011, “Why R-help must die!” He suggested the Q-and-A type sites Stack Overflow (on programming) and Cross Validated (on statistics), both part of Stack Exchange. An interesting feature of these sites is that, in addition to voting up and down on the questions and answers, one accrues reputati...
Ghastly R code
My R package, R/qtl, contains about 33k lines of R code (and 21k lines of C code). Some of it is quite good; some of it is terrible. Here’s another example of the terrible. I’ve long needed to revise the function scantwo, for performing a two-dimensional genome scan for pairs of loci. I was looking at the function today, and was aghast to ...
Halloween 2011 count
We don’t get many kids seeking candy at our house. I’m not sure if there just aren’t many kids in the neighborhood, or if it’s our location (next to the pond, with a big gap before the next house). I decided to keep track. As usual, we bought a huge bag of candy, and we still had about half of it left to hand out tonight. But only 19 kids...
Row names in data frames: beware of 1:nrow
I spent some time puzzling over row names in data frames in R this morning. It seems that if you make the row names for a data frame, x, as 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind. Here’s an illustration:
> x <- data.frame(id=1:3)
> y <- data.frame(id=4:6)
> rownames(x) <- 1:...
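Since the excerpt above is cut off, here is a small self-contained sketch of the behavior it describes. It assumes a recent version of R and the default `rbind` method for data frames; the row names chosen for `y` are just an illustration.

```r
x <- data.frame(id = 1:3)
y <- data.frame(id = 4:6)

# Integer row names equal to 1:nrow(x) are stored as "automatic" row names,
# just as if nothing had been assigned.
rownames(x) <- 1:nrow(x)
rownames(y) <- LETTERS[4:6]

# With x first, its row names are kept.
xy <- rbind(x, y)
rownames(xy)   # "1" "2" "3" "D" "E" "F"

# With x second, its automatic row names are renumbered to follow y's rows.
yx <- rbind(y, x)
rownames(yx)   # "D" "E" "F" "4" "5" "6"
```

The point is that the result depends on the order of the arguments: the rows from `x` carry the names 1 to 3 in one case and 4 to 6 in the other.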
as.character() for rownames()
Rainer pointed out, in response to my post, Row names in data frames: beware of 1:nrow, that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen.
> x <- z <- data.frame(id=1:3)
> y <- data.frame(id=4:6)
> rownames(x) <- 1:3
> rownames(y) <- LETTERS[4:6]
> rownames(z) <- as.ch...
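A short sketch of Rainer's suggestion, under the same assumptions (a recent version of R, default `rbind` for data frames). It uses `.row_names_info()` from base R, which returns a negative count when the row names are automatic, to show how the two assignments differ.

```r
x <- z <- data.frame(id = 1:3)
y <- data.frame(id = 4:6)

rownames(x) <- 1:3                 # integer 1:n: treated as automatic
rownames(y) <- LETTERS[4:6]
rownames(z) <- as.character(1:3)   # character: stored as real row names

# Negative for automatic row names, positive otherwise.
x_auto <- .row_names_info(x) < 0   # TRUE
z_auto <- .row_names_info(z) < 0   # FALSE

# The character row names survive rbind unchanged.
yz <- rbind(y, z)
rownames(yz)   # "D" "E" "F" "1" "2" "3"
```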
Should I be nice?
I got the following email.
Subject: i have a question?
Date: May 18, 2012 7:57:56 AM CDT
how can i enter the data of QTL analysis.
That was the whole thing. I presume that the writer wishes to use my R/qtl software. I could probably respond helpfully (for example, “See the sample data files and code at the R/qtl web site.”), but can’t I ...
A course in statistical programming
Graduate students in statistics often take (or at least have the opportunity to take) a statistical computing course, but often such courses are focused on methods (like numerical linear algebra, the EM algorithm, and MCMC) and not on actual coding. For example, here’s a course in “advanced statistical computing” that I taught at Johns Ho...