Publications by Stephen Turner

R function for extracting F-test P-value from linear model object

10.01.2011

I thought it would be trivial to extract the p-value on the F-test of a linear regression model (testing the null hypothesis R²=0). If I fit the linear model: fit ...
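The excerpt is cut off before the function itself, so the following is only a minimal sketch of one common way to do this in base R, not necessarily the author's code. The helper name `lm_f_pvalue` and the use of the built-in `mtcars` data are assumptions made for illustration.

```r
# Hypothetical helper (name is an assumption, not from the original post).
# Returns the p-value of the overall F-test for a fitted lm object.
lm_f_pvalue <- function(fit) {
  stopifnot(inherits(fit, "lm"))
  # summary.lm stores the F statistic as a named vector: value, numdf, dendf.
  f <- summary(fit)$fstatistic
  if (is.null(f)) {
    stop("No F statistic available (e.g. an intercept-only model).")
  }
  # Upper-tail probability of the F distribution at the observed statistic.
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
}

# Illustrative usage on a built-in dataset.
fit <- lm(mpg ~ wt + hp, data = mtcars)
p_value <- lm_f_pvalue(fit)
print(p_value)
```

The p-value is recomputed from the stored statistic and its degrees of freedom because `summary(fit)` prints it but does not store it as a separate element.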

1053 sym

Summarize Missing Data for all Variables in a Data Frame in R

16.02.2011

Something like this probably already exists in an R package somewhere out there, but I needed a function to summarize how much missing data I have in each variable of a data frame in R. Pass a data frame to this function and for each variable it’ll give you the number of missing values, the total N, and the proportion missing. prop...
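A function along these lines might look like the sketch below. The function name `summarize_missing`, the output column names, and the reading of "total N" as the number of rows are all assumptions, since the excerpt ends before the original code.

```r
# Hypothetical helper (names are assumptions, not from the original post).
# For each variable: number of missing values, total N, proportion missing.
summarize_missing <- function(df) {
  stopifnot(is.data.frame(df))
  # is.na() on a data frame gives a logical matrix; column sums count the NAs.
  n_missing <- colSums(is.na(df))
  data.frame(
    variable     = names(df),
    n_missing    = as.integer(n_missing),
    n            = nrow(df),               # assumed meaning of "total N"
    prop_missing = as.numeric(n_missing) / nrow(df),
    row.names    = NULL
  )
}

# Illustrative usage on a built-in dataset that contains missing values.
missing_summary <- summarize_missing(airquality)
print(missing_summary)
```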

1230 sym

R: Given column name in a Data Frame, Get the Index

17.02.2011

Had a mental block today trying to figure out how to get the indices of columns in a data frame given their names. Simple task but difficult to search Google for an answer. Thanks to jashapiro, Matt, and Vince for giving me a heads up on the which() function. The which() function returns the indices of TRUE values in a logical vector....
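As a small illustration of the which() approach described above, here is a sketch using a made-up data frame; the data frame and its column names are assumptions for the example only.

```r
# Toy data frame, invented for illustration.
df <- data.frame(id = 1:3, height = c(1.6, 1.7, 1.8), weight = c(60, 70, 80))

# names(df) == "weight" is a logical vector; which() returns the TRUE positions.
idx <- which(names(df) == "weight")

# Several names at once: %in% builds the logical vector, which() indexes it.
# The result follows column order in the data frame, not the order requested.
idx_multi <- which(names(df) %in% c("height", "weight"))

# match() is an alternative when the order of the requested names matters.
idx_ordered <- match(c("weight", "id"), names(df))

print(idx)          # 3
print(idx_multi)    # 2 3
print(idx_ordered)  # 3 1
```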

816 sym

Get all your Questions Answered

22.02.2011

When I have a question I usually ask the internet before bugging my neighbor. Yet it seems like Google’s search results have become increasingly irrelevant over the last few years, and this is especially true for searching anything related to R (and previously mentioned Rseek.org doesn’t really do the job I would expect it to do e...

814 sym

Split a Data Frame into Testing and Training Sets in R

24.02.2011

I recently analyzed some data trying to find a model that would explain body fat distribution as predicted by several blood biomarkers. I had more predictors than samples (p>n), and I didn’t have a clue which variables, interactions, or quadratic terms made biological sense to put into a model. I then turned to a few data mining pr...
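The original function is not shown in this excerpt. A simple random split, as the post title describes, could be sketched as follows; the function name `split_data`, its arguments, and the 80/20 default are assumptions rather than the author's implementation.

```r
# Hypothetical helper (name and arguments are assumptions).
# Randomly assigns rows of a data frame to a training and a testing set.
split_data <- function(df, train_fraction = 0.8, seed = NULL) {
  stopifnot(is.data.frame(df), train_fraction > 0, train_fraction < 1)
  if (!is.null(seed)) set.seed(seed)  # optional, for reproducible splits
  n <- nrow(df)
  train_rows <- sample(seq_len(n), size = floor(train_fraction * n))
  # Logical indexing avoids the df[-integer(0), ] pitfall when no rows are drawn.
  is_train <- seq_len(n) %in% train_rows
  list(train = df[is_train, , drop = FALSE],
       test  = df[!is_train, , drop = FALSE])
}

# Illustrative usage on a built-in dataset.
splits <- split_data(iris, train_fraction = 0.8, seed = 42)
print(nrow(splits$train))
print(nrow(splits$test))
```

Every row lands in exactly one of the two sets, so models fit on `splits$train` can be evaluated on rows they have not seen.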

817 sym

RStudio: New free IDE for R

28.02.2011

Just saw the announcement of the availability of RStudio, a new (free & open source) integrated development environment for R that works on Windows, Mac, and Linux. Judging from the screenshots, it looks like RStudio supports syntax highlighting for Sweave & easy PDF creation from Sweave code, which is something I haven’t seen anywhere else (on...

814 sym

Splitting a Dataset Revisited: Keeping Covariates Balanced Between Splits

08.03.2011

In my previous post I showed you how to randomly split up a dataset into training and testing datasets. (Thanks to all those who emailed me or left comments letting me know that this could be done using other means. As things go with R, it’s sometimes easier to write a new function yourself than it is to hunt down the function or pa...

811 sym

Forest plots using R and ggplot2

09.03.2011

Abhijit over at Stat Bandit posted some nice code for making forest plots using ggplot2 in R. You see these often in meta-analyses, or as seen in the BioVU demonstration paper. The idea is simple – on the x-axis you have the odds ratio (or whatever stat you want to show), and each line is a different study, gene, SNP, phenot...

816 sym

New GenABEL Website, and more *ABEL software

18.03.2011

The *ABEL suite of R packages and software for genetic analysis has grown substantially since the appearance of GenABEL and the previously mentioned ProbABEL R packages. There are now a handful of useful R packages and other software utilities facilitating genome-wide association studies, analysis of imputed data, meta-analysis, effic...

817 sym

RStudio Keyboard Shortcut Reference PDF

21.03.2011

I recently started using RStudio, the amazing new IDE for R. You can view all of RStudio’s keyboard shortcuts by going to the help menu, but I made this printable reference for myself and thought I’d share it. I only included the Windows shortcuts, and I cut out all the obvious ones (Ctrl-S for save, Ctrl-O for open, etc.) so it wo...

811 sym