Publications by Stephen Turner

Efficient Mixed-Model Association in GWAS using R

13.04.2010

I recently did an analysis for the eMERGE network where I had lots of individuals from a small town in central Wisconsin where many of the subjects were related to one another. The subjects could not be treated as independent, but I could not use a family-based design either. I ended up using a mixed model approach using previously me...

Top 10 Algorithms in Data Mining

23.04.2010

The authors here invited ACM KDD Innovation Award and IEEE ICDM Research Contributions Award winners to each nominate up to 10 best-known algorithms in data mining, including the algorithm name, justification for nomination, and a representative publication reference. The list was voted on by other IEEE and ACM award winners to narrow...

Mixed linear model approach adapted for genome-wide association studies

06.05.2010

A few weeks ago I covered an R package for efficient mixed model regression that is capable of simultaneously accounting for both population stratification and relatedness to compute unbiased estimates of standard errors and p-values for genetic association studies. Fitting linear mixed effects models at GWAS scale can be very time co...

R Package ‘rms’ for Regression Modeling

11.05.2010

If you attended Frank Harrell’s Regression Modeling Strategies course a few weeks ago, you got a chance to see the rms package for R in action. Frank’s rms package does regression modeling, testing, estimation, validation, graphics, prediction, and typesetting by storing enhanced model design attributes in the fit. rms is a re-w...

Sweave for Reproducible Research and Beautiful Statistical Reports

11.05.2010

Frank Harrell, chair of the Biostatistics department here at Vanderbilt, is giving a seminar entitled “Sweave for Reproducible Research and Beautiful Statistical Reports” tomorrow, Wednesday, May 12, 1:30-2:30pm, in the MRBIII Conference Room 1220. This tutorial covers the basics of Sweave and shows how to enhance the default outp...

Using R, LaTeX, and Sweave for Reproducible Research: Handouts, Templates, & Other Resources

13.05.2010

Several readers emailed me or left a comment on my previous announcement of Frank Harrell’s workshop on using Sweave for reproducible research asking if we could record the seminar. Unfortunately we couldn’t record audio or video, but take a look at the Sweave/Latex page on the Biostatistics Dept Wiki. Here you can find Frank...

Tutorial: Principal Components Analysis (PCA) in R

20.05.2010

I found this tutorial by Emily Mankin on how to do principal components analysis (PCA) using R. It has a nice example with R code and several good references. The example starts by doing the PCA manually, then uses R’s built-in prcomp() function to do the same PCA. Principle Components Analysis: A How-To Manual for R

Use SQL queries to manipulate data frames in R with sqldf package

25.05.2010

I’ve covered a few topics in the past including the plyr package, which is kind of like “GROUP BY” for R, and the merge function for merging datasets. I only recently found the sqldf package for R, and it’s already one of the most useful packages I’ve ever installed. The main function in the package is sqldf(), which takes a...

Efficient Mixed-Model Association eXpedited (EMMAX) to Simultaneously Account for Relatedness and Stratification in Genome-Wide Association Studies

09.06.2010

A few months ago I covered an algorithm called EMMA (Efficient Mixed-Model Association) implemented in R for simultaneously correcting for both population stratification and relatedness in an association study. This method/software is very useful because most methods that account for relatedness in an association study assume a genetical...

All code on GGD is Free (Open Source BSD)

07.07.2010

At the request of a commenter I just wanted to clarify that any code released here for R or anything else is free and open source unless specifically stated otherwise. The open source BSD license for any code on GGD can be found on this copyright page.
