Publications by Abhijit

Workflow with Python and R

06.03.2009

I seem to be doing more and more with Python for work over and above using it as a generic scripting language. R has been my workhorse for analysis for a long time (15+ years in various incarnations of S+ and R), but it still has some deficiencies. I’m finding Python easier and faster to work with for large data sets. I’m also a bit happier w...


R amusements

05.03.2010

On a lark, and to kill a bit of time, I was running the R fortune command looking for references to SAS. Here’s what two successive random fortunes turned up. Can there be two more antipodal opinions about the same product? I laughed out loud. > fortune('SAS') There are companies whose yearly license fees to SAS total millions of dollars. ...


Quick and dirty parallel processing in R

30.04.2010

R has some powerful tools for parallel processing, which I discovered while searching for ways to fully utilize my 8-core computer at work. What surprised me is how easy it is…about 6 lines of code, if that. Given that I wasn’t allowed to install heavy duty parallel-processing systems like MPICH on the computer, I found that the library SNOW ...
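To give a feel for how short such a job can be, here is a minimal sketch, not the code from the post. It assumes the parallel package that now ships with R and carries over snow's socket-cluster interface; the worker count of 2 and the function slow_square are hypothetical placeholders.

```r
library(parallel)  # ships with R; provides snow-style socket clusters

# Hypothetical stand-in for an expensive per-item computation
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

cl <- makeCluster(2)                       # start 2 local worker processes (assumed count)
result <- parLapply(cl, 1:8, slow_square)  # spread the calls across the workers
stopCluster(cl)                            # release the workers when done

squares <- unlist(result)
print(squares)
```

Socket clusters like this need no MPI installation, which is why the approach suits a locked-down work machine.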


A small customization of ESS

14.05.2010

JD Long (at Cerebral Mastication) posted a question on Twitter about an artifact in ESS, where typing “_” gets you “<-”. Possible fixes: type “_” twice, which puts in the underscore; use “C-q _”, i.e. Ctrl-q then underscore; or put (setq ess-S-assign "_") in your .emacs file. The last fix obviously customizes ESS permanently for your emacs setup, while the...


useR! 2010 done and dusted

23.07.2010

The useR! 2010 R users conference just finished up this afternoon with a thought-provoking, controversial, and sometimes hilarious talk by Richard Stallman of GNU fame. It started on Tuesday with great tutorials (I took ones on MICE for multiple imputation and Frank Harrell’s excellent regression modeling). In between these bookends was a wonde...


ggplot2 joy

25.02.2011

I’ve been working on a long-term (25+yr) longitudinal study of rheumatoid arthritis with my boss. He just walked in and asked if I could create a plot showing the trajectory of pain scores over time for each subject, separated by educational level (4 groups). Having now worked with ggplot2 for a while, and learning more at the last two DC useR ...


The split-apply-combine paradigm in R

25.02.2011

Last night at the DC R Users meetup, which was our largest meetup to date, I gave an introductory presentation on data munging, and spent a bit of time on the split-apply-combine paradigm that I use almost daily in my work. I talked mainly about the packages plyr and doBy, which I use a lot now. David Smith posted a link on the Revolution blog to...
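For readers new to the idea, the three steps can be sketched in base R alone. This is an illustration under stated assumptions, not material from the talk: the scores data frame and its column names are made up, and plyr's ddply would wrap the same three steps in a single call.

```r
# Toy data (made up for illustration)
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split: one data frame per group
pieces <- split(scores, scores$group)

# Apply: summarize each piece separately
summaries <- lapply(pieces, function(d) {
  data.frame(group = d$group[1], n = nrow(d), mean_value = mean(d$value))
})

# Combine: stack the per-group summaries back into one data frame
combined <- do.call(rbind, summaries)
rownames(combined) <- NULL
print(combined)
```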


RStudio: a cut above

01.03.2011

As most followers of R-bloggers.com and the Twitter #rstats know by now, RStudio is a new open-source IDE for R that was beta-released yesterday. I have started putting it through its paces within my R workflow, and my impressions are more than favorable. I also tried it out on my home Linux server in server mode. RStudio is obviously designed by...


An enhanced Kaplan-Meier plot

08.03.2011

We often see, in publications, a Kaplan-Meier survival plot, with a table of the number of subjects at risk at different time points aligned below the figure. I needed this type of plot (or really, matrices of such plots) for an upcoming publication. Of course, my preferred toolbox was R and the ggplot2 package. There were other attempts to do th...
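The numbers in such an at-risk table can be pulled from a fitted survival curve. The sketch below is only a rough illustration and not the function developed in the post: it assumes the survival package and its bundled lung data set, and the time points are hypothetical. It stops at the table and leaves the plotting aside.

```r
library(survival)  # recommended package bundled with most R installs

# lung: example data set shipped with survival (not the data from the post)
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Hypothetical time points (days) at which to report the risk set
time_points <- c(0, 250, 500, 750)
risk <- summary(fit, times = time_points)

# One row per stratum and time point, ready to align under a plot
at_risk <- data.frame(
  strata = risk$strata,
  time   = risk$time,
  n.risk = risk$n.risk
)
print(at_risk)
```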


SAS, R and categorical variables

13.07.2011

One of the disappointing problems in SAS (which I need for PROC MIXED in some analyses) is recoding categorical variables to have a particular reference category. In R, my usual tool, this is rather easy both to set and to modify using the relevel function available in base R (in the stats package). My understanding is that this is actually easy in S...
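On the R side, a minimal sketch of changing the reference category looks like this. The factor dose and its levels are made up for illustration.

```r
# Toy factor (made-up levels); levels default to alphabetical order,
# so the initial reference category is "high"
dose <- factor(c("low", "high", "medium", "low", "high"))
print(levels(dose))

# Move "low" to the front so it becomes the reference category
dose_ref <- relevel(dose, ref = "low")
print(levels(dose_ref))
```

Model-fitting functions such as lm treat the first level as the baseline under the default treatment contrasts, so releveling before fitting is all that is needed.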
