Publications by John Johnson

From datasets to algorithms in R

05.12.2011

Many statistical algorithms are taught and implemented in terms of linear algebra. Statistical packages often borrow heavily from optimized linear algebra libraries such as LINPACK, LAPACK, or BLAS. When implementing these algorithms in systems such as Octave or MATLAB, it is up to you to translate the data from the use case terms (factors, categ...

3407 sym

Using R for a salary negotiation–an extension of decision tree models

21.03.2012

Let’s say you are in the middle of a salary negotiation, and you want to know whether you should be aggressive in your offering or conservative. One way to help with the decision is to make a decision tree. We’ll work with the following assumptions: You are at a job currently making $50k. You have the choice between asking $60k (which will ...

3785 sym R (2264 sym/2 pcs) 4 img

Integrating R into a SAS shop

29.08.2012

I work in an environment dominated by SAS, and I am looking to integrate R into our environment. Why would I want to do such a thing? First, I do not want to get rid of SAS. That would not only take away most of our investment in SAS training and hiring good quality SAS programmers, but it would also remove the advantages of SAS from our environm...

1977 sym

Distrust of R

12.03.2013

I guess I’ve been living in a bubble for a bit, but apparently there are a lot of people who still mistrust R. I got asked this week why I used R (and, specifically, the package rpart) to generate classification and regression trees instead of SAS Enterprise Miner. Never mind the fact that rpart code has been around a very long time, and probab...

1139 sym

RStudio is reminding me of the older Macs

15.04.2013

The only thing missing is the cryptic ID number. Well, the only bad thing is that I am trying to run a probabilistic graphical model on some real data, and having a crash like this will definitely slow things down. To leave a comment for the author, please follow the link and comment on their blog: Realizations in Biostatistics. R-blo...

627 sym 2 img

Even the tiniest error messages can indicate an invalid statistical analysis

25.11.2015

The other day, I was reading in a data set in R, and the function indicated that there was a warning about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one line of goofed up data, or perhaps a quote in the wrong place. I finally opened up the ...

1522 sym

Talk to Upstate Data Science Group on Caret

14.01.2016

Last night I gave an introduction and demo of the caret R package to the Upstate Data Science group, meeting at Furman University. It was fairly well attended (around 20 people), and well received. It was great to get out of my own comfort zone a bit (since graduate school, I’ve only really given talks on some topic in biostatistics...

1287 sym

Simulating a Weibull conditional on time-to-event is greater than a given time

20.05.2016

Recently, I had to simulate a time-to-event of subjects who have been on a study, are still ongoing at the time of a data cut, but who are still at risk of an event (e.g. progressive disease, cardiac event, death). This requires the simulation of a conditional Weibull. To do this, I created the following function: # simulate conditional Weibull co...

2104 sym 2 img
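
For readers who want the idea behind the last entry, one standard way to draw from a Weibull conditional on the event time exceeding a given time is inverse-transform sampling. The sketch below is not taken from the post (its function is truncated in the excerpt above), and the symbols are assumptions for illustration: shape k, scale λ, time already on study t_0, and a uniform draw U.

```latex
% Weibull survival function with shape k and scale \lambda (assumed parameterization)
S(t) = \exp\!\left[-\left(\frac{t}{\lambda}\right)^{k}\right]

% Survival conditional on still being event-free at t_0, for t \ge t_0
P(T > t \mid T > t_0) = \frac{S(t)}{S(t_0)}
  = \exp\!\left[\left(\frac{t_0}{\lambda}\right)^{k} - \left(\frac{t}{\lambda}\right)^{k}\right]

% Set this equal to U \sim \mathrm{Uniform}(0,1) and solve for t
T = \lambda \left[\left(\frac{t_0}{\lambda}\right)^{k} - \ln U\right]^{1/k}
```

As a sanity check, U close to 1 gives T close to t_0, and U close to 0 gives arbitrarily large T, so every simulated time falls after the data cut as required.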