Publications by John Johnson
From datasets to algorithms in R
Many statistical algorithms are taught and implemented in terms of linear algebra. Statistical packages often borrow heavily from optimized linear algebra libraries such as LINPACK, LAPACK, or BLAS. When implementing these algorithms in systems such as Octave or MATLAB, it is up to you to translate the data from the use case terms (factors, categ...
3407 sym
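As a rough illustration of the translation step this excerpt describes (going from factors to the matrices that linear algebra routines expect), base R's `model.matrix()` can expand a categorical variable into indicator columns. The toy data frame `d` and its column names are invented for illustration and are not taken from the post.

```r
# Hypothetical toy data: one categorical predictor and one numeric predictor
d <- data.frame(
  group = factor(c("a", "b", "c", "a")),
  dose  = c(1.0, 2.5, 0.5, 3.0)
)

# model.matrix() builds the numeric design matrix; with R's default
# treatment contrasts the factor becomes indicator columns for every
# level except the first ("a"), which is absorbed into the intercept
X <- model.matrix(~ group + dose, data = d)

# Inspect the resulting column names and dimensions
colnames(X)
dim(X)
```

In Octave or MATLAB the same indicator columns would typically have to be assembled by hand before calling a solver.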
Using R for a salary negotiation–an extension of decision tree models
Let’s say you are in the middle of a salary negotiation, and you want to know whether you should be aggressive in your offering or conservative. One way to help with the decision is to make a decision tree. We’ll work with the following assumptions: You are at a job currently making $50k. You have the choice between asking $60k (which will ...
3785 sym R (2264 sym/2 pcs) 4 img
Integrating R into a SAS shop
I work in an environment dominated by SAS, and I am looking to integrate R into our environment. Why would I want to do such a thing? First, I do not want to get rid of SAS. That would not only take away most of our investment in SAS training and hiring good quality SAS programmers, but it would also remove the advantages of SAS from our environm...
1977 sym
Distrust of R
I guess I’ve been living in a bubble for a bit, but apparently there are a lot of people who still mistrust R. I got asked this week why I used R (and, specifically, the package rpart) to generate classification and regression trees instead of SAS Enterprise Miner. Never mind the fact that rpart code has been around a very long time, and probab...
1139 sym
RStudio is reminding me of the older Macs
The only thing missing is the cryptic ID number. Well, the only bad thing is that I am trying to run a probabilistic graphical model on some real data, and having a crash like this will definitely slow things down. Source: the author's blog, Realizations in Biostatistics.
627 sym 2 img
Even the tiniest error messages can indicate an invalid statistical analysis
The other day, I was reading in a data set in R, and the function indicated that there was a warning about a parsing error on one line. I went ahead with the analysis anyway, but that small parsing error kept bothering me. I thought it was just one line of goofed up data, or perhaps a quote in the wrong place. I finally opened up the ...
1522 sym
Talk to Upstate Data Science Group on Caret
Last night I gave an introduction and demo of the caret R package to the Upstate Data Science group, meeting at Furman University. It was fairly well attended (around 20 people), and well received. It was great to get out of my own comfort zone a bit (since graduate school, I’ve only really given talks on some topic in biostatistics...
1287 sym
Simulating a Weibull conditional on time-to-event is greater than a given time
Recently, I had to simulate a time-to-event of subjects who have been on a study, are still ongoing at the time of a data cut, but who are still at risk of an event (e.g. progressive disease, cardiac event, death). This requires the simulation of a conditional Weibull. To do this, I created the following function: # simulate conditional Weibull co...
2104 sym 2 img
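The post's own function is cut off in this excerpt, so the following is only a minimal sketch of one standard way to draw from a Weibull conditional on the event time exceeding a known time, using inverse-transform sampling. The function name `rweibull_cond` and its arguments are hypothetical and may differ from the author's code. It relies on the conditional survival function S(t)/S(t0) = exp((t0/scale)^shape - (t/scale)^shape), set equal to a uniform draw and solved for t.

```r
# Hypothetical helper (not the author's function): draw n event times from
# a Weibull(shape, scale) distribution conditional on T > t0
rweibull_cond <- function(n, shape, scale, t0 = 0) {
  u <- runif(n)  # uniform draws standing in for the conditional survival probability
  # Solve exp((t0/scale)^shape - (t/scale)^shape) = u for t
  scale * ((t0 / scale)^shape - log(u))^(1 / shape)
}

# Example: five subjects still event-free at t0 = 5 (illustrative parameters)
set.seed(42)
sims <- rweibull_cond(5, shape = 1.5, scale = 10, t0 = 5)
sims
```

With `t0 = 0` the expression reduces to the usual inverse-transform draw from an unconditional Weibull, which gives a quick sanity check against `rweibull()`.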