Publications by Bob Muenchen
Poll Shows Open Source Almost Even with Commercial Analytics Software
The 2012 results of the annual KDnuggets poll are in. They show R in first place, with 30.7% of users reporting having used it for a real project. Excel is almost as popular. It seems out of place among so many more capable packages, but Excel is a tool that almost everyone has and knows how to use. It’s interesting to note that four of the top f...
1092 sym 6 img
Why R is Hard to Learn
The open source R software for analytics has a reputation for being hard to learn. It certainly can be, especially for people who are already familiar with similar packages such as SAS, SPSS or Stata. Training and documentation that leverage their existing knowledge and point out where that knowledge is likely to mislead them can sa...
10473 sym 4 img
SAS Beats R on July 2012 TIOBE Rankings
The TIOBE Community Programming Index ranks the popularity of programming languages, but from a programming language perspective rather than as analytical software (http://www.tiobe.com). It extracts measurements from blogs, entries in Wikipedia, books on Amazon, search engine results, etc. and combines them into a single index. The July 201...
1260 sym 4 img
Specifying Variables in R
R has several ways to specify which variables to use in an analysis. Some of the most frustrating errors can result from not understanding the order in which R searches for variables. This post demonstrates that order, hopefully smoothing your future use of R. If all your variables are vectors in your workspace, using them in an analysis is easy:...
6773 sym R (933 sym/9 pcs) 4 img
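As a rough illustration of the point in the excerpt above, here is a minimal sketch of a few common ways R can be told where to find variables. The data values and object names (`fit_workspace`, `fit_data`, `mean_with`, and so on) are hypothetical and are not taken from the original post; the sketch only assumes base R.

```r
# Hypothetical vectors living directly in the workspace
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# With workspace vectors, an analysis can name them directly
fit_workspace <- lm(y ~ x)

# A hypothetical data frame whose columns share the same names
mydata <- data.frame(x = c(10, 20, 30, 40, 50),
                     y = c(1, 2, 3, 4, 5))

# With data=, names in the formula are looked up in the data frame
# first, and only then in the environment of the formula
fit_data <- lm(y ~ x, data = mydata)

# Three ways to reach "x", each resolving to a different place
mean_workspace <- mean(x)              # the workspace vector
mean_dollar    <- mean(mydata$x)       # explicit column reference
mean_with      <- with(mydata, mean(x)) # evaluated inside the data frame
```

Because the workspace and the data frame both contain an `x`, the same name gives different answers depending on how it is specified, which is the kind of confusion the post sets out to untangle.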
R for SAS, SPSS, Stata Users Workshop Redesigned
My workshop R for SAS, SPSS and Stata Users has been popular over the years, but it’s time for an overhaul. A common request has been to simplify it, so I have moved data management to a separate 4-hour workshop, Managing Data with R. This makes it much easier to absorb the basics in the remaining two 4-hour sessions. When you’re ready for mo...
935 sym 4 img
Comparing Transformation Styles: attach, transform, mutate and within
There are several ways to perform data transformations in R. Each has its own set of advantages and disadvantages. Let’s take one variable, square it and add 100. How many ways might an R beginner screw up such a simple computation? Quite a few! Here’s a data frame with one variable:
> mydata <- data.frame(x = 1:5)
> mydata
  x
1 1
2 2
3 3 ...
5759 sym R (1850 sym/13 pcs) 4 img
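The computation described in the excerpt above (square a variable and add 100) can be sketched with a few base R styles. This is a hedged illustration, not the post's own code: the new column names (`x2`, `x3`, `x4`, `x5`) and the result objects are hypothetical, and `mutate` is left out because it requires an add-on package.

```r
mydata <- data.frame(x = 1:5)

# Style 1: explicit column references on both sides
mydata$x2 <- mydata$x^2 + 100

# Style 2: transform() returns a modified copy of the data frame
mydata_t <- transform(mydata, x3 = x^2 + 100)

# Style 3: within() evaluates the assignment inside the data frame
mydata_w <- within(mydata, x4 <- x^2 + 100)

# A classic beginner trap: after attach(), reading x works, but the
# assignment creates x5 in the workspace, not in the data frame
attach(mydata)
x5 <- x^2 + 100
detach(mydata)
x5_in_mydata <- "x5" %in% names(mydata)
```

After running this, `x5` exists only as a free-standing vector, so `x5_in_mydata` is `FALSE`, while the other three styles each leave the new variable inside a data frame.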
What Analytic Software are People Discussing?
by Robert A. Muenchen. How can we measure the popularity or market share of analytic software? One way is to see what people are discussing. I’m in the process of updating my annual article, The Popularity of Data Analysis Software. Below is the newly updated Internet Discussion section. Don’t bother to read the rest of the main article unles...
6258 sym 10 img
R’s 2012 Growth in Capability Exceeds SAS’ All Time Total
by Robert A. Muenchen. I’m slowly gathering all the data needed to update my ongoing article, The Popularity of Data Analysis Software. The section below is the latest installment. Growth in Capability: The capability of all the software in this article has grown significantly over the years. It would be helpful to be able to plot the growth of e...
3425 sym 6 img
R Tackles Big Garbage
April 1, 2013 – Although the capabilities of the R system for data analytics have been expanding with impressive speed, it has heretofore been missing important fundamental methods. A new function works with the popular plyr package to provide these missing algorithms. Function names in plyr begin with two letters which indicate their input an...
4115 sym 4 img
Knoxville R Users Group Formed, Free Training Offered
R is popular free and open-source software for graphics and data analytics. The Knoxville R Users Group is being formed to help people learn R and improve their skills with it. Three departments of The University of Tennessee are working together to get it started: the Office of Information Technology, the National Institute for Computational S...
1775 sym 4 img