Publications by David Smith
Surveys continue to rank R #1 for Data Mining
KDnuggets recently posted its annual poll on data mining software, and the R language retains its #1 ranking as the most commonly used software for data mining: R is now used by 52.5% of poll respondents, compared with 45% last year. Donnie Berkholz provides an analysis of the year-on-year trends for Redmonk. He provides the chart below, and no...
Predicting the 100m sprint: results
Last week, Markus Gesmann used a log-linear model in R to predict the Olympic gold-medal-winning 100m sprint time to be 9.68 seconds. The actual time was 9.63 seconds. Not bad! Meanwhile, the New York Times put Usain Bolt's Olympic record in context, comparing him in a virtual race with other gold-medal winners over the past century (via Flowing...
An analysis of the r-help mailing list
Even though forums and question-and-answer services like StackOverflow are emerging as the place to find crowdsourced technical help when using software like R, the traditional r-help email list is still going strong. UCLA grad student and R user Richard Kwock presented a poster at last month's JSM conference with an analysis of traffic on the ...
The top 10 critical packages on CRAN
While most R packages on CRAN are designed to be used by an R user directly, a few packages are designed to be used by other package developers. (And some packages are so useful that they're regularly used by both camps.) When a package author publishes a package to CRAN, she must list those packages that provide functions her package uses (this ...
Cheat sheet for prediction and classification models in R
Ricky Ho has created a 6-page PDF reference card on Big Data Machine Learning, with examples implemented in the R language. (A free registration to DZone Refcardz is required to download the PDF.) The examples cover: predictive modeling overview (how to set up test and training sets in R); linear regression (using lm); logistic regress...
In case you missed it: July 2012 Roundup
In case you missed them, here are some articles from July of particular interest to R users. The Environmental Performance Index website uses R to rank countries by measures like environmental health and ecosystem vitality. A log-linear regression in R predicted the gold-winning Olympic 100m sprint time to be 9.68 seconds (it was actually 9.63 sec...
New R User Groups in San Antonio, Milwaukee, Nicaragua
We have three new local R user groups to announce this month. The Alamo City R Users Group in San Antonio becomes the fifth R user group in Texas. The group's just getting started, and volunteers are always welcome. Although not a dedicated R group, the Milwaukee Chapter of the ASA hosts occasional R workshops. In May next year, they will h...
Is gas cheaper than it used to be?
Biostatistician and R user Matt Cooper noticed recently that the price he pays for petrol (gasoline) at the pump in Perth, Australia was about the same as he was paying four years ago. Nonetheless, inflation has marched on over the years, so does that mean petrol is effectively cheaper now than it used to be? And how does the price of gas today c...
New Revolution Analytics office in Singapore
We're excited to announce the latest outpost of the Revolution Analytics team, with the opening of a new office in Singapore! This office will serve as the local HQ for Revolution Analytics, serving our customers in the Asia-Pacific region. It was opened with the support of the Infocomm Development Authority of Singapore, which is ...
How Williams Sonoma uses R to target customers online
If you live in the US, you've probably visited a Williams Sonoma store for gourmet food or quality cookware for the kitchen. And if you've shopped at Pottery Barn or West Elm stores for furniture, those chains are part of the Williams Sonoma stable as well. All three brands have major online stores, all supported by a sophisticated marketing oper...