Publications by David Smith
The tools in an R package developer’s toolbox
Yihui Xie is the creator of several popular R packages, including knitr, animation and cranvas. In an interview with The Setup, he shares some of the software and hardware he uses in his day-to-day work, including (of course) R: For programming and data analysis, I primarily use R since I'm a statistician. I have created a bunch of R packages i...
Real-Time Predictive Analytics with Big Data, and R
Can R be used for real-time applications? Absolutely! The key is in setting up a technology stack that can support real-time interactions with models developed in R … and a clear understanding of what “real-time” really means, and its implications in the context of Big Data. I explained how this works in yesterday's webinar, Real-Time Pre...
Because it’s Friday: Evolution of a research paper about Reddit
Computer Science PhD student Tim Weninger wrote a 10-page paper for the World Wide Web conference looking at how Reddit users interact on the discussion pages of the social news site. During the process, he saved 463 revisions of the paper in a source-code control system. Then, he wrote a computer program to animate each revision of the paper. Th...
Tutorial: How to make NYT-style bar charts with R
New York Times columnist Charles Blow needed a chart to accompany his op-ed piece Lincoln, Liberty and Two Americas (about one-party control in state legislatures). So he turned to resident graphic editor Kevin Quealy, who found the source data and used R to create the chart below: If you'd like to create similar charts yourself, Kevin provi...
Big Data Trees with Hadoop HDFS
Last month's release of Revolution R Enterprise 6.1 added the capability to fit decision and regression trees on large data sets (using a new parallel external memory algorithm included in the RevoScaleR package). It also introduced the possibility of applying this and the other big-data statistical methods of RevoScaleR to data files distributed ...
Shiny released to CRAN; Shiny Server coming soon
The shiny package, the R package from RStudio that makes it easy to build simple interactive interfaces for R scripts, is now available on CRAN. This will make it easier for R programmers to install and use shiny, and to run the interfaces they create from a local web browser. The next step is to be able to publish interactive interfaces for othe...
Learn R by trying R
By Revolution Analytics training manager James Peruvankal If you are new to R, and want to get an introduction to the R language, in the classic “learning by doing” way, Code School and O’Reilly have put together the Try R interactive tutorial. This tutorial is a painless introduction to the R programming language. During the course y...
R analysis shows how UK health system could save £200m
According to an analysis by Prescribing Analytics (a joint venture of technologists and doctors in the UK), Britain's cash-strapped National Health Service (NHS) is overspending on prescription drugs. While cheaper (but equally effective) generic drugs are widely available for many treatments, some doctors continue to prescribe patented drugs wh...
Four years of the Revolutions Blog
Yesterday was the fourth anniversary of the Revolutions blog. Our first post was way back on December 9, 2008, and in the four years since we've been regularly posting about R, open source, statistics, big data, data science and other random things that happened to catch our eye. In fact, there have been 1488 posts published in the last four yea...
Videos from Coursera’s four week course in R
Coursera's Computing for Data Analysis course on R is now over, with four weeks of free, in-depth training on the R language. While you'll have to wait for the next installment of the course to participate in the full online learning experience, you can still view the lecture videos, courtesy of course presenter Roger Peng's YouTube page. The cou...