Publications by David Smith
Poll: R top language for data science three years running
KDnuggets has completed its annual poll of top languages for analytics, data mining and data science, and just as in the prior two years the R language is ranked the most popular. R is used by almost 61% of respondents, and its usage also grew year over year, up 16% compared to the 2012 poll. By contrast, the rate of SAS usage has been flat, at ...
948 sym 2 img
How to build a single-node Hadoop/R system
The best way to learn any software is to use it, and if you're new to Hadoop and want to try using Hadoop with R, the process of setting up your own Hadoop cluster can be daunting (to say the least). But if learning is the goal, the key is that you don't need to install a full cluster. All you need is your own machine, and the ability to install s...
1574 sym
Two presentations on Big Data Big Analytics with R
Last week, Revolution Analytics' US Chief Scientist Mario Inchiosa gave a presentation on high-performance predictive analytics in R and Hadoop, showing how Revolution R Enterprise 7 will bring the high-performance predictive algorithms of ScaleR to run on Cloudera and Hortonworks Hadoop clusters, while retaining the same easy-to-use interface f...
1331 sym
R in the news
R has been featured in a couple of recent articles in the tech press. Last month, Data Informed's feature article 5 Key Considerations When Choosing Open Source Statistics Software suggested R for its analytics capabilities: Certainly, the statistical language R, for instance, is these days hugely popular—not least because it’s free, rather...
1703 sym
An animated peek into the workings of Bayesian Statistics
One of the practical challenges of Bayesian statistics is being able to deal with all of the complex probability distributions involved. You begin with the likelihood function of interest, but once you combine it with the prior distributions of all the parameters, you end up with a complex posterior distribution that you need to characterize. Si...
2376 sym 2 img
Coursera’s free R courses are running again soon
The massive open online course (MOOC) platform Coursera has already delivered two essential free courses for anyone who wants to learn the R language. Computing for Data Analysis, presented by Roger Peng, covers the basics of R programming. The follow-up course Data Analysis, presented by Jeff Leek, covers statistical modeling and data visuali...
1512 sym
In case you missed it: August 2013 Roundup
In case you missed them, here are some articles from August of particular interest to R users: A tutorial on parallel programming with the foreach, doMC and doSNOW packages. Joe Rickert reviews R's capabilities for linear algebra, sparse matrices and big matrices. How R is disrupting the insurance industry with big data. Revolution Analytics ...
2929 sym
Results of survey of statisticians at JSM 2013 conference
During the 2013 JSM (Joint Statistical Meetings) Conference in Montreal, Revolution Analytics conducted a survey of attendees from August 5 to August 8. The 865 respondents gave their opinions on the privacy and ethics related to data collection, and on their familiarity with statistical software used for the analysis of such data. Out of the 865 ...
2865 sym R (1038 sym/7 pcs) 8 img
Putting the R in Cloudera and Hortonworks Hadoop
Datanami interviews Revolution Analytics' Bill Jacobs about the upcoming Revolution R Enterprise 7, which will be available later this year. A key feature of this release is that the big-data predictive analytics R functions in the ScaleR package will run on data situated in a Hadoop cluster, and use the parallel computational power of the H...
3170 sym
Revolution Newsletter: September 2013
The most recent edition of the Revolution Newsletter is now available. In case you missed it, the news section is below, and you can read the full September edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Rrrr, Mateys! September 19th is Int...
6909 sym