Publications by David Smith

Data Mining with R

08.06.2012

Earlier this week, Revolution Analytics' Joe Rickert gave a webinar, "Introduction to R for Data Mining". You can watch the replay below. If you're already familiar with R and the basics of data mining, you might want to skip ahead to the 13-minute mark, where Joe's live demo begins. There you'll see practical examples of using R for decision trees,...

Data distillation with Hadoop and R

11.06.2012

We're definitely in the age of Big Data: today, there are many more sources of data readily available to us to analyze than there were even a couple of years ago. But what about extracting useful information from novel data streams that are often noisy and minutely transactional … aye, there's the rub. One of the great things about Hadoop i...

In case you missed it: May 2012 Roundup

13.06.2012

In case you missed them, here are some articles from May of particular interest to R users. R tops the annual KDNuggets Data Mining Software poll for the first time. R 2.15.1 is scheduled for June 22. (Revolution R Enterprise 6, released on June 5, is based on 2.14.2.) A tutorial uses R, Hadoop, and the RHadoop project to simulate and analyze a...

Revolution Newsletter: June 2012

14.06.2012

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full June edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Revolution R Enterprise 6 Now Available! The latest release of Revolution R...

More on birthday probabilities

15.06.2012

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year. Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. Chris extracted the birthday frequencies using Google BigQuery, and charted ...

June 20: See the new features of Revolution R Enterprise 6

18.06.2012

A quick heads-up that I'll be hosting a live webinar this Wednesday (June 20) with my colleague Sue Ranney on the new Revolution R Enterprise 6. If you've never taken a look at Revolution R Enterprise and want to know how it's different from open-source R, or just want to learn about the new features, then please join us on Wednesday by registering at...

CIO.com: R is a Big Data open-source technology to watch

19.06.2012

CIO.com recently published its list of 9 open-source technologies to watch. Hadoop is first on the list, and second up is the R Project: R is an open source programming language and software environment designed for statistical computing and visualization. R was designed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zeala...

UseR 2012 highlights

20.06.2012

The eighth annual R user conference, UseR! 2012, has come and gone — and what an event it was! I’ve been to five useR! conferences so far, and each one improves upon the last. This year’s conference at Vanderbilt was the best so far: an outstanding location (my first visit to Nashville, a great city), excellent facilities (the lecture rooms...

FDA: R OK for drug trials

21.06.2012

In a poster (PDF) presented at the UseR 2012 conference, FDA biostatistician Jae Brodsky reiterated the FDA policy regarding software used to prepare submissions for drug approvals with clinical trials: Sponsors may use R in their submissions. The FDA does not endorse or require any particular software to be used for clinical trial submissions, ...

R 2.15.1 includes performance improvements inspired by dataframe package

22.06.2012

The latest update to open-source R, R 2.15.1, was released this morning. (You can grab sources now, and binary versions will hit the CRAN mirrors over the next couple of days.) In addition to several new features and bug fixes (including the new globalVariables function, which will be a boon to package developers), this update also includes some...
