Publications by David Smith
Oracle’s Big Data Appliance to include R
At the Oracle OpenWorld conference in San Francisco today, Oracle announced the new Oracle Big Data Appliance, “a new engineered system that includes an open source distribution of Apache™ Hadoop™, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R.”...
3647 sym
Slides and replay for "Backtesting FINRA’s Limit Up/Down Rules" available
If you missed last week's webinar on using Revolution R and IBM Netezza to analyze the effectiveness of new rules intended to prevent another financial “Flash Crash”, you can watch a replay by filling in this form. Once the replay begins, you can download the slides by clicking the “Download” button that appears below the media player. Re...
881 sym
Webinar Oct 13: Successful uses of R in Banking
On Thursday October 13, Hong Ooi from ANZ (Australia and New Zealand Banking Group) will give a webinar presentation on Successful Uses of R (along with SAS and Excel) in Banking. We've covered Hong's use of R for credit risk analysis here on the blog before, and in next week's webinar he'll take an in-depth look at applying R and SAS to analys...
1879 sym 1 tbl
In case you missed it: September Roundup
In case you missed them, here are some articles from September of particular interest to R users. The deadline to enter the “R Applications” contest with $20,000 in prizes is October 31. The RHadoop Project, a new collection of open-source R packages from Revolution Analytics, makes it possible to write map-reduce jobs in R to analyze huge ...
3726 sym
Because it’s Friday: Reviews of Random Digits
If you dig around enough on Amazon.com, you can find some pretty odd products (like the Badonkadonk tank now sadly unavailable). Attached to these products you can often find a new form of comedy: the funny Amazon review. The products that attract such attention can be hard to fathom: this gallon of milk has more than 1,000 reviews. (Sample: “...
2652 sym
Top 50 Statistics blogs
TheBestColleges.org has just published their list of the “Top 50 Statistics Blogs of 2011”, and I'm pleased to say that not only did our own Revolutions blog make the list, but it's in fine company with some truly excellent blogs. Several of my personal favourites made the list, including: Guardian columnist Ben Goldacre's Bad Science blog Th...
1405 sym
Slides and replay for "Introduction to R for SAS and SPSS users"
If you missed last week's webinar from Bob Muenchen, “Introduction to R for SAS and SPSS users”, you missed a great overview of the R Project and how it compares to commercial statistical software. Bob's slides are below, and you can download the slides and replay from the Revolution Analytics website. Bob pointed out a couple of really use...
1844 sym
There’s a lot to like about R
I once heard John Chambers (the inventor of the S language, and member of the R Core Group) say, “Show me a programming language no-one complains about, and I'll show you a language no-one uses”. The R language has its fair share of complainants, to be sure — and that's to be expected for a language with more than 2 million users. R user ...
2018 sym
Tomorrow: ACM Data Mining Camp at eBay
If you're in the Bay Area, tomorrow would be a great day to head down to San José for the ACM Data Mining Camp. Hundreds of data scientists, data hackers and data miners will be there for a fun “unconference”, with talks and practical sessions organized on the spot according to demand. Revolution Analytics is proud to be a continuing sponsor...
1129 sym
Implementing K-means clustering for Hadoop in R and Java
At the Bay Area R User Group meeting this week, Antonio Piccolboni gave an overview of the design goals and implementation of the RHadoop Project packages that connect Hadoop and R: rhdfs, rhbase and rmr: (The image above was captured from Antonio's slides.) The most revealing part of the talk for me was the comparison of implementing the K-mea...
1226 sym 2 img