Publications by David Smith
The Netflix Prize, Big Data, SVD and R
One of the key data analysis tools that the BellKor team used to win the Netflix Prize was the Singular Value Decomposition (SVD) algorithm. As a file on disk, the Netflix Prize data (a matrix of about 480,000 members' ratings for about 18,000 movies) was about 65 GB in size, too large to be read into the standard in-memory data model of open-s...
Highlights from R/Finance 2011 presentations
Patrick Burns offers his selections from the presentations at the R/Finance 2011 conference. Check out his post for overviews of some great presentations (and truly, there's some awesome content available to download). I'll add another of my favourites: Bryan Lewis's presentation of his interface from R to the Betfair betting market. (But if you ...
The residuals of crime
Real-estate search website Trulia has a new tool to help you in your choice of a new home: crime maps. With local police forces being much better about sharing data, crime maps are nothing new, but Trulia takes it to the next level with a slick user interface for navigating US cities, a beautiful heat-map visualization of crime hot-spots and, m...
In case you missed it: May Roundup
In case you missed them, here are some articles from May of particular interest to R users. A review of “R Cookbook”, a new how-to book for R programmers. A detailed example of using the RevoScaleR package to analyze a large airline data set. A new guide for R beginners, “How to Learn R”, provides links to R resources, blogs and courses...
R for Data Mining
Statistics and data mining often get bundled together, but (in my opinion) they're generally different practices with different goals. As a language designed for statistics, much of R's core functionality is focused on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best pre...
The ‘Big Analytics’ Revolution Starts with R: Webinar June 14
On Tuesday next week I'll be teaming up with Revolution Analytics' Mike Minelli to give a 30-minute webinar to introduce executives to R, Big Data, and applications of advanced analytics. If there's someone in your company who needs to know about the impact of R on getting value out of data, they can register here. Here's the agenda: The 'Big An...
Real-time Analytics for Capital Markets with Revolution R
In the 2011 edition of the Sybase Capital Markets Guide, Revolution Analytics CTO David Champagne talks about the need for up-to-date analytics in finance, and how you can integrate Revolution R with quality real-time data sources. Here's an excerpt: R represents a radically different approach to the challenges posed by analyzing increasingly la...
The R-Files: Jeroen Ooms
“The R-Files” is an occasional series from Revolution Analytics, where we profile prominent members of the R Community.
Name: Jeroen Ooms
Background: Ph.D. Candidate, Statistics, UCLA
Nationality: Netherlands
Years Using R: 3 1/2
Known for: Developing web applications for popular R packages including ggplot2, lme4, stockplot and irttool
Jer...
Hot Job in IT: Data Science
CIO Magazine today has an article on the “6 Hottest New Jobs in IT” that features Data Science and R at #2: “There's now an intellectual consensus in business that the only way to run an enterprise is to use analytics with data scientists to find opportunities,” says Norman Nie, CEO of Revolution Analytics, which produces the first...
The Big Analytics Revolution starts with R
Thanks to everyone who attended our webinar The 'Big Analytics' Revolution Starts with R yesterday. If you missed the live session, you can download the presentation slides (PDF) and the 30-minute replay video (WMV) from the Revolution Analytics website. The presentation focuses on the issue of Big Data, and how businesses can use advanced anal...