Publications by David Smith
Webinar: High-Performance Analytics with R and Microsoft HPC Server
On April 14 I’ll be giving a new webinar in partnership with Microsoft on High-Performance Computing with R. I’ll be focusing on the new parallel programming capabilities of REvolution R Enterprise 3.1 for Windows, and how to use the features of Microsoft HPC Server to enable computing on clusters. Here’s the complete agenda, and you can re...
Because it’s Friday: Kittens, beware Tufte
Edward Tufte has been a tireless promoter of good infographics, and he’s even taken some controversial steps to rid the world of chartjunk. But now he’s gone too far: Then again, this chart from the Wall Street Journal could lead anyone to felinicide: What’s wrong with a simple bar chart, WSJ? Mark Goetz: My New Wallpaper (via @sarahd23 a...
Charting SVN commits with R
Want to get a quick sense of who are the most active committers to your SVN project? Using just a few lines of R code and the SVN log file, reader and new R user Rhys Kidd created this chart to review commits to the Freespace 2 Source Code Project: Rhys posts the 6 lines of R code to create the plot in this forum post, and I recreate it here (...
R 2.11.0 scheduled for April 22
Announced this morning on the r-announce mailing list is the impending release of R 2.11.0, scheduled for April 22. As usual, the release soon goes into a beta-test phase, with updated sources to be available on the 22nd and binaries to follow a few days thereafter.
Video: Hadley Wickham gives a short course on graphics with R
Hadley Wickham (the creator of the popular ggplot2 graphics package for R) has posted video of a 2-hour short course on Visualisation in R at his blip.tv channel. The video is split into four thirty-minute segments: Basic Graphics; Displaying Large Data; Data manipulation and transformations; Polishing your plots for publication. The course is peppe...
Statistical learning with MARS
Steve Miller at the InformationManagement blog has been looking at predictive analytics tools for business intelligence applications, and naturally turns to the statistical modeling and prediction capabilities of R. Says Steve: The R Project for Statistical Computing continues to dazzle in the open source world, with exciting new leadership at Re...
Future of Open Source Survey – Results
The results of the 2010 Future of Open Source survey were presented at last week’s Open Source Business Conference in San Francisco, and here they are in slide format: While I was at the presentation I captured a few additional tidbits that weren’t in the slides. The continued growth of open-source generally was a p...
Predicting Pizza
What’s the secret to the best pizza in New York? That’s what statistical consultant and R user Jared Lander sought to find out, by analyzing the rankings of NY pizza joints at MenuPages.com, and building a regression model for ratings based on variables like location, price, number of reviews, and pizza-oven type (gas, coal or wood). Here’...
Smoothing time series with R
Smoothing is a statistical technique that helps you to spot trends in noisy data, and especially to compare trends between two or more fluctuating time series. It’s a useful visualization tool that I’m pleased to see cropping up more and more in statistical graphics on the Web — it’s now a staple in econometric charts and is heavily used ...
Scientists misusing Statistics
In ScienceNews this month, there’s a controversial article exposing the fact that results claimed to be “statistically significant” in scientific articles aren’t always what they’re cracked up to be. The article, titled “Odds Are, It’s Wrong”, is interesting, but I take a bit of an issue with the sub-headline, “Science fails to ...