Publications by David Smith
Dress your R code for the Web with Pretty R
If you have some R code to include in a document, especially a Web-based document like a blog post, the new “Pretty R” feature on inside-R.org can help you make it look its best. Given some raw R code, it will create an HTML version of the code, adding syntax highlighting elements and links. Functions, strings, comments and literals are all co...
1755 sym R (607 sym/1 pcs) 2 img
ACM Data Mining Camp 3
The San Francisco Bay Area chapter of the ACM will hold its third data mining camp next Saturday (November 13) at the eBay campus in San José. Like the previous camps, this will be a one-day “unconference”-style event, with an agenda developed ad hoc on the day according to the interests of the attendees. With data scientists from the like...
1097 sym
Because it’s Friday: Epidemiology in 1632
I first got interested in epidemiology when I saw the famous John Snow chart (in a Tufte book, I think?) which pinpointed the pump which caused the 1854 cholera outbreak in London. For some reason I'd gotten the impression that this was essentially the birth of epidemiology as a discipline, but it's actually been around a lot longer than that. 20...
1251 sym 2 img
The Dataists answer your questions
The fine bloggers (and R experts) at the Dataists have volunteered to answer questions about data analysis on Reddit: A few months ago, a group of likeminded folks in New York and the San Francisco Bay area decided it was time to start a blog about data, and we came up with the Dataists. Since then we thought about a taxonomy of data science, c...
1249 sym
Using R and Hadoop to analyze VOIP data
Last month, the newest member of Revolution's engineering team, Saptarshi Guha, gave a presentation at Hadoop World 2010 on using R and Hadoop to analyze 1.3 billion voice-over-IP packets to identify calls and measure call quality. Saptarshi, of course, is the author of RHIPE, which lets R programmers write map-reduce algorithms in the Hadoop fra...
1199 sym
New R User Group in Houston
The latest local R user group to form is located in Houston, Texas. The first meeting of the Houston R Users Group is tonight at Rice University (in conjunction with the Houston chapter of the ASA). R hackr (typo intended!) Hadley Wickham will be giving a presentation on writing R packages, and you can check out the slides on his SlideShare page....
881 sym
Promote your favorite R functions
The 27 base and recommended libraries of the standard R 2.12 distribution together contain 3556 functions (you can check using the code posted after the jump). Many of the functions are commonly used: c, data.frame, rnorm, lm. But some of those functions, while being extremely useful, may be less well known to many R users. Some examples I'd wish...
1414 sym R (569 sym/1 pcs)
R co-creator Ross Ihaka wins Lifetime Achievement Award in Open Source
The co-creator of R, University of Auckland Associate Professor of Statistics Dr. Ross Ihaka, was yesterday awarded the Catalyst Lifetime Achievement in Open Source Award at the 2010 New Zealand Open Source Awards. From the announcement: Dr. Ihaka is one of the originators of the world-renowned ‘R’ programming language and software environme...
1099 sym
Tell Forbes how you use R
Steve McNally of the Forbes Mean Business blog says R is a name you need to know for 2011. He cites some great examples of R in action: Facebook has used R to figure out that “just two data points are significantly predictive of whether a user remains on Facebook: (i) having more than one session as a new user, and (ii) entering basic profile ...
1624 sym
Help Mozilla visualize how people use Firefox
You might recall this chart we posted a couple of weeks ago, summarizing the times of day Firefox users switch on Private Browsing mode: The chart, based on data from the Mozilla Test Pilot program, tells an interesting story about the habits of Web users. But what other interesting stories could be told, to reveal more insights into how peopl...
2073 sym 2 img