Publications by David Smith
A visual data summary for data frames
If you want to get a quick numerical summary of a data set, the summary function gives a nice overview for data frames:

> require(ggplot2)
Loading required package: ggplot2
> data(diamonds)
> summary(diamonds)
carat cut color clarity depth table
Min. :0.2000 Fair : 1610 D: 6775...
The grade level of Congress speeches, analyzed with R
As widely reported by CNN, the Huffington Post, and Talking Points Memo, the sophistication of speeches by US politicians has declined in recent years, dropping from an 11th-grade level in 2005 to a 10th-grade level today. The reports draw on an analysis by the Sunlight Foundation, based on textual analysis of congressional speeches given since...
NYT charts the Facebook IPO with R
In conjunction with Facebook's record-setting IPO last Thursday, the New York Times created an infographic to put the size of the offer in context with other recent IPOs. A detail of the graphic as it appeared in the print edition appears below: ChartsNThings gives a fascinating peek into the weeklong process that went into creating this chart, ...
Facebook-class social network analysis with R and Hadoop
In computing, social networks are traditionally represented as graphs: a collection of nodes (people), pairs of which may be connected by edges (friend relationships). Visually, the social networks can then be represented like this: Social network analysis often amounts to calculating the statistics on a graph like this: the number of edges (fri...
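To make the graph representation concrete, here is a minimal base-R sketch that stores a tiny social network as an edge list and computes two simple statistics: the total number of edges and each person's number of connections (degree). The names and friendships are invented for illustration, and the variable names (edges, nodes, n_edges, degree) are assumptions, not taken from the original post, which may well use a dedicated graph package or Hadoop instead.

```r
# Hypothetical friendship data: each row is one undirected edge between two people.
edges <- data.frame(
  from = c("Ann", "Ann", "Bob", "Cat"),
  to   = c("Bob", "Cat", "Cat", "Dan"),
  stringsAsFactors = FALSE
)

# Nodes are the distinct people appearing at either end of an edge.
nodes <- sort(unique(c(edges$from, edges$to)))

# Total number of edges (friend relationships) in the graph.
n_edges <- nrow(edges)

# Degree: how many edges touch each node, i.e. each person's friend count.
degree <- table(factor(c(edges$from, edges$to), levels = nodes))

print(n_edges)
print(degree)
```

An edge list like this is convenient for small examples because every statistic reduces to counting rows. For graphs at the scale the post title suggests, the same counting would presumably be distributed rather than done in memory.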
R 2.15.1 scheduled for June 22
The next release of open-source R, codenamed “Roasted Marshmallows”, is scheduled to be released on June 22, according to this announcement on the r-announce mailing list. Don't expect too many changes in this update: despite the fact that “there have been very few issues with 2.15.0 … some people may be waiting superstitiously for a .1 ...
R Tops Data Mining Software Poll
For the past 12 years, KDNuggets has conducted an annual poll asking “What analytics/data mining software you used in the past 12 months for a real project (not just evaluation)”. In this year's poll, R was the top-ranked data mining solution, selected by 30.7% of poll respondents. Microsoft Excel was second, at 29.8%. Rapidminer, which took ...
The influences that shaped R: Inferno-ish R
Patrick Burns, author of the excellent R Inferno, gave a presentation about R at the Cambridge R User Group this week. (Revolution Analytics is a proud sponsor of CambR.) I wasn’t at the presentation myself, but Pat always gives a great talk, and he’s generously provided his slides with copious notes. They’re definitely worth a read if ...
Applications of R in Government
Following the announcement of the US Government Big Data Initiative, I was asked to write a small article about applications of R in government. The article has just appeared in Government Security News (and I believe will appear in their daily newsletter tomorrow). In the article, I highlighted several R applications that have been highlighted here ...
Announcing Revolution R Enterprise 6.0
Revolution Analytics is proud to announce the latest update to our enhanced, production-grade distribution of R, Revolution R Enterprise. This update expands the range of supported computation platforms, adds new Big Data predictive models, and updates to the latest stable release of open source R (2.14.2), which improves performance of the R i...
June 29: The Royal Statistical Society talks R
On June 29, the Royal Statistical Society will host six hours of presentations about R, under the banner “To R, or not to R, that is the question”. Speakers include R Core member Prof Uwe Ligges, Dr Wayne Jones (from Shell UK), and Dr Peter Nash (from Imperial College London), among others. The event takes place June 29 at the RSS's histori...