Publications by David Smith

Who did HBGary contact the most?

21.02.2011

Following on from Friday's post about the travails of internet security firm HBGary, R user Michael Bommarito has done an analysis of the leaked emails to find the top 20 most contacted email addresses and the top 20 most referenced internet domains. There are some interesting names on those lists, to be sure. Check them out at the link below. M...

799 sym

Course: Machine Learning with R

22.02.2011

Starting on March 5 at the Hacker Dojo in Mountain View (CA), Mike Bowles and Patricia Hoffmann will present a course on Machine Learning where R will be the “lingua franca” for looking at homework problems, discussing them and comparing different solution approaches. The class will begin at the level of elementary probability and statistic...

981 sym

What’s the best platform for a high score on Canabalt?

23.02.2011

The Web-based Flash game Canabalt, whose scores have been analyzed by R before, is now available as an iOS App. Because the app is configured to work on three different platforms (the iPad, iPhone and iPod Touch), and because players are invited to tweet their best scores at the end of the game, the Twitter stream again becomes a grea...

1742 sym 4 img

Packages for By-Group Processing in R

24.02.2011

Analyst and BI expert Steve Miller takes a look at the facilities in R for doing “by-group” processing of data. The task consisted of: … read several text files, merge the results, reshape the intermediate data, calculate some new variables, take care of missing values, attend to meta data, execute a few predictive models and graph the res...

1917 sym

Setting up a parallel computing cluster for R with OpenSSH and doSNOW

25.02.2011

Responding to yesterday's post, which included an aside on using parallel processing for by-group computations in R, reader Christian Gunning mused about the possibility of using doSNOW on his network, with OpenSSH to manage the authentication: I sit on a fast campus network and have at least 10 remote cores available that I could farm out for bi...

1238 sym

R 2.12.2 is available

25.02.2011

As previously announced, R 2.12.2 is available for download today. Browsing through the various mirrors (using the Download R tool on inside-R.org), it looks like the Windows version is already available on many mirrors; the Mac and Linux versions will follow soon (and of course, sources are available now). The complete list of changes is in the ...

835 sym 2 img

New open-source IDE for R in beta test

28.02.2011

RStudio, a new open-source IDE for R, has just been announced and is now available for beta test on Windows, MacOS, and Ubuntu. It works with an existing installation of R (2.11.1 or above) on your desktop, and is also available as a server version for Linux, where the IDE runs in your browser. It looks like an interesting project, and I look for...

843 sym

Mapping the Chicago mayoral election

01.03.2011

Rahm Emanuel is now Mayor of Chicago, having successfully defended against a court challenge to his candidacy and then defeated 5 rivals in the February 22 election. The Chicago Tribune has put together an interactive map of the results (color-coded by the winner in each precinct), but for R hackers who would like to create similar maps of other elections, the bl...

1211 sym 2 img

Calling R from JasperReports Server

02.03.2011

Revolution Analytics today announced a partnership with Jaspersoft, the makers of the most widely used business intelligence software in the world. With this partnership, Revolution R and Jaspersoft software work together to bring the power of analytics coded in R to business users working with Business Intelligence (BI) dashboards and reports. I...

1507 sym

Keep an Eye on the emerging Open-Source Analytics Stack

03.03.2011

This post is contributed by Revolution Analytics CEO Norman Nie, and cross-posted from the Future of Open Source Forum. A lot of attention has been focused recently on Big Data, and rightly so: Big Data is a Big Deal. (See this LinuxInsider article, Big Data, Big Open Source Tools, for a comprehensive overview of Big Data issues.) But what, exactl...

6583 sym