Publications by David Smith
R 2.12.1 scheduled for December 16
The next update to R will be a patch release: R 2.12.1 will be released on December 16, as announced today by the R Core Team. As is typical for a patch release, this version will include some minor bug fixes plus a few new features (from the current build's NEWS file): The DVI/PDF reference manual now includes the help pages for all the standar...
1570 sym
Webinar: Revolution R is 100% R and More
I'll be hosting a webinar tomorrow (Wednesday) aimed at R users who want to know more about how Revolution R Enterprise extends open source R for big data, Web services, multi-core processing, debugging and more. For R users at schools and universities, I'll also explain how you can download and use Revolution R Enterprise free of charge. The ful...
2168 sym 1 tbl
Slides from Revolution R: 100% R and More
If you missed today's webcast on Revolution R Enterprise: 100% R and more, the slides from the presentation are now available for download, and a replay of the webcast (in WMV format) will be available at that same link very soon. And if you missed some of the links I mentioned in the presentation, here they are for your convenience: Overview of...
1480 sym
Choosing colors for your charts with RColorBrewer
If you're creating a bar chart in R, how do you decide what colors the bars should be? Or if you're creating an image plot, what range of colors should you use? The colors you choose not only affect the viewer's interpretation of the graphic but also determine its aesthetic appeal. That's where the RColorBrewer package comes in: it h...
1156 sym 2 img
An R interface to the Google Prediction API
At the New York R User Group* last night, 100 R users heard Ni Wang and Max Lin explain how “R is one of the important tools used by analysts and engineers at Google for analyzing data”. During the talk, Lin revealed that Google plans to make “R more integrated with internal machine learning algorithms and infrastructure”, and one co...
2160 sym R (645 sym/1 pcs)
Machine Learning and Data Mining with R
The San Francisco Bay Area ACM runs several courses on data mining and machine learning with R. Machine Learning 101 deals primarily with supervised learning problems, and Machine Learning 102 covers unsupervised learning and fault detection. Machine Learning 101 & 102 were most recently presented by Mike Bowles & Tricia Hoffman in September, an...
2499 sym
Facebook’s Social Network Graph
Paul Butler, an intern on Facebook’s data infrastructure engineering team, was interested in visualizing the “locality of friendship”. Luckily, he has some great data to work with: Facebook's social network of the friendships between its 500 million members. But visualizing that much data can be a challenge in its own right — it takes ski...
2973 sym 2 img
Data Driven Journalism
Last night at the Bay Area UseR Group meeting, Peter Aldhous, San Francisco Bureau Chief of New Scientist Magazine, gave an inspiring presentation about Data Driven Journalism. Even though the newspaper industry is faltering as a business model, there's a beacon of light: journalists can be the driving force behind bringing the meaning in the hug...
3001 sym
R 2.12.1 is out
As promised, the latest patch to R is out with the release of R 2.12.1, as announced today by the R Core Team. If you build R yourself, sources are available now at your local CRAN mirror, and binaries for Windows, Mac and Linux will be available in the next few days. There are a few new features: The DVI/PDF reference manual now includes the he...
1712 sym
Programming languages, ranked by popularity
In a presentation to the Chicago R User Group last night, Drew Conway used his new Infochimps package in R to assess the relative popularity of programming languages. Drew used the word.stats function in the Infochimps package to count the frequency of common computer languages mentioned in Twitter messages, and displayed the results in this bar ...
2057 sym 4 img