Publications by David Smith
Slides and replay from "R and Hadoop" webinar
So … there's clearly a lot of interest in integrating R and Hadoop. Today's webinar was a record-setter for Revolution Analytics, with more than 1000 people signing up to learn how to access Hadoop data from R with the packages from the open-source RHadoop project. If you didn't catch the live webinar, don't fret: the slides and replay are avai...
Are new SEC rules enough to prevent another Flash Crash?
At 2:42PM on May 6 2010, without warning, the Dow Jones Industrial Average plunged more than 1000 points in just 5 minutes. It remains the biggest one-day decline in this stock market index in history. On an intra-day basis, anyway: by the end of the day, the market had regained 600 points of the drop. At the time, the cause of the 2010 Flash C...
Data Visualization doesn’t need to be biased
At the FlowingData blog, data visualization commentator and Visualize This author Nathan Yau lists 5 misconceptions about visualization: Software does everything (Nathan notes “Personally, I use a lot of R and have a lot of fun in Illustrator”, but uses a lot of other tools as well.) Visualization is for making data flashy The more informa...
Revolution Analytics partners with Cloudera
Revolution Analytics today announced that it has partnered with Cloudera, the leader in Apache Hadoop-based software and services, to make big-data analytics with Hadoop and R available to Revolution R Enterprise users. As we announced earlier this month, we have created three open-source R packages which make it possible for R users to write ma...
Five new local R user groups
Looks like there's been a lot of activity in the R user community in the Northern hemisphere now that the summer break is over. I've just added several new groups to the Local R User Group Directory: Tokyo, Japan: The Tokyo.R R study group has already had 17 meetings, but has just been added to the directory. Shanghai/East China: The Shangha...
Data Science: a literature review
Just what is Data Science, anyway? Here's one take: Ever since the term “Data Scientist” was coined by DJ Patil and Jeff Hammerbacher in 2008, there's been a vigorous debate on what the term actually means. More than 80% of statisticians consider themselves data scientists, but Data Science is more than just Statistics. (My own take is tha...
The R Graph Gallery goes social
The R Graph Gallery, the website from Romain François that showcases hundreds of examples of data visualization with R, has new social features. Now, when you come across a graph or chart you find appealing or useful, you can “Like” it on Facebook or “+1” it on Google+. This should be a great way of highlighting the best charts and graphs in th...
A brief introduction to R for SAS and SPSS users
If you've used SAS or SPSS and want a jump-start into the basics of the popular R language, next week's webinar, Introduction to R for SAS and SPSS Users, will be of interest to you. While R, SAS and SPSS are all software systems for data analysis and graphics, the underlying concepts in R are quite different to those in SAS and SPSS. To ...
Obama 2012 campaigning with analytics
The campaign to re-elect US president Barack Obama is hiring — and the RDataMining blog noticed that several of the open positions seek R skills. If you want to be a Communications Analyst, Digital Strategy Analyst, or Statistical Modeling Analyst and you know R, there may be a job opening for you. Just goes to show there's no corner of life ...
R 2.13.2 released
The R core team announced today that R 2.13.2 is now available: The byte pixies have rolled up R-2.13.2.tar.gz at 9:00 this morning. This is intended to be the final release of the 2.13 series, for the benefit of those apprehensive of putting 2.14.x into production use. This update fixes a number of minor bugs (for example, pch="." will g...