Publications by David Smith
CRAN now has 5000 R packages
Prof. Ripley today announced on the r-devel mailing list that CRAN now has its 5000th R package: Package 'quint' brought the number of packages on CRAN (for all platforms: some are Windows-only or non-Windows only) to 5000 a few minutes ago: see http://cran.r-project.org/web/packages/index.html. That's quite a milestone! The number of CRAN pa...
In case you missed it: October 2013 Roundup
In case you missed them, here are some articles from October of particular interest to R users: Joe Rickert recounts the R presence at the Strata + Hadoop World conference, including slides from the R and Hadoop tutorial. Hadley Wickham's favorite tools, gadgets and software (including of course R). Revolution R Enterprise 7 is announced, wit...
A detailed guide to memory usage in R
R is designed as an in-memory application: all of the data you work with must be hosted in the RAM of the machine you're running R on. This optimizes performance and flexibility, but does place constraints on the size of data you're working with (since it must all work in RAM). When working with large data sets in R, it's important to understand h...
What Data Science can learn from small-data Statistics
Last month I joined Gregory Piatetsky (KDnuggets editor) for a webinar presentation Data Science: Not Just for Big Data, hosted by Kalido. In my portion of the presentation (you can see my slides below), I wanted to react to the Big Data focus which is so much a part of the Data Science movement today, to focus on the issues that with all data sets...
Iterators in R: a tutorial
Iterators — object-oriented programming constructs that act as a pointer in an ordered sequence — are familiar to programmers of languages like Python, but are not a standard part of the R language. Nonetheless, by installing the iterators package (an open-source contribution by Revolution Analytics) you can create and manipulate iterator ob...
Getting started with R, for Stata users
If you learned statistics using Stata software but have an interest in learning the R language, it's worth checking out R~Stata: Notes on Exploring Data by Princeton's Oscar Torres-Reyna. D-Lab's Laura Nelson provides an overview, but in short it's a collection of 30 PDF slides that introduces R for Stata users, and provides translation tables...
The rise of R as the language of analytics
It's no coincidence that while the usage of the R language is skyrocketing (as shown in the recent Rexer Analytics and KDnuggets polls), the growth in data scientist jobs is also skyrocketing. R is the lingua franca of data science, and as the pervasive statistical software in the academic sector, there's a steady stream of newly-minted graduates...
Try out R online with R-Fiddle
It's pretty easy (and free!) to download R and install it on your own PC, Mac or Linux machine, but if you don't have one of those or simply aren't ready to commit to installing it, you can now try it out online. R-Fiddle (from DataMind) provides an easy-to-use interactive R console that you can run from your browser. Here's an example: The top ...
Happy Thanksgiving from Revolution Analytics
> require(devtools)
> install_github("cowsay","SChamberlain")
> require(cowsay)
> say("Happy Thanksgiving!",by="chicken")
 ----- Happy Thanksgiving! ------ \ \ _ _/ } `>' \ `| \ | /'-. .-. \' ';`--' .' \'. `'-./ '.`-..-;` `;-..' ...
Tutorial: Basic data processing with R
R can do a lot of really amazing things, but to use just about any of R's many features you need to first import your data and get it into the appropriate shape. For R beginners, this “data wrangling” task can be daunting. Fortunately, ComputerWorld's Sharon Machlis has created an in-depth tutorial for many data preparation tasks, which is...