Publications by David Smith

CRAN now has 5000 R packages

08.11.2013

Prof. Ripley today announced on the r-devel mailing list that CRAN now has its 5000th R package: Package 'quint' brought the number of packages on CRAN (for all platforms: some are Windows-only or non-Windows only) to 5000 a few minutes ago: see http://cran.r-project.org/web/packages/index.html. That's quite a milestone! The number of CRAN pa...

1584 sym 2 img

In case you missed it: October 2013 Roundup

11.11.2013

In case you missed them, here are some articles from October of particular interest to R users: Joe Rickert recounts the R presence at the Strata + Hadoop World conference, including slides from the R and Hadoop tutorial. Hadley Wickham's favorite tools, gadgets and software (including of course R). Revolution R Enterprise 7 is announced, wit...

2756 sym

A detailed guide to memory usage in R

12.11.2013

R is designed as an in-memory application: all of the data you work with must be hosted in the RAM of the machine you're running R on. This optimizes performance and flexibility, but does place constraints on the size of data you're working with (since it must all fit in RAM). When working with large data sets in R, it's important to understand h...

1717 sym

What Data Science can learn from small-data Statistics

15.11.2013

Last month I joined Gregory Piatetsky (KDnuggets editor) for a webinar presentation, "Data Science: Not Just for Big Data", hosted by Kalido. In my portion of the presentation (you can see my slides below), I wanted to react to the Big Data focus which is so much a part of the Data Science movement today, to focus on the issues that with all data sets...

1534 sym

Iterators in R: a tutorial

18.11.2013

Iterators — object-oriented programming constructs that act as a pointer in an ordered sequence — are familiar to programmers of languages like Python, but are not a standard part of the R language. Nonetheless, by installing the iterators package (an open-source contribution by Revolution Analytics) you can create and manipulate iterator ob...

1281 sym

Getting started with R, for Stata users

19.11.2013

If you learned statistics using Stata software but have an interest in learning the R language, it's worth checking out R~Stata: Notes on Exploring Data by Princeton's Oscar Torres-Reyna. D-Lab's Laura Nelson provides an overview, but in short it's a collection of 30 PDF slides that introduces R for Stata users, and provides translation tables...

1093 sym 2 img

The rise of R as the language of analytics

22.11.2013

It's no coincidence that while the usage of the R language is skyrocketing (as shown in the recent Rexer Analytics and KDnuggets polls), the growth in data scientist jobs is also skyrocketing. R is the lingua franca of data science, and as the pervasive statistical software in the academic sector, there's a steady stream of newly minted graduates...

1898 sym

Try out R online with R-Fiddle

25.11.2013

It's pretty easy (and free!) to download R and install it on your own PC, Mac or Linux machine, but if you don't have one of those or simply aren't ready to commit to installing it, you can now try it out online. R-Fiddle (from DataMind) provides an easy-to-use interactive R console that you can run from your browser. Here's an example: The top ...

1274 sym 2 img

Happy Thanksgiving from Revolution Analytics

28.11.2013

> require(devtools)
> install_github("cowsay","SChamberlain")
> require(cowsay)
> say("Happy Thanksgiving!",by="chicken")
----- Happy Thanksgiving! ------ \ \ _ _/ } `>' \ `| \ | /'-. .-. \' ';`--' .' \'. `'-./ '.`-..-;` `;-..' ...

565 sym R (376 sym/1 pcs)

Tutorial: Basic data processing with R

02.12.2013

R can do a lot of really amazing things, but to use just about any of R's many features you need to first import your data and get it into the appropriate shape. For R beginners, this “data wrangling” task can be daunting. Fortunately, ComputerWorld's Sharon Machlis has created an in-depth tutorial for many data preparation tasks, which is...

1473 sym