Publications by Joseph Rickert

ACM Data Mining Camp 2011: Report

18.10.2011

(By Joseph Rickert.) In San Jose, topics like big data, MapReduce, predictive models, mobile analytics and crowdsourcing draw a crowd even on a Saturday. So it turned out that the ACM Data Mining Camp and “un-conference” was a very “happening” way to spend a Saturday. Over 500 people attended the event at the eBay “Town Hall” on North...

4452 sym

Review of "The Art of R Programming" by Norman Matloff

29.11.2011

By Joseph Rickert. Anyone seeking to learn R faces two major challenges: (1) learning how to swim in the sea of information (R packages, books, websites, blog posts, message boards, etc.) that threatens to drown a newbie and (2) coming to grips with the structure, syntax and features of the language itself. Having some idea of what one wants to ...

3895 sym

The Bay Area R User Group Meeting on Data Mining with R

16.12.2011

By Joseph Rickert. Put up a poster that says something like “Data Mining with R” anywhere in the Bay Area and you will surely draw a crowd. But it was still a bit of a surprise that the monthly meeting of the Bay Area R User Group was so well attended. At one point there were 160 people on the meetup list signed up to attend the event, and...

5235 sym

Review of ‘R in Action’ by Robert I. Kabacoff

20.12.2011

By Joseph Rickert. Yesterday, the cosmic randomizer placed me next to a newly minted lawyer in a crowded Los Gatos coffee shop. In three minutes of conversation I learned that the fellow was interested in corporate law, was about to take a job that would give him a seat in the great VC/start-up game and that he had some understanding of statis...

3761 sym R (254 sym/1 pcs) 2 img

Coefplot: New Package for Plotting Model Coefficients

03.01.2012

By Joseph Rickert. Even to the practiced eye, looking at coefficients in R model summaries can be tedious. And capturing information about the significance of coefficients from scores or maybe even hundreds of models, in a way that makes writing the final report a bit easier, is a time-consuming and thankless task. Of course, once you know what you...

2398 sym 2 img 1 tbl

Simple tools for building a recommendation engine

19.04.2012

By Joseph Rickert. Revolution’s resident economist, Saar Golde, is very fond of saying that “90% of what you might want from a recommendation engine can be achieved with simple techniques”. To illustrate this point (without doing a lot of work), we downloaded the million-row movie dataset from www.grouplens.org with the idea of just taking the fi...

5076 sym 3 tbl

Simulating the Birthday Problem with data derived probabilities

06.06.2012

You've probably heard of the Birthday Paradox: it only takes a small gathering of people before it's quite likely that two of them share the same birthday. You can solve the problem analytically or with simulation, but in either case simplifying assumptions are usually made (no one born on February 29, for example). Joe Rickert uses Revolution R ...

6667 sym R (489 sym/1 pcs) 8 img

Benchmarking bigglm

13.11.2012

By Joseph Rickert. In a recent blog post, David Smith reported on a talk that Steve Yun and I gave at STRATA in NYC about building and benchmarking Poisson GLM models on various platforms. The results presented showed that the rxGlm function from Revolution Analytics’ RevoScaleR package running on a five-node cluster outperformed a Map Reduce/ H...

5113 sym R (1133 sym/3 pcs)

A Review of the R Graphics Cookbook

11.02.2013

A common criticism of R, especially from data scientists who are new to R but proficient in multiple programming languages, is that R is “quirky” and annoying because there is almost always more than one way to do simple things.  I usually counter that they are trying to say that R is “flexible” and “rich”, but by the time we get aro...

6330 sym 4 img

Data Science Education gets personal

14.03.2013

By Joseph B. Rickert. It is difficult to imagine that there is anyone on the planet with an internet connection and a desire to learn something new who has not at least looked into taking a massive open online course (MOOC). Last fall, in an 11/4/12 article, the New York Times declared the Year of the MOOC and quoted one of Coursera’s founders, A...

4435 sym