Publications by David Smith

Ready-made model comparison tables for journals

15.10.2012

If you're reporting on the results of a statistical analysis for a journal or report, you'll probably be building a table comparing two or more models. Such tables may include the variables in the model, parameter estimates, p-values, and model summary statistics. If you want to include such tables based on lm, glm, svyglm, gee, gam, polr, survreg or ...


Vendor news: TIBCO’s proprietary R runtime; Teradata’s appliance integrates R

17.10.2012

In a webinar today previewing Spotfire 5 (scheduled for release this November), TIBCO announced that it will include TERR: the TIBCO Enterprise Runtime for R. TERR is a closed-source reimplementation of the R language engine, and is not based on the GPL-licensed R project from the R Foundation. Here's the relevant slide from the webinar: By making ...


The rapidly increasing ideology of the US Republican Party

18.10.2012

The chart below comes by way of the is.R blog and shows the average ideology of the members of the United States House of Representatives within the Republican (red) and Democratic (blue) parties. (Other parties are shown in green.) The chart is shown as a time series, from the first US Congress in 1789, to the most recent full Congress (the 111th...


Because it’s Friday: 7 billion-person ‘continents’

19.10.2012

The population of the world has been over 7 billion for about a year now. But those seven billion aren't distributed equally around the globe. 1.2 billion people live in India alone (despite it having just 2% of the world's land area). At the other end of the spectrum, the entire continent of Australia houses about 0.3% of the world's population. So wha...


Eight new R User Groups worldwide

22.10.2012

There are new local R user groups in eight (!) countries to announce this month: Sweden is host to the first R user group in Scandinavia. StockholmR has been holding meetings since September, and their next meeting on November 28 will be on Teaching R and Data visualization using R. In Taiwan, the Taipei-based Taiwan useR Group holds regular ...


Two Talks on Data Science, Big Data and R

23.10.2012

On Thursday next week (November 1), I'll be giving a new webinar on the topic of Big Data, Data Science and R. Titled “The Rise of Data Science in the Age of Big Data Analytics: Why Data Distillation and Machine Learning Aren’t Enough”, this is a provocative look at why data scientists cannot be replaced by technology, and why R is the idea...


Quick notes from Strata NYC 2012

24.10.2012

The O'Reilly Strata conferences are always great fun to attend, and this latest installment in New York City is no exception. This one is super-busy though; the conference has been sold out for weeks — and not just marketing-sold-out, it's fire-department-sold out. It's non-stop conversations and presentations, and it's tough to move through th...


Allstate compares SAS, Hadoop and R for Big-Data Insurance Models

25.10.2012

At the Strata conference in New York today, Steve Yun (Principal Predictive Modeler at Allstate's Research and Planning Center) described the various ways he tackled the problem of fitting a generalized linear model to 150M records of insurance data. He evaluated several approaches: Proc GENMOD in SAS; installing a Hadoop cluster; using open-sour...


R 2.15.2 now available

26.10.2012

As promised, the source distribution for R 2.15.2 is now available for download from the master CRAN repository. (Binary distributions for Windows, MacOS and Linux will be available from the CRAN mirror network in the coming days.) This latest point-update — codenamed “Trick or Treat” — improves the performance of the R engine and adds a ...


Tracking Hurricane Sandy with Open Data and R

29.10.2012

Hurricane Sandy is shaping up to be a major, and very dangerous, meteorological event for the US East Coast. Naturally, everyone is looking for the latest information and forecasts. Fortunately, the wealth of public meteorological data available on the open web, combined with real-time on-the-ground updates via social media, means that an ecosy...
