Publications by David Smith

In case you missed it: April 2016 roundup

09.05.2016

In case you missed them, here are some articles from April of particular interest to R users. Lukasz Piwek recreates classic graphs from Tufte's 'The Visual Display of Quantitative Information' in R. A preview of upcoming R conferences in Europe. Andrie de Vries updates the data on R package growth on CRAN, and finds a segmented regression mo...

2736 sym

Free e-book: Effective Graphs with Microsoft R Open

11.05.2016

R is a powerful system for creating data visualizations. In fact, R gives you so many options for creating charts that it can be hard to know the best way to communicate effectively. To help you present your data as effectively as possible using R, there's a new (and free) e-book available to download: Effective Graphs with Microsoft R Open. Wri...

2259 sym 4 img

What’s in Pasta Carbonara?

13.05.2016

Apparently, people have strong feelings about how pasta carbonara should be made. A 45-second French video showing a one-pot preparation of the dish with farfalle instead of spaghetti and substituting crème fraîche for most of the cheese — and not even stirring the egg into the pasta to cook it! and it was just the yolk! — caused outrag...

2202 sym 2 img

Documentation for Microsoft R Server now online

16.05.2016

If you've been thinking about trying the big-data capabilities of Microsoft R Server but wanted to check out the documentation first, you're in luck: the complete Microsoft R Server documentation is now available on MSDN (and is accessible to anyone). There's lots to explore here, but a few highlights you might want to check out include: Gettin...

1487 sym 2 img

Spark 2.0: more performance, more statistical models

18.05.2016

Apache Spark, the open-source cluster computing framework, will soon see a major update with the upcoming release of Spark 2.0. This update promises to be faster than Spark 1.6, thanks to a run-time compiler that generates optimized bytecode. It also promises to be easier for developers to use, with streamlined APIs and a more complete SQL imple...

3437 sym R (128 sym/1 pcs) 2 img

Microsoft R Open 3.2.5 now available

20.05.2016

Microsoft R Open 3.2.5 is now available for download. There are no changes of note in the R language engine with this release (R 3.2.5 was largely just a version number increment). There's lots new on the packages front though: Microsoft R Open 3.2.5 has a default CRAN snapshot date of May 1, 2016 and there were plenty of updates on CRAN in the ...

1361 sym 2 img

Feather: fast, interoperable data import/export for R

23.05.2016

Unlike most other statistical software packages, R doesn't have a native data file format. You can certainly import and export data in any number of formats, but there's no native “R data file format”. The closest equivalent is the saveRDS/readRDS function pair, which allows you to serialize an R object to a file and then load it back into a...

3365 sym 4 img

Predictive Maintenance for Aircraft Engines

25.05.2016

Recently, I wrote about how it's possible to use predictive models to predict when an aircraft engine will require maintenance, and use that prediction to avoid unpleasant (and expensive!) delays for passengers on the ground. Planes generate a lot of data that can be used to make such predictions: today’s engines have hundreds of sensors and ...

2499 sym 2 img

An object has no name

27.05.2016

No, it's not a Jaqen H'ghar quote. Recently, Hadley Wickham tweeted the following image: While this image isn't included in Hadley's Advanced R book, he does discuss many of the implications there. The most significant of these is that creating a copy of an object in R doesn't consume any additional memory. (Most of the time, anyway: there ar...

2660 sym 2 img

Visualizing a flood with R

03.06.2016

As more settlements in Texas and France are impacted by severe flooding, this is a good time to thank the hydrologists at NOAA who forecast river level rises in advance and give residents in affected areas time to move to higher ground. Along with topographic, rainfall, and weather data, monitoring stations maintained by NOAA and the USGS al...

1574 sym 2 img