Publications by Andrie de Vries

Using survival models for marketing attribution

23.07.2013

By Andrie de Vries. Prior to joining Revolution Analytics in March this year, I spent several years in the field of market research and survey analytics. During this period, I spent a few months consulting for a digital marketing agency based in London. My role was to help develop their capability in building customer surveys and integrating these ...

3389 sym

Reading data from the new version of Google Spreadsheets

03.06.2014

Spreadsheets remain an important way for people to share and work with data. Among other providers, Google offers the ability to create online spreadsheets and other documents. Back in 2009, David Smith posted a blog entry on how to use R, and specifically the XML package, to import data from a Google Spreadsheet. Once you marked your Google...

2335 sym R (2917 sym/5 pcs) 2 img

Dependencies of popular R packages

08.07.2014

With the growing popularity of R, there is an associated increase in the popularity of online forums for asking questions. One of the most popular sites is StackOverflow, where more than 60 thousand questions have been asked and tagged as related to R. On the same page, you can also find related tags. Among the top 15 tags associated with R, sever...

2126 sym R (2525 sym/3 pcs) 2 img

Revisiting package dependencies

29.07.2014

By Andrie de Vries. In my previous post I wrote about how to identify and visualize package dependencies. Within hours, Duncan Murdoch (member of R-core) identified some discrepancies between my list of dependencies and the visualisation. Since then, I have fixed the discrepancies. In this blog post I attempt to clarify the issues involved in lis...

3383 sym R (2498 sym/3 pcs) 4 img

Introducing miniCRAN: an R package to create a private CRAN repository

03.10.2014

By Andrie de Vries. One of the reasons that R is so popular is the CRAN archive of useful packages. However, with more than 5,900 packages on CRAN, many organisations need to maintain a private mirror of CRAN with only a subset of packages that are relevant to them. The package miniCRAN makes this possible by determining the dependency tree for a g...

2165 sym R (50 sym/1 pcs)

Introducing the Reproducible R Toolkit and the checkpoint package

13.10.2014

The ability to create reproducible research is an important topic for many users of R. It is so important that several groups in the R community have tackled this problem, notably packrat from RStudio and gRAN from Genentech (see our previous blog post). The Reproducible R Toolkit is a new open-source initiative from Revolution Analytics. It takes ...

3552 sym 6 img

How the MKL speeds up Revolution R Open

22.10.2014

By Andrie de Vries. Last week we announced the availability of Revolution R Open, an enhanced distribution of R. One of the enhancements is the inclusion of high-performance linear algebra libraries, specifically the Intel MKL. This library significantly speeds up many statistical calculations, e.g. the matrix algebra that forms the basis of man...

4505 sym R (24 sym/1 pcs) 4 img 1 tbl

An update to the checkpoint package

18.02.2015

By Andrie de Vries. In October 2014 we announced RRT (the Reproducible R Toolkit), which consists of the checkpoint package and MRAN. In January, David Smith followed up with another post about reproducibility using Revolution R Open. Since then, we've had several requests for new features and enhancements. The development code for checkpo...

3402 sym

Monitoring progress of a foreach parallel job

24.02.2015

By Andrie de Vries. R has strong support for parallel programming, both in base R and in additional CRAN packages. For example, we have previously written about foreach and parallel programming in the articles Tutorial: Parallel programming with foreach and Intro to Parallel Random Number Generation with RevoScaleR. The foreach package provides simpl...

5711 sym 2 img

Creating progress bars with foreach parallel processing

10.03.2015

By Andrie de Vries. In my previous post, I demonstrated how to get the status of running jobs on a parallel back end. However, I stopped short of actually demonstrating progress bars. In this post I demonstrate how to do this. The StackOverflow question How do you create a progress bar when using the “foreach()” function in R? asks the qu...

1996 sym 2 img