Publications by Andrie de Vries

Parameters and percentiles (the gamma distribution)

07.10.2015

by Andrie de Vries. In a 2010 blog post (Parameters and Percentiles), John D. Cook poses the following problem: "The doctor says 10% of patients respond within 30 days of treatment and 80% respond within 90 days of treatment. Now go turn that into a probability distribution." That’s a common task in Bayesian statistics, capturing ...

2785 sym 4 img
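The problem quoted above can be sketched numerically. The following is a minimal illustration, not the method used in the post: it assumes a gamma distribution (as the title suggests) and uses only the Python standard library. The function names `gamma_cdf`, `gamma_quantile` and `fit_gamma` are hypothetical helpers introduced here. The approach relies on the ratio of two quantiles depending only on the shape parameter and shrinking as the shape grows, so the shape can be found by bisection and the scale recovered afterwards.

```python
import math


def gamma_cdf(x, shape, scale=1.0):
    """Gamma CDF, computed from the power series of the regularized
    lower incomplete gamma function P(shape, x / scale)."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (shape + n)
        total += term
        n += 1
    log_prefactor = shape * math.log(z) - z - math.lgamma(shape)
    return total * math.exp(log_prefactor)


def gamma_quantile(p, shape):
    """Quantile of a unit-scale gamma distribution, by bisection on the CDF."""
    lo, hi = 0.0, max(1.0, shape)
    while gamma_cdf(hi, shape) < p:   # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_gamma(x1, p1, x2, p2, shape_lo=0.5, shape_hi=50.0):
    """Find (shape, scale) such that CDF(x1) = p1 and CDF(x2) = p2.

    Assumes x1 < x2, p1 < p2, and that the solution's shape lies inside
    [shape_lo, shape_hi] (an arbitrary bracket chosen for this sketch).
    """
    target = x2 / x1

    def ratio(shape):
        # The scale cancels, so this ratio depends on the shape alone.
        return gamma_quantile(p2, shape) / gamma_quantile(p1, shape)

    if not ratio(shape_hi) < target < ratio(shape_lo):
        raise ValueError("quantile ratio is outside the bracketed shape range")
    lo, hi = shape_lo, shape_hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > target:   # quantiles too spread out: increase the shape
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = x1 / gamma_quantile(p1, shape)
    return shape, scale


# 10% respond within 30 days, 80% respond within 90 days.
shape, scale = fit_gamma(30, 0.10, 90, 0.80)
print(f"shape = {shape:.3f}, scale = {scale:.3f}")
print(f"P(T <= 30) = {gamma_cdf(30, shape, scale):.3f}")
print(f"P(T <= 90) = {gamma_cdf(90, shape, scale):.3f}")
```

Reducing the two-parameter fit to a one-dimensional search over the shape keeps the sketch free of external optimizers. A general-purpose root finder or optimizer over both parameters would work equally well.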

A preview of using Revolution R Enterprise inside SQL Server

14.10.2015

by Andrie de Vries. The second week of SQLRelay (#SQLRelay) kicked off in London earlier this week. SQLRelay is a series of conferences spanning 10 cities in the United Kingdom over two weeks. The London agenda included four streams, with tracks for DBA, BI and analytics users, as well as a workshop track with two separate tutorials. ...

2180 sym

Updates to the foreach package and its friends

21.10.2015

by Andrie de Vries. Earlier this month Rich Calaway, programme manager at Microsoft and maintainer of the foreach package, published some updates to the foreach suite of packages, including foreach, iterators, doMC, doParallel and doSNOW. Most of the changes were cosmetic, or made to conform to CRAN policy. However, the last two packages (doParallel and doS...

2221 sym 2 img

Edge cases in using the Intel MKL and parallel programming

26.10.2015

by Andrie de Vries. Recently we had a question on the public mailing list for Revolution R Open (RRO), on the topic of “MKL multithreaded library and mclapply do not play well together”. If you're not familiar with these topics, here is a quick primer: The Intel MKL is a fast, multi-threaded math library. We bundle the MKL with RRO. The prim...

3742 sym 2 img

Using the wakefield package to easily generate reproducible sample data

05.11.2015

by Andrie de Vries. Back in 2011, I asked a question on StackOverflow: “How to make a great R reproducible example?”. This question attracted some great answers, including answers by Hadley Wickham and Joris Meys (co-author of R for Dummies). In June of this year Tyler Rinker added a new answer. Tyler published the wakefield package. In his...

2062 sym 2 img

Best practices for handling packages in R projects

11.11.2015

by Andrie de Vries. For much of my data science work, I want to have the very latest packages from CRAN or GitHub. However, once any work finds its way onto a production server (where it runs on a regular schedule), I want my environment to be stable. Most importantly, for these projects I want to ensure I have reproducible results. In these cases I...

5612 sym R (336 sym/2 pcs) 2 img

Enhancements to the AzureML package to connect R to AzureML Studio

18.11.2015

by Andrie de Vries. We have written on several occasions about AzureML, the Microsoft machine learning studio that is part of the Cortana Analytics suite: "Running R in the Azure ML cloud"; "Call R functions from any application with the AzureML package"; "Using miniCRAN in Azure ML". In September we announced that the AzureML package for R allows you to...

3226 sym R (905 sym/3 pcs)

How to store and use webservice keys and authentication details with R

25.11.2015

by Andrie de Vries (@RevoAndrie). I am frequently asked how to safely store login details and passwords for use by R without exposing these details in your script. Yesterday Jennifer Bryan asked this question on Twitter and a small storm of views and tweets erupted. Do we have any sort of consensus whether user’s API key...

4775 sym R (933 sym/6 pcs)

Setting up an Azure Resource Manager virtual machine with RStudio

02.12.2015

by Andrie de Vries. I am preparing to demonstrate R functionality at a conference next week. For maximum impact, I wanted to use a fast virtual machine in Azure. It is actually very easy to build a fresh machine (cloud or otherwise) that contains R as well as RStudio Server. In essence, all you have to do is: Stand up a minimal machi...

2870 sym 6 img

Securely storing your secrets in R code

16.12.2015

by Andrie de Vries. Last month I wrote about "How to store and use webservice keys and authentication details", a summary of the options mentioned in a Twitter discussion started by Jennifer Bryan. All of the options in my article really stored the secrets in plain text somewhere on your system, but in such a way as to minimize the risk of accidental...

4674 sym R (292 sym/1 pcs)