Publications by David Smith

Studying disease with R: RECON, The R Epidemics Consortium

14.06.2017

For almost a year now, a collection of researchers from around the world has been collaborating to develop the next generation of analysis tools for disease outbreak response using R. The R Epidemics Consortium (RECON) creates R packages for handling, visualizing, and analyzing outbreak data using cutting-edge statistical methods, along with ge...

Demo: Real-Time Predictions with Microsoft R Server

15.06.2017

At the R/Finance conference last month, I demonstrated how to operationalize models developed in Microsoft R Server as web services using the mrsdeploy package. Then, I used that deployed model to generate predictions for loan delinquency, using a Python script as the client. (You can see slides here, and a video of the presentation below.) With...

Applications of R at EARL San Francisco 2017

16.06.2017

The Mango team held their first instance of the EARL conference series in San Francisco last month, and it was a fantastic showcase of real-world applications of R. This was a smaller version of the EARL conferences in London and Boston, but with that came the opportunity to interact with R users from industry in a more intimate setting. Hopefull...

Using sparklyr with Microsoft R Server

19.06.2017

The sparklyr package (by RStudio) provides a high-level interface between R and Apache Spark. Among many other things, it allows you to filter and aggregate data in Spark using the dplyr syntax. In Microsoft R Server 9.1, you can now connect to a Spark session using the sparklyr package as the interface, allowing you to combine the data-prepara...

R leads, Python gains in 2017 Burtch Works Survey

20.06.2017

For the past four years, recruiting firm Burtch Works has conducted a simple survey of data scientists with just one question: “Which do you prefer to use — SAS, R or Python”. The results for this year's survey of 1,046 respondents are in: R: 40% (2016: 42%); SAS: 34% (2016: 39%); Python: 26% (2016: 20%). Compared to last year's results, Pyt...

Updated Data Science Virtual Machine for Windows: GPU-enabled with Docker support

21.06.2017

The Windows edition of the Data Science Virtual Machine (DSVM), the all-in-one virtual machine image with a wide collection of open-source and Microsoft data science tools, has been updated to the Windows Server 2016 platform. This update brings built-in support for Docker containers and GPU-based deep learning. GPU-based Deep Learning. While...

Interactive R visuals in Power BI

22.06.2017

Power BI has long had the capability to include custom R charts in dashboards and reports. But in sharp contrast to standard Power BI visuals, these R charts were static. While R charts would update when the report data was refreshed or filtered, it wasn't possible to interact with an R chart on the screen (to display tool-tips, for example). But...

The R community is one of R’s best features

23.06.2017

R is incredible software for statistics and data science. But while the bits and bytes of software are an essential component of its usefulness, software needs a community to be successful. And that's an area where R really shines, as Shannon Ellis explains in this lovely rOpenSci blog post. For software, a thriving community offers developers, e...

Useful tricks when including images in Rmarkdown documents

26.06.2017

Rmarkdown is an enormously useful system for combining text, output and graphics generated by R into a single document. Images, in particular, are a powerful means of communication in a report, whether they be data visualizations, diagrams, or pictures. To maximize the power of those images, Zev Ross has created a comprehensive list of tips and t...

How R is used by the FDA for regulatory compliance

29.06.2017

I was recently alerted (thanks Maëlle and Mikhail!) to an enlightening presentation from last year's useR! conference. (This year's useR! conference takes place next week in Belgium.) Paul H Schuette, Scientific Computing Coordinator at the FDA Center for Drug Evaluation and Research (CDER), talked about how R is used in the process of regulatin...
