Publications by David Smith

Video: Applied Predictive Modeling with R

01.02.2016

There's more to Iowa than just today's presidential primary. Last month, the Central Iowa R User Group hosted Dr. Max Kuhn, Director of Non-Clinical Statistics at Pfizer Global R&D, via video-chat to present on Applied Predictive Modeling with R. Max is the co-author of the excellent book Applied Predictive Modeling (read our review here), and...

1817 sym 2 img

Mapping the world’s longest plane flights

03.02.2016

If you're one of those people who dread long plane flights, this map by Matt Strimas-Mackey will help you find routes to avoid. It shows Wikipedia's list of the top 30 scheduled commercial flights by distance (with code-share duplicates removed), represented as a map showing the routes colour-coded by the time spent in the air. Don't be distr...

2179 sym 2 img

Introducing Microsoft R Open: Replay and slides

05.02.2016

We had a fantastic turnout for last week's webinar, Introduction to Microsoft R Open. If you missed it, you can watch the replay below. In the talk, I give some background on the R language and its applications, describe the performance and reproducibility benefits of Microsoft R Open, and give a demonstration of the basics of the R language alon...

1017 sym

Tutorial: Credit Card Fraud Detection with SQL Server 2016 R Services

08.02.2016

If you have a database of credit-card transactions with a small percentage tagged as fraudulent, how can you create a process that automatically flags likely fraudulent transactions in the future? That's the premise behind the latest Data Science Deep Dive on MSDN. This tutorial provides a step-by-step guide to using the R language and the big-data st...

1899 sym 2 img

In case you missed it: January 2016 roundup

10.02.2016

In case you missed them, here are some articles from January of particular interest to R users. Animated visualizations and analysis of data from NYC's municipal bike program, created with R. Many local R user groups are sharing materials from meetups using GitHub. A detailed R tutorial on analyzing your Twitter archive and performing sentimen...

2270 sym

You can now extend RStudio with add-ins

12.02.2016

The latest update to RStudio, the cross-platform open-source integrated development environment for the R language from the team at RStudio, adds many new features for R developers. But perhaps the most significant update is one which allows R developers to add their own new features to RStudio: add-ins.  RStudio Add-ins appear under the new “...

2116 sym 2 img

Using Microsoft R Server to Address Scalability Issues in R

15.02.2016

If you missed the recent webinar presented by Derek Norton, Using Microsoft R Server to Address Scalability Issues in R, you can now catch up with the replay below. In the webinar, Derek compares Microsoft R Open and Microsoft R Server, and demonstrates using Microsoft R Server to model a 40-million-row data file using logistic regression. (The...

1195 sym

New syntax proposed for R language

17.02.2016

R user and developer Lionel Henry proposes a number of changes to R syntax: Use square brackets to create lists. You could use [1, 2:5, "hello"] to create a list of three elements. Nested lists would be possible as well, with syntax like [ [1, 2], [2, 3] ] (much easier than list(list(1,2),list(2,3))). Define lambda functions with square br...

2392 sym

Microsoft R Open 3.2.3 update

19.02.2016

We've released a minor update to Microsoft R Open 3.2.3 to address issues that some people were experiencing. The update, available now on MRAN, fixes the following issues: The Windows R GUI (RGUI.exe) could crash when typing beyond the bounds of the visible window. The "R" command wasn't available from the Terminal in OS X El Capitan. TCL/TK...

1177 sym

Latest RedMonk and TIOBE language rankings for R

22.02.2016

Analyst firm RedMonk have updated their (near-)biannual Programming Language rankings as of January 2016, and the R language ranks at #13, unchanged since the last ranking in June 2015. RedMonk's rankings are based on number of projects in GitHub and number of questions tagged in StackOverflow, and the most recent data is visualized (using R's ggp...

1322 sym 2 img