Publications by David Smith

Watch presentations from R/Finance 2017

31.05.2017

It was another great year for the R/Finance conference, held earlier this month in Chicago. This is normally a fairly private affair: with attendance capped at around 300 people every year, it's a somewhat exclusive gathering of the best and brightest minds from industry and academia in financial data analysis with R. But for the first time this ...

Python and R top 2017 KDnuggets rankings

01.06.2017

The results of KDnuggets' 18th annual poll of data science software usage are in, and for the first time in three years Python has edged out R as the most popular software. While R increased its share of usage from 45.7% in last year's poll to 52.1% this year, Python's usage among data scientists increased even more, from 36.6% of users in 2016 ...

Teach kids about R with Minecraft

02.06.2017

As I mentioned earlier this week, I was on a team at the ROpenSci Unconference (with Brooke Anderson, Karl Broman, Gergely Daróczi, and my Microsoft colleagues Mario Inchiosa and Ali Zaidi) to work on a project to interface the R language with Minecraft. The resulting R package, miner, is now available to install from GitHub. The goal of t...

Powe[R] BI: Free e-book on using R with Power BI

05.06.2017

A new (and free!) e-book on extending the capabilities of Power BI with R is now available for download, from analytics consultancy BlueGranite. The introduction to the book explains why R and Power BI are a great match together: As a specialized, open source statistical environment, R represents the primary analysis language for a large numbe...

In case you missed it: May 2017 roundup

06.06.2017

In case you missed them, here are some articles from May of particular interest to R users. Many interesting presentations recorded at the R/Finance 2017 conference in Chicago are now available to watch. A review of some of the R packages and projects implemented at the 2017 ROpenSci Unconference. An example of applying Bayesian Learning with the...

How to create dot-density maps in R

07.06.2017

Choropleths are a common approach to visualizing data on geographic maps. But choropleths — by design or necessity — aggregate individual data points into a single geographic region (like a country or census tract), which is all shaded a single colour. This can introduce interpretability issues (are we seeing changes in the variable of intere...

Run massive parallel R jobs cheaply with updated doAzureParallel package

08.06.2017

At the EARL conference in San Francisco this week, JS Tan from Microsoft gave an update on the doAzureParallel package. As we've noted here before, this package allows you to easily distribute parallel R computations to an Azure cluster. The package was recently updated to support using automatically scaling Azure Batch clu...

Schedule for useR!2017 now available

09.06.2017

The full schedule of talks for useR!2017, the global R user conference, has now been posted. The conference will feature 16 tutorials, 6 keynotes, 141 full talks, and 86 lightning talks starting on July 5 in Brussels. That's a lot to fit into 4 days, but I'm especially looking forward to the keynote presentations: 20 years of CRAN (Uwe Ligges) ...

Interfacing with APIs using R: the basics

12.06.2017

While R (and its package ecosystem) provides a wealth of functions for querying and analyzing data, in our cloud-enabled world there's now a plethora of online services with APIs you can use to augment R's capabilities. Many of these APIs use a RESTful interface, which means you will typically send/receive data encoded in the JSON format using HT...

Syberia: A development framework for R code in production

13.06.2017

Putting R code into production generally involves orchestrating the execution of a series of R scripts. Even if much of the application logic is encoded into R packages, a run-time environment typically involves scripts to ingest and prepare data, run the application logic, validate the results, and operationalize the output. Managing those scrip...
