Publications by David Smith
Watch presentations from R/Finance 2017
It was another great year for the R/Finance conference, held earlier this month in Chicago. This is normally a fairly private affair: with attendance capped at around 300 people every year, it's a somewhat exclusive gathering of the best and brightest minds from industry and academia in financial data analysis with R. But for the first time this ...
Python and R top 2017 KDnuggets rankings
The results of KDnuggets' 18th annual poll of data science software usage are in, and for the first time in three years Python has edged out R as the most popular software. While R increased its share of usage from 45.7% in last year's poll to 52.1% this year, Python's usage among data scientists increased even more, from 36.6% of users in 2016 ...
Teach kids about R with Minecraft
As I mentioned earlier this week, I was on a team at the ROpenSci Unconference (with Brooke Anderson, Karl Broman, Gergely Daróczi, and my Microsoft colleagues Mario Inchiosa and Ali Zaidi) to work on a project to interface the R language with Minecraft. The resulting R package, miner, is now available to install from GitHub. The goal of t...
Powe[R] BI: Free e-book on using R with Power BI
A new (and free!) e-book on extending the capabilities of Power BI with R is now available for download, from analytics consultancy BlueGranite. The introduction to the book explains why R and Power BI are a great match together: As a specialized, open source statistical environment, R represents the primary analysis language for a large numbe...
In case you missed it: May 2017 roundup
In case you missed them, here are some articles from May of particular interest to R users. Many interesting presentations recorded at the R/Finance 2017 conference in Chicago are now available to watch. A review of some of the R packages and projects implemented at the 2017 ROpenSci Unconference. An example of applying Bayesian Learning with the...
How to create dot-density maps in R
Choropleths are a common approach to visualizing data on geographic maps. But choropleths — by design or necessity — aggregate individual data points into a single geographic region (like a country or census tract), which is all shaded a single colour. This can introduce interpretability issues (are we seeing changes in the variable of intere...
Run massive parallel R jobs cheaply with updated doAzureParallel package
At the EARL conference in San Francisco this week, JS Tan from Microsoft gave an update (PDF slides here) on the doAzureParallel package. As we've noted here before, this package allows you to easily distribute parallel R computations to an Azure cluster. The package was recently updated to support using automatically scaling Azure Batch clu...
Schedule for useR!2017 now available
The full schedule of talks for useR!2017, the global R user conference, has now been posted. The conference will feature 16 tutorials, 6 keynotes, 141 full talks, and 86 lightning talks starting on July 5 in Brussels. That's a lot to fit into 4 days, but I'm especially looking forward to the keynote presentations: 20 years of CRAN (Uwe Ligges) ...
Interfacing with APIs using R: the basics
While R (and its package ecosystem) provides a wealth of functions for querying and analyzing data, in our cloud-enabled world there's now a plethora of online services with APIs you can use to augment R's capabilities. Many of these APIs use a RESTful interface, which means you will typically send/receive data encoded in the JSON format using HT...
Syberia: A development framework for R code in production
Putting R code into production generally involves orchestrating the execution of a series of R scripts. Even if much of the application logic is encoded into R packages, a run-time environment typically involves scripts to ingest and prepare data, run the application logic, validate the results, and operationalize the output. Managing those scrip...