Publications by David Smith
AirBnB grows by sharing data scientist knowledge
This animation of AirBnB host locations from 2011-2014, presented by Ricardo Bion (data scientist manager at AirBnB) at the EARL Boston conference earlier this week, shows the dramatic growth in properties to rent through the service along with the most common routes of travellers. (You can find the R code that created this animation here.) How ...
A computer vision challenge: finding boats in the Mona Lisa
About a decade or so ago, photomosaics were all the rage: a near-recreation of a famous image by using many smaller images as elements. Here, for example, is the Mona Lisa, created using the Metapixel program by overlaying 32×32 images of vehicles and animals. An image like this presents an interesting computer vision challenge: can you use deep ...
In case you missed it: October 2016 roundup
In case you missed them, here are some articles from October of particular interest to R users. A brief summary of the R 3.3.2 release. “Data Science with SQL Server 2016”, a free E-book featuring several in-depth R examples, is now available for download. The ReporteRs package makes it easy to insert R output, tables and graphics into Wor...
How to call Cognitive Services APIs with R
Microsoft Cognitive Services is a set of cloud-based machine-intelligence APIs that you can use to extract structured data from complex sources (unstructured text, images, video and audio), and add “AI” type features to applications. A good example is the “Seeing AI” glasses in the video below: the image descriptions, emotion inference, a...
Notable New and Updated R packages (to October 2016)
As we prepare for the upcoming release of Microsoft R Open, I've been preparing the list of new and updated packages for the spotlights page. This involves scanning the CRANberries feed (with gracious thanks to Dirk Eddelbuettel) for newly-released packages and significant updates to existing ones. This is a lot of data to process. For context, ...
The 5 most popular R packages
The good folks at DataCamp track activity related to R packages on the RDocumentation.org Trends page. As of this writing, it tracks statistics on 11,768 packages (distributed across CRAN, Bioconductor and GitHub) comprising over 1.7 million R functions in total. On that page, you can find current rankings on the most downloaded R packages, the m...
Tutorial: Build a live rental prediction service with SQL Server R Services
A great way to learn is by doing, so if you've been thinking about how to enable R-based computations within SQL Server, a new tutorial will take you through all the steps of building an intelligent application. In a few simple steps, you'll set up all the necessary software and code to build a live service that predicts demand for a ski rental s...
Happy Thanksgiving! (2016)
It's Thanksgiving day here in the US, so we're taking the rest of the week off to reflect on what we're thankful for. And even if you're not in the US, today is a great day to send thanks to the R Core Group for providing their dedication, time, and expertise to make the R Project what it is today. (Sadly, cowsay doesn't feature a Thanksgivin...
A heat map of Divvy bike riders in Chicago
Chicago's a great city for a bike-sharing service. It's pretty flat, and there are lots of wide roads with cycle lanes. I love Divvy and use it all the time. Not so much in the winter though: it gets very cold here. Nonetheless, this heat map of Divvy riders, created in R by Austin Wehrwein, reveals a hardcore set of riders that use the service ...
Free online course: Analyzing big data with Microsoft R Server
If you're already familiar with R, but struggling with out-of-memory or performance problems when attempting to analyze large data sets, you might want to check out this new edX course, Analyzing Big Data with Microsoft R Server, presented by my colleague Seth Mottaghinejad. In the course, you'll learn how to build models using the RevoScaleR pa...