Publications by David Smith

Sentiment analysis of Trump’s tweets with R

18.08.2016

Data Scientist David Robinson caused a bit of a stir in the media when he analyzed Donald Trump's tweets and revealed that those sent from an Android device were likely sent by the candidate himself, while those sent from an iPhone were likely sent by campaign staffers. The difference? As seen in the chart below, Android-based tweets used angrier...

2189 sym 4 img

Five problems (and one solution) with dual-axis time series plots

19.08.2016

If you need to present two time series spanning the same period, but in wildly different scales, it's tempting to use a time series chart with two separate vertical axes, one for each series, like this one from the Reserve Bank of New Zealand. Charts like this typically have one or more crossover points, and that crossing imparts meaning to the...

3720 sym 4 img

Five great charts in 5 lines of R code each

22.08.2016

Sharon Machlis is a journalist with Computerworld, and to show other journalists how great R is for data visualization she shows them these five data visualizations, each of which can be created in 5 lines of R code or less. I’ve reproduced Sharon’s code and charts below. I did make a couple of tweaks to the code, though. I added a call to c...

1322 sym 10 img

Edward Tufte Keynote Presenter at Data Science Summit, Sep 26-27

23.08.2016

I'm excited to share that one of my data science heroes will be a presenter at the Microsoft Data Science Summit in Atlanta, September 26-27. Edward Tufte, the data visualization pioneer, will deliver a keynote address on the future of data analysis and how to make more credible conclusions based on data. If you're not familiar with Tufte, a ...

2929 sym 2 img

R with Power BI: Import, Transform, Visualize and Share

25.08.2016

Power BI, Microsoft's data visualization and reporting platform, has made great strides in the past year integrating the R language. This Computerworld article describes the recent advances with Power BI and R. In short, you can: import data into Power BI by using an R script; cleanse and transform other data sources coming into Power BI using R ...

1759 sym 2 img

Microsoft R Open 3.3.1 now available for Windows, Mac and Linux

26.08.2016

Microsoft R Open 3.3.1, our enhanced distribution of open source R, is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to version 3.3.1, streamlines the installation process, and bundles some additional packages for parallel programming. R version 3.3.1 fixes a few rarely-encountered bugs, for ...

2358 sym 2 img

Video series: Introduction to Microsoft R Server

29.08.2016

Microsoft R Server extends the base R language and Microsoft R Open with big-data capabilities. Specifically, it adds the RevoScaleR package, which creates an out-of-memory “XDF” data structure (so you can process data larger than available RAM), and algorithms that allow you to perform computations on such data using parallel and distribute...

1924 sym

The elements of scaling R-based applications with DeployR

08.09.2016

If you want to build an application using R that serves many users simultaneously, you're going to need to be able to run a lot of R sessions simultaneously. If you want R to run in the cloud, you can publish R functions as a Web service (and you can do this directly from R with the azureML package). But if you want to run R on your own server, ...

2244 sym 2 img

In case you missed it: August 2016 roundup

09.09.2016

In case you missed them, here are some articles from August of particular interest to R users. An amusing short video extols the benefits of reproducible research with R. A guide to implementing a churn model for mobile phone customers with Microsoft R Services. Computerworld's Sharon Machlis presents 5 data visualizations each using 5 lines o...

2525 sym

Volunteer to help improve R’s documentation

12.09.2016

The R Consortium, in its most recent funding round, awarded a grant of $10,000 to The R Documentation Task Force, whose mission is to design and build the next generation R documentation system. (Microsoft is a Platinum Member of the R Consortium.) The task force has the support and participation of R Core members Duncan Murdoch, Michael Lawre...

2046 sym