Publications by David Smith
Microsoft R Open 3.3.2 now available
Microsoft R Open 3.3.2, Microsoft's enhanced distribution of open source R, is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to version 3.3.2, adds new bundled packages and updates others, and upgrades the Intel Math Kernel Libraries. The updated R 3.3.2 engine includes some performance improv...
2397 sym
Stylometry: Identifying authors of texts using R
Few people expect politicians to write every word they utter themselves; reliance on speechwriters and spokespersons is a long-established political practice. Still, it's interesting to know which statements are truly the politician's own words, and which are driven primarily by advisors or influencers. Recently, David Robinson established a way ...
2123 sym 2 img
In case you missed it: November 2016 roundup
In case you missed them, here are some articles from November of particular interest to R users. Microsoft R Open 3.3.2, based on R 3.3.2, has been released for Windows, Mac and Linux. A new, free course on EdX focuses on the big-data extensions of Microsoft R Server. Using ggplot2 to create a calendar heat map of city bike usage in Chicago. A ...
2295 sym
Microsoft R Server 9.0 now available
Microsoft R Server 9.0, Microsoft's R distribution with added big-data, in-database, and integration capabilities, was released today and is now available for download to MSDN subscribers. This latest release is built on Microsoft R Open 3.3.2, and adds new machine-learning capabilities, new ways to integrate R into applications, and additional ...
3380 sym 2 img
R Consortium Projects Update
The R Consortium has already funded 8 projects (and 3 more just in July) proposed by the R community, and the call for proposals for yet more projects is now open. If you have an idea for a project that would advance R or the R Community, get your submission in by February 10, 2017. Meanwhile, the already-funded projects are making good progres...
2358 sym 2 img
The Value of R’s Open Source Ecosystem
I was thrilled to be invited to speak at the Monktoberfest conference, held this past October in Portland, Maine. Not only have I been a great fan of the analysis from the Redmonk team for many years, I'd heard that it was one of the most interesting and diverse tech conferences around. (Also, beer.) And indeed, it turned out to be all of those...
1651 sym
How housing prices have increased around the world
Len Kiefer, Deputy Chief Economist at Freddie Mac, recently posted an analysis of global housing price trends based on the international house price database (from the Dallas Fed). Using those data, Kiefer compared housing price increases (and in a couple of places like Spain and Ireland, decreases) across 24 countries. He also looks at ...
2065 sym 2 img
Visualizing taxi trips between NYC neighborhoods with Spark and Microsoft R Server
by Ali Zaidi, Data Scientist at Microsoft. In a previous post, we showcased the use of the sparklyr package for manipulating large datasets using a familiar dplyr syntax on top of Spark HDInsight Clusters. In this post, we will take a look at the RxSpark API for R, part of the RevoScaleR package and the Microsoft R Server distribution of R on HDInsi...
11641 sym R (26567 sym/19 pcs) 6 img
One Page R: A Survival Guide to Data Science with R
If you're looking to get started with data science in R, a great place to start is OnePageR by Graham Williams. (Graham is the creator of Rattle, author of Data Mining with Rattle and R, and Director of Data Science at Microsoft.) This free (CC-licensed) resource is a series of hands-on mini-chapters and associated R code, organized into four mai...
1595 sym
How the State of Indiana uses R and Azure to forecast employment
“Big Data” generates a lot of news these days, but sometimes small data still means big computation. Indiana's Department of Workforce Development has the responsibility to forecast future employment rates in the State of Indiana. And not just the number of jobs available: the department also needs to forecast the types of jobs that will be a...
3860 sym 4 img