Publications by David Smith
Free guide to text mining with R
Julia Silge and David Robinson are both dab hands at using R to analyze text, from tracking the happiness (or otherwise) of Jane Austen characters, to identifying whether Trump's tweets came from him or a staffer. If you too would like to be able to make statistical sense of masses of (possibly messy) text data, check out their book Tidy Tidy Tex...
2269 sym 4 img
Upcoming R Conferences
Since a few new events have been announced recently, I thought I'd give a run-down on some major R conferences coming up in the next six months. February 18: satRdays, Cape Town (South Africa). This is the second in a series of one-day conferences inspired by an R Consortium proposal. The first event in Budapest was a great success, and the line...
2692 sym
Building a machine learning model with the MicrosoftML package
Microsoft R Server 9 includes a new R package for machine learning: MicrosoftML. (So do the Data Science Virtual Machine and the free Microsoft R Client edition, incidentally.) This package includes a suite of fast predictive modeling functions implemented by Microsoft Research, including: Linear (rxFastLinear) and logistic (rxLogisticRegressi...
2597 sym 2 img
New Zealand bank replaces SAS server with R Server
Heartland Bank, a rapidly growing bank in New Zealand, has adopted a data-driven approach to analyzing risk, evaluating credit lines, and understanding cash flows. But they found their legacy SAS system to be labor-intensive and time-consuming when it came to updating financial models, and it was expensive to boot. (Being licensed on a per-user ...
2131 sym
Kung Fu R
A great way to hone your skills as a data scientist is to pick a topic you're passionate about, find some data related to it, and analyze the heck out of it. Jim Vallandingham is clearly passionate about old Kung Fu movies, particularly those from the Shaw Brothers Studio, and has used R to analyze data on the studio's oeuvre: 260 films ov...
1688 sym 2 img
CRAN now has 10,000 R packages. Here’s how to find the ones you need.
CRAN, the global repository of open-source packages that extend the capabilities of R, reached a milestone today. There are now more than 10,000 R packages available for download*. (Incidentally, that count doesn't even include all the R packages out there. There are also another 1294 packages for genomic analysis in the Bioconductor repository...
3891 sym 2 img
List of R conferences and user groups (2017-01-30)
For 8 years now, we've maintained a list of local R user groups here at the Revolutions blog. This is a list that began with a single group (the Bay Area RUG, the first and still one of the largest groups), and now includes 360 user groups worldwide (including 27 specifically for women). As the list has grown in size, it's become harder to manage...
1793 sym
Data Science Virtual Machine updated, now includes RStudio, JuliaPro
The Windows edition of the Data Science Virtual Machine (DSVM) was recently updated on the Azure Marketplace. This update upgrades some existing components and adds some new ones as well. You now have your choice of integrated development environment to use with R. RStudio Desktop is now included in the Data Science Virtual Machine image — n...
2146 sym 2 img
A look back at the year in R and Microsoft
Thomas Dinsmore's ML/DL blog recently concluded a look back on significant advancements in data science, machine learning and deep learning, many of which involved R and/or Microsoft. Here are those highlights (reproduced with permission): The R Project: R and Python maintained their leadership as primary tools for open data science. The Pytho...
6872 sym
fst: Fast serialization of R data frames
If you want to get data out of R and into another application or system, simply copying the data as it resides in memory generally isn't an option. Instead you have to serialize the data (into a file, usually), which the other application can then deserialize to recreate the original data. R has several options to serialize data frames: You can...
3021 sym 2 img