Publications by David Smith
Webinar and free e-book on data preparation with R
Just a quick heads up that Nina Zumel, co-founder and principal consultant at Win-Vector LLC, will be presenting a webinar at 10AM Pacific Time on Thursday March 17, Data Preparation Techniques with R. Nina is the co-author of Practical Data Science with R and blogs frequently at the Win-Vector blog (and contributes the occasional guest blog her...
Creating a March Madness bracket with Machine Learning
March Madness is upon us here in the US. This annual college basketball competition pits 64 teams in a single-elimination tournament, and the team that goes undefeated for all 6 rounds will be named NCAA Champion. Predicting the winners of the competition, and in particular completing a “bracket” of the teams you predict to make it to the fin...
R Consortium announces new grants for R projects and working groups
Five months ago, the R Consortium asked the R Community to propose projects to benefit R users and the R project. Today, the R Consortium announced that it has awarded grants to fund seven of those projects: a unified framework for distributed computing with R; an improved database interface; a one-day workshop to unite R language developers, i...
Introductions to R and predictive analytics
If you're new to the concept of predictive models, or just want to review the background on how data scientists learn from past data to predict the future, you may be interested in my talk from the Data Insights Summit, Introduction to Real-Time Predictive Modeling. In the talk above I gave a brief introduction to the R language and mentioned se...
About those weird things in R…
There's no denying that R, for all its popularity, has more than its fair share of quirks. If you've ever wondered why, for example, R has a non-standard assignment operator, or why periods are allowed in symbols (and don't signify method calls), or why character data imports as factors (not strings) by default, then this blog post by ...
Two fun plots with R
Data visualization with R doesn't always have to be serious. Here are a couple of fun charts created recently by R users. First, here's a minimalist rendition of the characters in The Simpsons, by an anonymous blogger: And from Alex Whan, here's a near-perfect recreation of the classic cover of the Joy Division album Unknown Pleasures, based on ...
Help improve treatment for brain injuries using machine learning and R
The field of neuroscience — the study of brains and the nervous system — has taken some major leaps in recent years. Scientists can now gather real-time electrical activity from the brain during actions and thoughts, which is helping to pinpoint the exact location of brain lesions caused by strokes, and is leading to promising treatments for ...
Airbnb uses R to scale data science
Airbnb, the property-rental marketplace that helps you find a place to stay when you're travelling, uses R to scale data science. Airbnb is a famously data-driven company, and has recently gone through a period of rapid growth. To accommodate the influx of data scientists (80% of whom are proficient in R, and 64% use R as their primary data an...
In case you missed it: February 2016 roundup
In case you missed them, here are some articles from February of particular interest to R users. Reviews of new CRAN packages RtutoR, lavaan.shiny, dCovTS, glmmsr, GLMMRR, MultivariateRandomForest, genie, kmlShape, deepboost and rEDM. You can now create and host Jupyter notebooks based on R, for free, in Azure ML Studio. Calculating learning ...
The FBI’s aerial surveillance program, visualized with R
BuzzFeed's Peter Aldhous and Charles Seife broke a major news story last week: the US Federal Bureau of Investigation and Department of Homeland Security operate more than 200 small aircraft (mainly Cessnas and some helicopters) which routinely circle various sites near US cities, presumably to gather data with onboard cameras and electronic equi...