Publications by David Smith
2016 Data Science Salary Survey results
O'Reilly has released the results of the 2016 Data Science Salary Survey. This survey is based on data from over 900 respondents to a 64-question survey about data-related tasks and tools, and the salaries respondents earn from doing or using them. The median salary reported in the survey was US$87,000; amongst data scientists in the US, the median salary wa...
Reflections on EARL London 2016
The Mango Solutions team have done it again: another excellent Effective Applications of R (EARL) conference just wrapped up here in London. The conference was attended by almost 400 R users from companies all around the world, and was a really fun experience. I was honored to deliver a keynote presentation, alongside keynotes from Joe Cheng and...
YaRrr! The Pirate’s (video) Guide to R
Today is Talk Like A Pirate Day, the perfect day to learn R, the programming language of pirates (arrr, matey!). If you have two-and-a-bit hours to spare, Nathaniel Phillips has created a video tutorial, YaRrr! The Pirate's Guide to R, which will take you through the basics: installation, basic R operations, and the matrix and data frame objects. ...
Linux Data Science Virtual Machine: new and upgraded tools
The Linux edition of the Data Science Virtual Machine on Microsoft Azure was recently upgraded. The Linux DSVM includes Microsoft R, Anaconda Python, Jupyter, CNTK and many other data science and machine learning tools, new or upgraded for this release. This eWeek story gives an overview of the improvements, but the highlights are: Microsoft R S...
Welcome to the Tidyverse
Hadley Wickham, co-author (with Garrett Grolemund) of R for Data Science and RStudio's Chief Scientist, has focused much of his R package development on the un-sexy but critically important part of the data science process: data management. In the Tidy Tools Manifesto, he proposes four basic principles for any computer interface for handling data...
Microsoft R at the EARL Conference
Slides have now been posted for many of the talks given at the recent Effective Applications of the R Language (London) conference, and I thought I'd highlight a few that featured Microsoft R. Chris Cole manages the deployment of R at Investec, supporting investment and risk teams worldwide. Despite some initial grumbling about having to use “o...
The Financial Times uses R for Quantitative Journalism
At the 2016 EARL London conference, senior data-visualisation journalist John Burn-Murdoch described how the Financial Times uses R to produce high-quality, striking data visualisations. Until recently, charts were the realm of an information designer using tools like Adobe Illustrator: the output was beautiful, but the process was a long and wi...
Using R to detect fraud at 1 million transactions per second
In Joseph Sirosh's keynote presentation at the Data Science Summit on Monday, Wee Hyong Tok demonstrated using R in SQL Server 2016 to detect fraud in real-time credit card transactions at a rate of 1 million transactions per second. The demo (which starts at the 17:00 mark) used a gradient-boosted tree model to predict the probability o...
Watch: Highlights of the Microsoft Data Science Summit
I just got back from Atlanta, the host of the Microsoft Machine Learning and Data Science Summit. This was the first year for this new conference, and it was a blast: the energy from the 1,000 attendees was palpable. I covered Joseph Sirosh's keynote presentation yesterday, but today I wanted to highlight a few other talks from the program now th...
All the R Ladies
Two groups are making an impact in improving the gender diversity of R users worldwide. The R-Ladies organization is creating chapters worldwide to facilitate female R programmers meeting and working together, and the Taskforce on Women in the R Community is working to improve the participation and experience of women in the R community. R has ...