Publications by David Smith

Catterplots: Plots with cats

17.02.2017

As a devotee of Tufte, I'm generally against chartjunk. Graphical elements that obscure interpretation of the data occasionally have a useful role to play, but more often than not that role is to entertain at the expense of enlightenment, or worse, to actively mislead. So it's with mixed feelings that I refer you to catterplots, an R package by Davi...

1354 sym R (259 sym/1 pcs) 2 img

Finding Radiohead’s most depressing song, with R

22.02.2017

Radiohead is known for having some fairly maudlin songs, but of all of their tracks, which is the most depressing? Data scientist and R enthusiast Charlie Thompson ranked all of their tracks according to a “gloom index”, and created the following chart of gloominess for each of the band's nine studio albums. (Click for the interactive versio...

3270 sym 2 img

The difference between R and Excel

22.02.2017

If you're an Excel user (or a user of any other spreadsheet, really), learning R can be hard. As this blog post by Gordon Shotwell explains, one of the reasons is that simple things can be harder to do in R than Excel. But it's worth persevering, because complex things can be easier. While Excel (ahem) excels at things like arithmetic and tabul...

1168 sym 2 img

Preview: R Tools for Visual Studio 1.0

23.02.2017

After more than a year in preview, R Tools for Visual Studio, the open-source extension to the Visual Studio IDE for R programming, is nearing its official release. RTVS Release Candidate 1 is now available for download, giving you the opportunity to try out the new features ahead of the official announcement. We'll cover the features in detail ...

1648 sym 2 img

Prophet: How Facebook operationalizes time series forecasting at scale

24.02.2017

Facebook is a famously data-driven organization, and an important goal in any data science activity is forecasting. Now, Facebook has released Prophet, an open-source package for R and Python that implements the time-series methodology that Facebook uses in production for forecasting at scale. Prophet has a very simple interface: you pass it a ...

2860 sym 2 img

ggraph: ggplot for graphs

27.02.2017

A graph, a collection of nodes connected by edges, is just data. Whether it's a social network (where nodes are people, and edges are friend relationships), or a decision tree (where nodes are branch criteria or values, and edges decisions), the nature of the graph is easily represented in a data object. It might be represented as a matrix (where...

3252 sym 2 img

Forecasting gentrification in city neighborhoods, with R

28.02.2017

If you've lived in a big city, you're likely familiar with the impact of gentrification. For longtime residents of a neighbourhood, it can represent a decline in the culture and vibrancy of your community; for recent or prospective residents, it can represent a financial opportunity in rising home prices. For those that live in a gentrifying neig...

2586 sym 4 img

Scholarships encourage diversity at useR!2017

01.03.2017

While representation of women and minorities at last year's useR! conference was the highest it's ever been, there is always room for more diversity. To encourage more underrepresented individuals to attend, the useR! committee has taken several steps, including asking attendees to adhere to a supportive code of conduct and providing childcare...

1186 sym

Predicting the length of a hospital stay, with R

02.03.2017

I haven't been admitted to hospital many times in my life, but every time the only thing I really cared about was: when am I going to get out? It's also a question that weighs heavily on hospital managers: by knowing ahead of time how long each patient's stay is likely to be, they can better manage facilities and staff, and know whether the hospi...

2192 sym 2 img

Find modern, interactive web-based charts for R at the htmlwidgets gallery

03.03.2017

While R's base graphics library is almost limitlessly flexible when it comes to creating static graphics and data visualizations, new Web-based technologies like d3 and webgl open up new horizons in high-resolution, rescalable and interactive charts. Graphics built with these libraries can easily be embedded in a webpage, can be dynamically resized...

2152 sym 2 img