Publications by Matt.0

Reversing the order of axis in a ggplot2 scatterplot

24.01.2018

Taking the advice of David Robinson, I’ve decided to start a blog and write about data science, not only to create a portfolio of my work, but as a repository I can check back on when I scratch my head and think “now how did I do that?” A nice post I saw on Twitter about how to reverse the order of a ggplot2 legend got me thinking about a gr...

1739 sym R (790 sym/3 pcs) 6 img

Analyzing extreme skiing and snowboarding in R: Freeride World Tour 1996–2018

07.05.2018

The Freeride World Tour (FWT) has been hosting extreme skiing & snowboarding events since 1996. With the 2018 season having just wrapped up in March, I did an analysis of rankings and past FWT winners using R. If you haven’t heard of the FWT yet, it’s an exciting sport where riders choose gnarly-looking lines through cliff-faces, cornices and nasty...

6220 sym R (3577 sym/7 pcs) 28 img

Recreating data visualizations from the book “Knowledge is Beautiful”

06.06.2018

In this series of posts I will set out to recreate some of the visualizations from the book “Knowledge is Beautiful” by David McCandless in R. David McCandless is author of two bestselling infographics books and gave a TED talk about data visualization. I bought his second book, “Knowledge is Beautiful”, in 2015, which contains 196 beautifu...

5625 sym R (4940 sym/12 pcs) 8 img

I love the idea of a permanent side-bar for the TOC.

11.06.2018

I love the idea of a permanent side-bar for the TOC. Can’t wait until you provide the open-source code. Can you share any resources for those interested in learning CSS rules and hacks to customize their blogs and reports?

632 sym 2 img

Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part II

22.06.2018

In part II of this series I continue to recreate some of the visualizations from the book “Knowledge is Beautiful” by David McCandless in R. David McCandless is author of two bestselling infographics books and gives a great TED talk about data visualization. His second book, Knowledge is Beautiful, published in 2015, contains 196 beautiful i...

5154 sym R (8035 sym/7 pcs) 12 img

Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part III

05.07.2018

Welcome to the third installment of the series where I recreate data visualizations, in R, from the book Knowledge is Beautiful by David McCandless. Here are the links for part I and part II of the series if you haven’t checked them out yet. The list of frustrations in data science is long, for example: Consider point 4 above. Even when the dat...

3287 sym R (3020 sym/6 pcs) 14 img

Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part IV

16.07.2018

Welcome to the last part of the series where I recreate data visualizations in R from the book Knowledge is Beautiful by David McCandless. Links to part I, II, III of the series can be found here. Plane Crashes: This dataset will be used for a couple of visualizations. The first visualization is a stacked barplot showing causes of crashes for every p...

2156 sym R (4584 sym/3 pcs) 8 img

Hi Pawel, I’m glad you enjoyed it.

25.07.2018

Hi Pawel, I’m glad you enjoyed it. I was trying to play around with facet_grid() earlier but I guess I didn’t stumble upon the proper parameters. Your suggestion works perfectly; not only does it keep each grid x-axis width proportional to its length, but it also keeps appropriate spacing between variables. Thank you for sharing that!

744 sym 2 img

Great post!

16.08.2018

Great post! I wanted to mention that although many previous studies have used the area under the receiver operating characteristic curve (auROC) statistic to benchmark precision, it misleads evaluators when the test data is (highly) imbalanced; see PLOS One, 10(3):e0118432, 2015 and bioRxiv, 2017, doi: 10.1101/142760. In my field (bioinformatics), ENCO...

1076 sym 2 img

The “Gold Standard” of Data Science Project Management

06.10.2018

The “Gold Standard” for Data Science Project Management. The inspiration for this post came most recently from a slide deck by Ming Tang, a bioinformatician at Harvard, and a new Chromebook Data Science course offered by Jeffrey Leek from Johns Hopkins University. However, this has been a topic I’ve been thinking about for some time. A number o...

6958 sym R (143 sym/1 pcs) 20 img