Publications by Matt.0
Reversing the order of axis in a ggplot2 scatterplot
Taking the advice of David Robinson, I’ve decided to start a blog and write about data science, not only to create a portfolio of my work, but also as a repository I can check back on when I scratch my head and think “now how did I do that?” A nice post I saw on Twitter about how to reverse the order of a ggplot2 legend got me thinking about a gr...
1739 sym R (790 sym/3 pcs) 6 img
Analyzing extreme skiing and snowboarding in R: Freeride World Tour 1996–2018
The Freeride World Tour (FWT) has been hosting extreme skiing & snowboarding events since 1996. With the 2018 season having just wrapped up in March, I did an analysis of rankings and past FWT winners using R. If you haven’t heard of the FWT yet, it’s an exciting sport where riders choose gnarly-looking lines through cliff-faces, cornices and nasty...
6220 sym R (3577 sym/7 pcs) 28 img
Recreating data visualizations from the book “Knowledge is Beautiful”
In this series of posts I will set out to recreate some of the visualizations from the book “Knowledge is Beautiful” by David McCandless in R. David McCandless is the author of two bestselling infographics books and gave a TED talk about data visualization. I bought his second book, “Knowledge is Beautiful”, in 2015, which contains 196 beautifu...
5625 sym R (4940 sym/12 pcs) 8 img
I love the idea of a permanent side-bar for the TOC.
I love the idea of a permanent side-bar for the TOC. Can’t wait until you provide the open-source code. Can you share any resources for those interested in learning CSS rules and hacks to customize their blogs and reports?
632 sym 2 img
Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part II
In part II of this series I continue to recreate some of the visualizations from the book “Knowledge is Beautiful” by David McCandless in R. David McCandless is the author of two bestselling infographics books and gives a great TED talk about data visualization. His second book, Knowledge is Beautiful, published in 2015, contains 196 beautiful i...
5154 sym R (8035 sym/7 pcs) 12 img
Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part III
Welcome to the third installment of the series where I recreate data visualizations, in R, from the book Knowledge is Beautiful by David McCandless. Here are the links for part I and part II of the series if you haven’t checked them out yet. The frustrations in data science are many, for example: Consider point 4 above. Even when the dat...
3287 sym R (3020 sym/6 pcs) 14 img
Recreating (more) data visualizations from the book “Knowledge is Beautiful”: Part IV
Welcome to the last part of the series where I recreate data visualizations in R from the book Knowledge is Beautiful by David McCandless. Links to parts I, II and III of the series can be found here. Plane Crashes: This dataset will be used for a couple of visualizations. The first visualization is a stacked-barplot showing causes of crashes for every p...
2156 sym R (4584 sym/3 pcs) 8 img
Hi Pawel, I’m glad you enjoyed it.
Hi Pawel, I’m glad you enjoyed it. I was trying to play around with facet_grid() earlier but I guess I didn’t stumble upon the proper parameters. Your suggestion works perfectly; not only does it keep each grid x-axis width proportional to its length, but it also keeps appropriate space-between-variables. Thank you for sharing that!
744 sym 2 img
Great post!
Great post! I wanted to mention that although many previous studies have used the area under the receiver operating characteristic curve (auROC) statistic to benchmark the precision, it misleads evaluators when the test data is (highly) imbalanced; see PLOS One, 10(3):e0118432, 2015 & bioRxiv, 2017, doi: 10.1101/142760. In my field (bioinformatics), ENCO...
1076 sym 2 img
The “Gold Standard” of Data Science Project Management
The inspiration for this post came most recently from a slide-deck by Ming Tang, a bioinformatician at Harvard, and a new Chromebook Data Science course offered by Jeffrey Leek from Johns Hopkins University. However, this has been a topic I’ve been thinking about for some time. A number o...
6958 sym R (143 sym/1 pcs) 20 img