Publications by Once Upon a Data

A Shout-Out to R-bloggers

14.06.2016

Since I started working with R, I have been a frequent visitor to the R-bloggers website, where I find a variety of helpful tips and tutorials. Now that I have started my own blog, it is time to give a shout-out to them!

606 sym

Leverage and Influence in a Nutshell

15.06.2016

Once upon a data, there were outliers and influential observations in regression models. Using these models, we learnt that a common practice was to perform diagnostic checks to dig deeper and see how different points affect the fitted model or its coefficients. So here we go! In this post, we will focus on two concepts (leverage and influence),...

4525 sym R (86 sym/1 pcs) 4 img

R googleVis Line Motion Charts with Modified Options

21.06.2016

Using googleVis via R provides lots of options for creating nice Google visualizations. I was trying to create some charts while exploring the Annual Nominal Fish Catches Data on Kaggle. I wanted to create a line motion chart and exclude the default bubble chart. So I played with the options to get the desired result. The following is a quick explan...

2178 sym R (2520 sym/5 pcs) 1 tbl

Lessons Learnt About Data Viz – Why a Boxplot Is Sometimes The Worst Choice?

22.06.2016

Data visualization is a means of visual communication that should help people understand the significance of data easily and see interesting trends, patterns, distributions, etc. If your audience fails to grasp the message that the graph was intended to convey, they are not to be blamed. You are! Or, to be precise, your choice of the grap...

4433 sym 4 img

The Power of (purrr,tidy,broom)-Exploring Climate Change Trends

27.06.2016

A few days ago, I wanted to explore the Climate Change: Earth Surface Temperature Data dataset published on Kaggle and originally compiled by Berkeley Earth. The dataset is relatively large, as it contains entries from 1750 to 2014! This was shortly after watching Hadley Wickham’s talk about managing many models with R. So I thought about using the p...

5659 sym R (3898 sym/9 pcs) 8 img

Yet Another Post on Logistic Regression

21.07.2016

Every day, statisticians, analysts and data enthusiasts perform data analysis for different purposes. But when it comes to presenting analyses to a wider audience, the good work is not the complex one with big words. It is the one that highlights interesting relations, answers business questions or predicts outcomes, and explains all that in the simple...

8245 sym R (688 sym/6 pcs) 6 img

A Glimpse into The Daily Life of a Data Scientist

24.01.2017

A couple of weeks ago, I had a discussion with a co-worker regarding a project I was involved in, and I felt that there was no clear understanding of the daily challenges data scientists face. A few days later, I was at Rstudio::Conf 2017, where I met lots of data scientists from academia and industry. Later on, I described one of the conference’s pos...

8915 sym 8 img 9 tbl