Publications by Murtaza Haider
Data Science 101, now online
We are delighted to note that IBM’s BigDataUniversity.com has launched the quintessential introductory course on data science, aptly named Data Science 101. The target audience for the course is the uninitiated cohort that is curious about data science and would like to take the first steps toward a career in data and analytics. Needless to say, the...
1622 sym
Is it time to ditch the Comparison of Means (T) Test?
For over a century, academics have been teaching the Student’s T-test and practitioners have been running it to determine if the mean values of a variable for two groups were statistically different. It is time to ditch the Comparison of Means (T) Test and rely instead on the ordinary least squares (OLS) Regression. My motivation for this sugge...
7548 sym 14 img
R: simple for complex tasks, complex for simple tasks
When it comes to undertaking complex data science projects, R is the preferred choice for many. Why? Because handling complex tasks is simpler in R than in other comparable platforms. Regrettably, the same is not true for simpler tasks, which I would argue are rather complex in base R. Hence, the title, R: simple for complex tasks, compl...
5988 sym R (371 sym/1 pcs) 2 img
Edward Tufte’s Slopegraphs and political fortunes in Ontario
With fewer than three weeks left until the June 7 provincial election in Ontario, Canada’s most populous province with 14.2 million persons, the expected outcome is far from certain. The weekly opinion polls reflect the volatility in public opinion. The Progressive Conservatives (PC), one of the main opposition parties, are in the lead with the support...
8299 sym 10 img
A question and an answer about recoding several factors simultaneously in R
Data manipulation is a breeze with amazing packages like plyr and dplyr. Recoding factors, which could prove to be a daunting task, especially for variables that have many categories, can easily be accomplished with these packages. However, it is important for those learning data science to understand how base R works. In this regard, I seek h...
2621 sym 10 img
Modern Data Science with R: A review
Some say data is the new oil. Others equate its worth to water. And then there are those who believe that data scientists will be (in fact, they already are) one of the most sought-after workers in knowledge economies. Millions of data-centric jobs require millions of trained data scientists. However, the installed capacity of graduate and undergr...
6018 sym 2 img
Modern Data Science with R: A review
Some say data is the new oil. Others equate its worth to water. And then there are those who believe that data scientists will be (in fact, they already are) one of the most sought-after workers in knowledge economies. Millions of data-centric jobs require millions of trained data scientists. However, the installed capacity of graduate and undergr...
6015 sym 2 img