Publications by Bob Muenchen

Forrester’s 2017 Take on Tools for Data Science

16.03.2017

In my ongoing quest to track The Popularity of Data Science Software, I’ve updated the discussion of the annual report from Forrester, which I repeat here to save you from having to read through the entire document. If your organization is looking for training in the R language, you might consider my books, R for SAS and SPSS Users or R for Sta...

3500 sym 2 img

The Tidyverse Curse

23.03.2017

I’ve just finished a major overhaul to my widely read article, Why R is Hard to Learn. It describes the main complaints I’ve heard from the participants in my workshops, and how those complaints can often be mitigated. Here’s the only new section: The Tidyverse Curse There’s a common theme in many of the sections above: a task that is ha...

5899 sym R (3452 sym/6 pcs)

Python and R Vie for Top Spot in Kaggle Competitions

04.04.2017

I’ve just updated the Competition Use section of The Popularity of Data Science Software. Here’s just that section for your convenience. Competition Use Kaggle.com is a web site that sponsors data science contests. People post problems there along with the amount of money they are willing to pay the person or team who solves their problem the best. B...

1255 sym 2 img

Keeping Up with Your Data Science Options

12.04.2017

The field of data science is changing so rapidly that it’s quite hard to keep up with it all. When I first started tracking The Popularity of Data Science Software in 2010, I followed only ten packages, all of them classic statistics software. The term data science hadn’t caught on yet, and data mining was still a new thing. One of my recent bl...

2280 sym 2 img

Group-By Modeling in R Made Easy

18.04.2017

There are several aspects of the R language that make it hard to learn, and repeating a model for groups in a data set used to be one of them. Here I briefly describe R’s built-in approach, show a much easier one, then refer you to a new approach described in the superb book, R for Data Science, by Hadley Wickham and Garrett Grolemund. F...

7013 sym R (8555 sym/12 pcs) 6 img

Dueling Data Science Surveys: KDnuggets & Rexer Go Live

16.05.2017

What tools do we use most for data science, machine learning, or analytics? Python, R, SAS, KNIME, RapidMiner,…? How do we use them? We are about to find out as the two most popular surveys on data science tools have both just gone live. Please chip in and help us all get a better understanding of the tools of our trade. For 18 consecutive year...

1742 sym

Data Science Tool Market Share Leading Indicator: Scholarly Articles

19.06.2017

Below is the latest update to The Popularity of Data Science Software. It contains an analysis of the tools used in the most recent complete year of scholarly articles. The section is also integrated into the main paper itself. New software covered includes: Amazon Machine Learning, Apache Mahout, Apache MXNet, Caffe, Dataiku, DataRobot, Domino D...

10676 sym 10 img

jamovi for R: Easy but Controversial

13.02.2018

jamovi is software that aims to simplify two aspects of using R. It offers a point-and-click graphical user interface (GUI). It also provides functions that combine the capabilities of many others, bringing a more SPSS- or SAS-like method of programming to R. The ideal researcher would be an expert at their chosen field of study, data analysis, ...

12861 sym R (1281 sym/2 pcs) 10 img

Gartner’s 2018 Take on Data Science Tools

26.02.2018

I’ve just updated The Popularity of Data Science Software to reflect my take on Gartner’s 2018 report, Magic Quadrant for Data Science and Machine Learning Platforms. To save you the trouble of digging through all 40+ pages of my report, here’s just the new section: IT Research Firms IT research firms study software products and corporate st...

5836 sym 4 img

Using Excel for Data Entry

28.03.2018

This article shows you how to enter data so that you can easily open it in statistics packages such as R, SAS, SPSS, or jamovi (code or GUI steps below). Excel has some statistical analysis capabilities, but they often provide incorrect answers. For a comprehensive list of these limitations, see http://www.forecastingprinciples.com/paperpdf/McCull...

10848 sym R (264 sym/2 pcs) 2 tbl