Publications by matloff

Bad Coder, Bad Coder!

07.07.2016

My title here is in the sense of “Bad dog, bad dog!”, a scolding I sometimes hear dog owners use to tame their pets, and is also an allusion to Bad Reporter, a sometimes hilarious and always irreverent political comic strip in the San Francisco Chronicle. The title is also meant to convey my view that “good programming practi...

4244 sym 12 img

New Release of partools Package

17.07.2016

My new release of partools is now on CRAN. The package is aimed at doing parallel data science in what I call an “un-MapReduce” manner. It takes the point of view that MapReduce-based frameworks such as Hadoop and Spark are fine for the types of applications their designers had in mind, namely rather simple SQL actions, but have fundamental h...

1785 sym 6 img

StatET IDE for R

22.07.2016

I personally do not use integrated development environments (IDEs) for R, or for that matter for any programming language. From my point of view, they take up too much precious real estate on the screen, and most important, they generally do not allow me to use my own text editor and my own abbreviations and macros. Since I want to have a uniform...

1537 sym 6 img

Manny Parzen Used R!

20.08.2016

Prof. Manny Parzen, a pioneer of modern statistics, passed away in February, aged 87. I should have commented back then, but it’s still worth saying something today. I happened to be thinking of him this morning. I did not know Manny personally. This makes it odd that I refer to him by his first name, but I do so in the same spirit that people ...

2367 sym 4 img

My regtools Package Is Now on CRAN

07.11.2016

In my posts to this blog (less frequent than I would like, hopefully more frequent in the future), I’ve often mentioned my R package regtools, which contains a number of functions useful for regression and classification. None of them duplicate what is available in the excellent packages on CRAN, so I will dare characterize regtools as innovati...

1129 sym 8 img

Using CART: Implementation Matters

11.12.2016

In preparing the following example for my forthcoming book, I was startled by how differently two implementations of Classification and Regression Trees (CART) performed on a particular data set. Here is what happened: For my example, I first used the Vertebral Column data set from the UCI Machine Learning Repository. The task is to classify patie...

2796 sym R (563 sym/4 pcs) 4 img

Threading in R?

11.12.2016

I was pleased to see today’s post, “(A Very) Experimental Threading in R,” by Lukasz Bartnik, as this is a long-standing interest of mine. My own effort in this direction has been my package Rdsm. The notion of threading, for those who may not have this background, refers to several instances of a program, in this case, several instances o...

2376 sym 4 img

Any Forward Progress on p-Values?

09.02.2017

Statisticians have long known that the use of p-values has major problems. Some of us have long called for reform, weaning the profession away from these troubling beasts. At one point, I was pleased to see Frank Harrell suggest that R should stop computing them. That is not going to happen, but last year the ASA shocked many people by producing ...

3761 sym 4 img

Discrete Event Simulation in R (and, Why R Is Different)

05.04.2017

I was pleased to see the announcement yesterday of simmer 3.61, a discrete-event simulation (DES) package for R. I’ve long had an interest in DES, and as I will explain below, implementing DES in R brings up interesting issues about R that transcend the field of DES. I had been planning to discuss them in the context of my own DES package for R...

6419 sym R (43 sym/1 pcs) 8 img

A Python-Like walk() Function for R

08.04.2017

A really nice function available in Python is walk(), which recursively descends a directory tree, calling a user-supplied function in each directory within the tree. It might be used, say, to count the number of files, or maybe to remove all small files and so on. I had students in my undergraduate class write such a function in R for homework, ...

2814 sym R (863 sym/4 pcs) 4 img
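
To make the walk() idea in the last entry concrete, here is a minimal Python sketch of a callback-style directory walk used to count files. It is not the code from the post: the names walk, visit, and count_files are hypothetical, and the sketch simply wraps the standard-library os.walk() generator so that a user-supplied function is called once per directory.

```python
import os
import tempfile


def walk(top, visit, arg=None):
    """Call visit(arg, dirpath, filenames) once for each directory under top.

    Hypothetical helper for illustration; it delegates the recursive
    descent to the standard-library os.walk().
    """
    for dirpath, _dirnames, filenames in os.walk(top):
        visit(arg, dirpath, filenames)


def count_files(counter, dirpath, filenames):
    """Example callback: add this directory's file count to a running total."""
    counter["n"] += len(filenames)


# Build a small throwaway tree with three files, then count them.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("x.txt", os.path.join("a", "y.txt"), os.path.join("a", "b", "z.txt")):
        with open(os.path.join(root, rel), "w") as f:
            f.write("hi")
    counter = {"n": 0}
    walk(root, count_files, counter)
    total = counter["n"]

print(total)
```

Passing the accumulator in as an argument keeps the callback free of global state; a callback that removes small files could follow the same pattern, checking os.path.getsize() on each name in filenames.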