Publications by John Mount

A comment on preparing data for classifiers

04.12.2014

I have been working through (with some honest appreciation) a recent article comparing many classifiers on many data sets: “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” Manuel Fernández-Delgado, Eva Cernadas, Senén Barro, Dinani Amorim; 15(Oct):3133−3181, 2014 (which we will call “the DWN paper” in ...

12384 sym 4 img

Is there a Kindle edition of Practical Data Science with R?

21.12.2014

We have often been asked “why is there no Kindle edition of Practical Data Science with R on Amazon.com?” The short answer is: there is an edition you can read on your Kindle, but it is from the publisher Manning (not Amazon.com). The long answer is: when Amazon.com supplies a Kindle edition, readers have to deal with the following: Amazon....

4132 sym 4 img

R bracket is a bit irregular

17.01.2015

While skimming Professor Hadley Wickham’s Advanced R I got to thinking about the nature of the square-bracket or extract operator in R. It turns out “[,]” is a bit more irregular than I remembered. The subsetting section of Advanced R has a very good discussion of the subsetting and selection operators found in R. In particular it raises the ...

10150 sym 6 img

Check your return types when modeling in R

27.01.2015

Just a warning: double check your return types in R, especially when using different modeling packages. We consider ourselves pretty familiar with R. We have years of experience, many other programming languages to compare R to, and we have taken Hadley Wickham’s Master R Developer Workshop (highly recommended). We already knew R’s predict...

7323 sym 2 img

Announcing: Introduction to Data Science video course

25.02.2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data scientist. Imp...

3823 sym 6 img

The Win-Vector R data science value pack

11.03.2015

Win-Vector LLC is proud to announce the R data science value pack: 50% off our video course Introduction to Data Science (available at Udemy) and 30% off Practical Data Science with R (from Manning). Pick any combination of video, e-book, and/or print-book you want. Instructions below. Please share and Tweet! For 50% off the video course Intr...

1361 sym 6 img

Using closures as objects in R

27.03.2015

For more and more clients we have been using a nice coding pattern taught to us by Garrett Grolemund in his book Hands-On Programming with R: make a function that returns a list of functions. This turns out to be a classic functional programming technique: use closures to implement objects (terminology we will explain). It is a pattern we strongly...

12286 sym R (2284 sym/9 pcs) 2 img

How and why to return functions in R

03.04.2015

One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses of “on the fly functions” is regularizing R’s predict() function...

13002 sym 4 img

New video course: Campaign Response Testing

08.04.2015

I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing (John Mount, Win-Vector LLC). This course works through the very specific statistics problem of trying to estimate the unknown true response rates of one or more populations responding to one or more sales/marketing campaigns or price-points. This is ...

4448 sym 6 img 1 tbl

What can be in an R data.frame column?

09.04.2015

As an R programmer have you ever wondered what can be in a data.frame column? The documentation is a bit vague: help(data.frame) returns some comforting text, including: “Value: A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on).” If you ask an R programmer the commonly...

8846 sym 2 img