Publications by John Mount

Let’s Have Some Sympathy For The Part-time R User

04.08.2017

When I started writing about methods for better “parametric programming” interfaces for dplyr in December of 2016, I encountered three divisions in the audience of R dplyr users: dplyr users who had such a need and wanted such extensions; dplyr users who did not have such a need (“we always know the column names”); dplyr users who found...

14264 sym R (6403 sym/21 pcs) 2 img

More on “The Part-Time R-User”

06.08.2017

I have some more thoughts on the topic: “the part-time R-user.” I am thinking a bit more about the diversity of R users. It occurs to me that simply dividing R users into two groups, beginning and advanced, neglects a very important group: the part-time R user. This leaves us teachers and package developers with an unfortunate bias. The concept of ...

2018 sym

Supervised Learning in R: Regression

13.08.2017

We are very excited to announce a new (paid) Win-Vector LLC video training course: Supervised Learning in R: Regression, now available on DataCamp. The course is primarily authored by Dr. Nina Zumel (our chief of course design) with contributions from Dr. John Mount. This course will get you quickly up to speed covering: What is regression? (Hint...

1188 sym 2 img

Thank You For The Very Nice Comment

16.08.2017

Somebody nice reached out and gave us this wonderful feedback on our new Supervised Learning in R: Regression (paid) video course: “Thanks for a wonderful course on DataCamp on XGBoost and Random forest. I was struggling with Xgboost earlier and Vtreat has made my life easy now :).” Supervised Learning in R: Regression covers a lot as it treats pre...

1484 sym 4 img

Is dplyr Easily Comprehensible?

19.08.2017

dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible? dplyr makes sense to those of us who use it a lot. And we can teach part-time R users a lot of the common good use patterns. But is it an easy task to study and characterize dplyr itself? Please take our advanced dplyr quiz to ...

854 sym 2 img

Some Neat New R Notations

22.08.2017

The R package seplyr supplies a few neat new coding notations. (Image caption: An abacus, which gives us the term “calculus.”) The first notation is an operator called the “named map builder”. This is a cute notation that essentially does the job of stats::setNames(). It allows for code such as the following: library("seplyr") names <- c('a', 'b') n...

2386 sym R (190 sym/4 pcs) 2 img
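
The excerpt above says the named map builder essentially does the job of stats::setNames(). As a rough illustration of that base R behavior only (not of seplyr's own operator, whose example is cut off above), here is a minimal sketch. The vectors `keys` and `vals` are hypothetical and do not come from the original post.

```r
# Hypothetical example data (assumed for illustration, not from the post).
keys <- c('a', 'b')
vals <- c('x', 'y')

# stats::setNames() attaches names to a vector, giving a named map
# from each key to its value.
m <- stats::setNames(vals, keys)

# Look up a single value by its name.
m[['a']]
```

The result is an ordinary named character vector, so the usual name-based indexing applies.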

wrapr: R Code Sweeteners

25.08.2017

wrapr is an R package that supplies powerful tools for writing and debugging R code. Primary wrapr services include: let(), %.>% (dot arrow pipe), := (named map builder), λ() (anonymous function builder), and DebugFnW(). let(): let() allows execution of arbitrary code with substituted variable names (note this is subtly different than binding values fo...

3554 sym R (1136 sym/9 pcs) 2 img

Neat New seplyr Feature: String Interpolation

28.08.2017

The R package seplyr has a neat new feature: the function seplyr::expand_expr() which implements what we call “the string algebra” or string expression interpolation. The function takes an expression of mixed terms, including: variables referring to names, quoted strings, and general expression terms. It then “de-quotes” all of the variab...

2863 sym R (2325 sym/16 pcs) 2 img

Why to use the replyr R package

31.08.2017

Recently I noticed that the R package sparklyr had the following odd behavior:

suppressPackageStartupMessages(library("dplyr"))
library("sparklyr")
packageVersion("dplyr")
#> [1] '0.7.2.9000'
packageVersion("sparklyr")
#> [1] '0.6.2'
packageVersion("dbplyr")
#> [1] '1.1.0.9000'
sc <- spark_connect(master = 'local')
#> * Using Spark: 2.1.0
d <- d...

3414 sym R (545 sym/2 pcs) 4 img

Permutation Theory In Action

02.09.2017

While working on a large client project using sparklyr and multinomial regression, we recently ran into a problem: Apache Spark chooses the order of multinomial regression outcome targets, whereas R users are used to choosing the order of the targets (please see here for some details). So to make things more like R users expect, we need a way to t...

2340 sym R (682 sym/8 pcs)
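
The excerpt above is cut off before it describes the author's approach, so the following is only a sketch of the general idea of moving between two orderings of the same labels with a permutation and its inverse, in base R. The label vectors `r_order` and `spark_order` are hypothetical, and this is not taken from the original post.

```r
# Hypothetical outcome labels in the order an R user chose,
# and in the order a back end happened to choose (both assumed).
r_order     <- c("setosa", "versicolor", "virginica")
spark_order <- c("virginica", "setosa", "versicolor")

# perm[i] is the position in spark_order of the i-th label of r_order,
# so indexing spark_order by perm recovers r_order.
perm <- match(r_order, spark_order)
reordered <- spark_order[perm]

# For a permutation vector, order() returns its inverse permutation,
# which maps back from the R ordering to the back end's ordering.
inv_perm <- order(perm)
restored <- r_order[inv_perm]
```

Applying `perm` and then `inv_perm` returns the identity ordering, which is a useful sanity check when reordering model coefficients or predicted-probability columns.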