Publications by John Mount
On “Competition” in the R Ecosystem
I’ve been thinking a bit on “competition” in the R ecosystem. I guess the closest I can come to a fair and coherent view on “competition” in the R ecosystem is some variation of the following. I, of course, should not be treating things as a competition. We are all doing work and hoping for a bit of public mind share. We all want our ...
3297 sym
Parameterizing with bquote
One thing that is sure to get lost in my long note on macros in R is just how concise and powerful macros are. The problem is macros are concise, but they do a lot for you. So you get bogged down when you explain the joke. Let’s try to be concise. Below is an extension of an example taken from the Programming with dplyr note. First let’s loa...
1664 sym R (1375 sym/7 pcs)
Dot-Pipe Paper Accepted by the R Journal!!!
We are thrilled to announce our (my and Nina Zumel’s) paper on the dot-pipe has been accepted by the R-Journal! A huge “thank you” to the reviewers and editors for helping us with this! You can find our article here (pdf here)!
633 sym 2 img
Using a Column as a Column Index
We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column as an index? Please read on for some idiomatic base R, data.table, and dplyr solutions. Let’s say we have some example data: df <- data.frame(x = c(1, 2, 3, 4), y = c(5, 6, 7,...
2661 sym R (1298 sym/5 pcs)
Timing Column Indexing in R
I’ve ended up (almost accidentally) collecting a number of different solutions to the “use a column to choose values from other columns in R” problem. Please read on for a brief benchmark comparing these methods/solutions. What we did is: build a 1,000,000 row variation of the original example. In this variation we ensured that the select...
4654 sym R (426 sym/1 pcs) 4 img
A Subtle Flaw in Some Popular R NSE Interfaces
It is no great secret: I like value oriented interfaces that preserve referential transparency. It is the side of the public debate I take in R programming. “One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains ...
8775 sym R (2377 sym/41 pcs)
A Better Example of the Confused By The Environment Issue
Our interference from the environment issue was a bit subtle. But there are variations that can be a bit more insidious. Please consider the following. library("dplyr") # unrelated value that happens # to be in our environment z <- "y" data.frame(x = 1, y = 2, z = 3) %>% select(-z) # x y # 1 1 2 data.frame(x = 1, y = 2) %>% # oops, no...
783 sym R (299 sym/1 pcs)
Modeling multi-category Outcomes With vtreat
vtreat is a powerful R package for preparing messy real-world data for machine learning. We have further extended the package with a number of features including rquery/rqdatatable integration (allowing vtreat application at scale on Apache Spark or data.table!). In addition vtreat can now effectively prepare data for multi-class classificat...
1937 sym R (3529 sym/12 pcs) 2 tbl
Quick Significance Calculations for A/B Tests in R
Introduction Let’s take a quick look at a very important and common experimental problem: checking if the difference in success rates of two Binomial experiments is statistically significant. This can arise in A/B testing situations such as online advertising, sales, and manufacturing. We already share a free video course on a Bayesian treatmen...
6929 sym R (2686 sym/35 pcs)
Running the Same Task in Python and R
According to a KDD poll fewer respondents (by rate) used only R in 2017 than in 2016. At the same time more respondents (by rate) used only Python in 2017 than in 2016. Let’s take this as an excuse to take a quick look at what happens when we try a task in both systems. For our task we picked the painful exercise of directly reading a 50,000,...
1698 sym 6 img