Publications by John Mount
Using closures as objects in R
For more and more clients we have been using a nice coding pattern taught to us by Garrett Grolemund in his book Hands-On Programming with R: make a function that returns a list of functions. This turns out to be a classic functional programming techique: use closures to implement objects (terminology we will explain). It is a pattern we strongly...
12286 sym R (2284 sym/9 pcs) 2 img
How and why to return functions in R
One of the advantages of functional languages (such as R) is the ability to create and return functions “on the fly.” We will discuss one good use of this capability and what to look out for when creating functions in R. Why wrap/return functions? One of my favorite uses of “on the fly functions” is regularizing R’s predict() function...
13002 sym 4 img
New video course: Campaign Response Testing
I am proud to announce a new Win-Vector LLC statistics video course: Campaign Response Testing John Mount, Win-Vector LLC This course works through the very specific statistics problem of trying to estimate the unknown true response rates one or more populations in responding to one or more sales/marketing campaigns or price-points. This is ...
4448 sym 6 img 1 tbl
What can be in an R data.frame column?
As an R programmer have you every wondered what can be in a data.frame column? The documentation is a bit vague, help(data.frame) returns some comforting text including: Value A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor and character and so on). If you ask an R programmer the commonly...
8846 sym 2 img
What is new in the vtreat library?
The Win-Vector LLC vtreat library is a library we supply (under a GPL license) for automating the simple domain independent part of variable cleaning an preparation. The idea is you supply (in R) an example general data.frame to vtreat’s designTreatmentsC method (for single-class categorical targets) or designTreatmentsN method (for numeric tar...
5120 sym 2 img
My favorite R bug
In this note am going to recount “my favorite R bug.” It isn’t a bug in R. It is a bug in some code I wrote in R. I call it my favorite bug, as it is easy to commit and (thanks to R’s overly helpful nature) takes longer than it should to find. The original problem I was working on was generating a training set as a subset of a simple dat...
4606 sym R (3136 sym/8 pcs) 4 img
R in a 64 bit world
32 bit data structures (pointers, integer representations, single precision floating point) have been past their “best before date” for quite some time. R itself moved to a 64 bit memory model some time ago, but still has only 32 bit integers. This is going to get more and more awkward going forward. What is R doing to work around this limita...
7104 sym R (977 sym/6 pcs) 4 img
A bit about Win-Vector LLC
Win-Vector LLC is a consultancy founded in 2007 that specializes in research, algorithms, data-science, and training. (The name is an attempt at a mathematical pun.) Win-Vector LLC can complete your high value project quickly (some examples), and train your data science team to work much more effectively. Our consultants include the a...
820 sym
What is a good Sharpe ratio?
We have previously written that we like the investment performance summary called the Sharpe ratio (though it does have some limits). What the Sharpe ratio does is: give you a dimensionless score to compare similar investments that may vary both in riskiness and returns without needing to know the investor’s risk tolerance. It does ...
802 sym
A dynamic programming solution to A/B test design
Our last article on A/B testing described the scope of the realistic circumstances of A/B testing in practice and gave links to different standard solutions. In this article we will be take an idealized specific situation allowing us to show a particularly beautiful solution to one very special type of A/B test. For this article … C...
815 sym