Publications by John Myles White
Criticism 4 of NHST: No Mechanism for Producing Substantive Cumulative Knowledge
[Note to the Reader: This is a much rougher piece than the previous pieces because the argument is more complex. I ask that you please point out places where things are unclear and where claims are not rigorous.] In this fourth part of my series of criticisms of NHST, I’m going to focus on broad questions of epistemology: I want to ask what typ...
The Great Julia RNG Refactor
Many readers of this blog will know that I’m a big fan of Bayesian methods, in large part because automated inference tools like JAGS allow modelers to focus on the types of structure they want to extract from data rather than worry about the algorithmic details of how they will fit their models to data. For me, the ease with which we can const...
Bayesian Nonparametrics in R
On July 25th, I’ll be presenting at the Seattle R Meetup about implementing Bayesian nonparametrics in R. If you’re not sure what Bayesian nonparametric methods are, they’re a family of methods that allow you to fit traditional statistical models, such as mixture models or latent factor models, without having to fully specify the number of ...
Optimization Functions in Julia
Over the last few weeks, I’ve made a concerted effort to develop a basic suite of optimization algorithms for Julia so that Matlab programmers used to using fminunc() and R programmers used to using optim() can start to transition code over to Julia that requires access to simple optimization algorithms like L-BFGS and the Nelder-Mead method. A...
Criticism 5 of NHST: p-Values Measure Effort, Not Truth
Introduction: In the third installment of my series of criticisms of NHST, I focused on the notion that a p-value is nothing more than a one-dimensional representation of a two-dimensional space in which (1) the measured size of an effect and (2) the precision of this measurement have been combined in such a way that we can never pull those two di...
Automatic Hyperparameter Tuning Methods
At MSR this week, we had two very good talks on algorithmic methods for tuning the hyperparameters of machine learning models. Selecting appropriate settings for hyperparameters is a constant problem in machine learning, which is somewhat surprising given how much expertise the machine learning community has in optimization theory. I suspect ther...
My New Book: Developing, Deploying and Debugging Multi-Armed Bandit Algorithms
I’m happy to announce that I’ve started writing a new book for O’Reilly, which will focus on teaching readers how to use Multi-Armed Bandit Algorithms to build better websites. My hope is that the book can help web developers build up an intuition for the core conundrum facing anyone who wants to build a successful business: you have to con...
The Social Dynamics of the R Core Team
Recently a few members of R Core have indicated that part of what slows down the development of R as a language is that it has become increasingly difficult over the years to achieve consensus among the core developers of the language. Inspired by these claims, I decided to look into this issue quantitatively by measuring the quantity of commits ...
DataGotham
As some of you may know already, I’m co-organizing an upcoming conference called DataGotham that’s taking place in September. To help spread the word about DataGotham, I’m cross-posting the most recent announcement below: We’d like to let you know about DataGotham: a celebration of New York City’s data community! http://datagotham.com ...
Will Data Scientists Be Replaced by Tools?
The Quick-and-Dirty Summary: I was recently asked to participate in a proposed SXSW panel that will debate the question, “Will Data Scientists Be Replaced by Tools?” This post describes my current thinking on that question as a way of (1) convincing you to go vote for the panel’s inclusion in this year’s SXSW and (2) instigating a larger d...