Publications by John Myles White
The FourierDescriptors Package
Introduction: I’ve just uploaded a new package to CRAN based on a stimulus generation algorithm that I use for my experiments on vision. The FourierDescriptors package provides methods for creating, manipulating and visualizing Fourier descriptors, which are a representational scheme used to describe closed planar contours. The canonical referen...
1654 sym R (709 sym/16 pcs) 10 img 8 tbl
Response Times, The Exponential Distribution and Poisson Processes
I’m currently reading Luce’s “Response Times”. If you don’t know anything about response times, they are very easily defined: a response time is the length of time it takes a person to respond to a simple request, measured from the moment when the request is made to the moment when the person’s response is recorded. In principle, you ...
5719 sym 2 img
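The title of this post pairs response times with the exponential distribution and Poisson processes. As a rough illustration of the standard link between the two (not code taken from the post), the sketch below draws exponentially distributed waiting times and counts how many events land in each unit interval; those counts should behave like Poisson draws. The rate of 2 events per unit time, the seed, and all variable names are assumptions chosen for the example.

```r
# Assumed for illustration: 2 events per unit time, 10,000 simulated events.
set.seed(1)
rate <- 2
n_events <- 10000

# Waiting times between consecutive events are exponential with this rate.
gaps <- rexp(n_events, rate = rate)

# Cumulative sums of the waiting times give the event times of the process.
arrival_times <- cumsum(gaps)

# Count events in each complete unit interval [0, 1), [1, 2), ...
horizon <- floor(max(arrival_times))
in_window <- arrival_times[arrival_times < horizon]
counts <- tabulate(floor(in_window) + 1, nbins = horizon)

mean(gaps)    # should be near 1 / rate
mean(counts)  # should be near rate
var(counts)   # should also be near rate, since a Poisson mean equals its variance
```

Comparing `mean(counts)` with `var(counts)` is a quick sanity check: for a Poisson process the two should roughly agree.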
Why Use ProjectTemplate or Any Other Framework?
We use frameworks like Ruby on Rails or ProjectTemplate to minimize the time we spend on irrelevant details. By definition, an irrelevant detail isn’t of interest to us. But how can we tell which details are irrelevant? This isn’t a trivial task and it seems to be, on the surface, a profoundly subjective matter. Thankfully, it’s much simple...
9549 sym R (479 sym/4 pcs) 2 tbl
Seeing the Big Picture
Here’s a nice snippet from a 2009 article by Kass that I read yesterday: According to my understanding, laid out above, statistical pragmatism has two main features: it is eclectic and it emphasizes the assumptions that connect statistical models with observed data. The pragmatic view acknowledges that both sides of the frequentist-Bayesian de...
2417 sym
Higher Order Functions in R
Introduction: Because R is, in part, a functional programming language, the ‘base’ package contains several higher order functions. By higher order functions, I mean functions that take another function as an argument and then do something with that function. If you want to know more about the usefulness of writing higher order functions in ge...
4967 sym R (1803 sym/32 pcs) 16 tbl
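As a small illustration of the definition in the excerpt above (functions that take another function as an argument), here is a minimal sketch using three higher order functions from base R: `Map`, `Filter` and `Reduce`. It is not taken from the post itself, and the helper `compose2` is a hypothetical name introduced only for this example.

```r
# Map applies a function to each element and returns a list.
squares <- Map(function(x) x^2, 1:5)

# Filter keeps the elements for which the predicate returns TRUE.
evens <- Filter(function(x) x %% 2 == 0, 1:10)

# Reduce folds a binary function over a vector, here summing 1 through 5.
total <- Reduce(`+`, 1:5)

# A hand-written higher order function (hypothetical helper): it takes two
# functions and returns a new function that applies g first, then f.
compose2 <- function(f, g) function(x) f(g(x))
add_one_then_double <- compose2(function(x) 2 * x, function(x) x + 1)
result <- add_one_then_double(3)  # (3 + 1) * 2
```

`compose2` also returns a function, so it is higher order in both directions: functions go in and a function comes out.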
Two New R Packages: log4r and SortableHTMLTables
I’ve just released two new packages for R: log4r and SortableHTMLTables. log4r is a minimal logging utility for R that’s inspired by the log4j family of logging tools. It has substantially fewer features than other logging tools for R, but it’s hopefully easier to use. SortableHTMLTables uses brew and the jQuery Tablesorter plugin to provid...
982 sym
Three-Quarter Truths: Correlation Is Not Causation
Other than our culture’s implicit association between lies, damned lies and statistics, I think no idea has stifled the growth of statistical literacy as much as the endless repetition of the words correlation is not causation. This phrase seems to be primarily used to suppress intellectual inquiry by encouraging the unspoken assumption that co...
11121 sym R (83 sym/2 pcs) 2 img 1 tbl
ProjectTemplate Version 0.1-3 Released
I’ve just released the newest version of ProjectTemplate. The primary change is a completely redesigned mechanism for automatically loading data. ProjectTemplate can now read compressed CSV files, access CSV data files over HTTP, read Stata, SPSS and RData binary files and even load MySQL database tables automatically. For my own projects, this...
3684 sym R (214 sym/4 pcs) 2 tbl
Build a Recommendation System for R Packages
On Dataists, a new collaborative blog for data hackers that I’m contributing to, we’ve just announced a data contest that’s custom made for R users. To win the contest, you need to build a recommendation system for R packages. To find out more, check out the official announcement on Dataists. Then go to GitHub to get the data sets we’re p...
1227 sym
R Recommendation Contest Launches on Kaggle
The R Recommendation Engine contest is now live on Kaggle. Please head over there and start submitting your predictions for the test data set. Once you do, you can check the leaderboard to see how your algorithm compares with other people’s work. We know that there’s still plenty of progress that can be made, because we have other models that...
1300 sym