Publications by ALT
Mickey Mouse Models
My statistics professor once drew a little Markov chain on the board and called it “just a Mickey Mouse model,” because it was too simple to represent anything serious.
580 sym 2 img
Flu Trends
Not a model, but certainly Mickey Mousey: here’s some R code that plots Google’s US flu data:
df <- read.csv(url("http://www.google.org/flutrends/us/data.txt"), skip=11)
df$Date <- as.Date(df$Date)
dev.new(height=8, width=12)
# Leave a thin outer margin
par(oma=c(0.5, 0.5, 0.5, 0.5))
# Plot data; suppress x-axis
plot(df$Date,...
1541 sym 4 img
Logistic Regression & Factors in R
Factors are R’s enumerated type. Suppose you define the variable cities (a vector of strings) whose possible values are “New York,” “Paris,” “London” and “Beijing.” Instead of representing each city as a string of characters, you might prefer to define an encoding, e.g. {1=“New York”, 2=“Paris”, 3=“Lo...
3702 sym 6 img
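To make the encoding idea concrete, here is a minimal sketch of how such a factor might be built. The sample vector and the names `city_levels` and `cities_f` are made up for illustration; only the four city names come from the excerpt above.

```r
# Hypothetical sample data; the city names are the ones from the excerpt.
cities <- c("Paris", "New York", "Beijing", "Paris", "London")

# Declare the levels explicitly so the integer codes follow this order
# (by default, factor() would sort the levels alphabetically).
city_levels <- c("New York", "Paris", "London", "Beijing")
cities_f <- factor(cities, levels = city_levels)

levels(cities_f)      # the labels of the encoding
as.integer(cities_f)  # the underlying integer codes
table(cities_f)       # counts per level, in level order
```

Stating the levels up front matters when the factor later enters a regression, since the first level is treated as the reference category by default.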
Of Height and Speed in Tennis, or Fuzziness and Techiness in College
I thought of this after reading this post and perhaps also this one, on the Cheap Talk blog. Here’s the puzzle: in general, being tall does not make you slow; but among professional tennis players, the tall athletes do tend to be relatively sluggish. Why does this happen? Cheap Talk gives a perfectly good written explanation,...
4152 sym 6 img
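One way to see how such a pattern could arise is a small selection-effect simulation. This is only a sketch under loud assumptions that are not taken from the post: height and speed are independent standard normals in the general population, and "professionals" are simply the top 1% by the sum of the two.

```r
set.seed(1)
n <- 100000

# Assumption: height and speed are independent in the general population.
height <- rnorm(n)
speed  <- rnorm(n)

# Assumption: turning pro depends on a combined score (here, the plain sum).
skill <- height + speed
pro   <- skill > quantile(skill, 0.99)

cor_all <- cor(height, speed)            # close to zero
cor_pro <- cor(height[pro], speed[pro])  # clearly negative

cat("Whole population:", round(cor_all, 3), "\n")
cat("Professionals:   ", round(cor_pro, 3), "\n")
```

Within the selected group, anyone who is short must be fast to have made the cut, and vice versa, which induces a negative correlation that is absent in the population at large.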
A Tiny Model of Evolution
I’ve always wanted to write a(n overly) simple model of evolution. The assumptions are minimalistic: only one species, for which each individual’s genotype is represented as a one-dimensional real number, e.g. 7.4. Now, the fun stuff: I define a function mapping genotype to probability of reproduction, like this: You might wond...
6597 sym 4 img
Schelling’s Neighborhood Model
The New York Times has created a beautiful visualization of the Census Bureau’s 2005-2009 American Community Survey data. The distribution of racial and ethnic groups in New York City is particularly fascinating: Chinatown appears in red toward the south-eastern end of Manhattan; Harlem, above Central Park, is solidly blue; nearby, ...
2792 sym R (3909 sym/1 pcs) 8 img
A Little R Counter
I recently read a great post about environments in R, which featured this little bit of code:
> createCounter <- function(value) { function(i) { value <<- value+i} }
> counter <- createCounter(0)
> counter(1)
> a <- counter(0)
> a
[1] 1
> counter(1)
> counter(1)
> a <- counter(1)
> a
[1] 4
> a <- counter(5)
> a
[1] 9
I found this partic...
713 sym R (495 sym/3 pcs) 2 img
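As a small illustration of why the counter above keeps its state, the sketch below creates two counters from the same `createCounter` function. The names `first` and `second` are made up here; the point is that each call to `createCounter` gets its own enclosing environment, and `<<-` updates `value` in that environment rather than in the global one.

```r
createCounter <- function(value) {
  function(i) {
    # `<<-` assigns to `value` in the enclosing environment,
    # i.e. the one created by this particular createCounter() call.
    value <<- value + i
  }
}

first  <- createCounter(0)
second <- createCounter(100)

first(1)
first(1)
second(5)

# Adding 0 is a convenient way to read the current total.
first_total  <- first(0)
second_total <- second(0)

# The state lives in each closure's own environment.
environment(first)$value
environment(second)$value
```

The two counters never interfere with each other, because each closure carries a separate copy of `value`.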
On Crows
Today I made the mistake of clicking on the “Next Blog” button, which took me to a rather inane post complaining that crows are (obviously) stupid (because they are sometimes hit by cars). I was reminded that crows are actually quite smart.
653 sym 2 img
Dependence and Correlation
In everyday life I hear the word “correlation” thrown around far more often than “dependence.” What’s the difference? Correlation, in its most common form, is a measure of linear dependence; the catch is that not all dependencies are linear. The set of correlated random variables lies entirely within the larger set of ...
1519 sym R (1392 sym/6 pcs) 14 img
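A standard textbook example (not necessarily the one used in the post) shows dependence without linear correlation: draw x symmetrically around zero and set y = x^2. Then y is completely determined by x, yet the Pearson correlation between them is close to zero. The variable names below are illustrative.

```r
set.seed(42)

# x is symmetric around zero; y is a deterministic function of x.
x <- runif(100000, min = -1, max = 1)
y <- x^2

r_linear <- cor(x, y)       # near zero: no linear relationship
r_abs    <- cor(abs(x), y)  # large: the dependence shows up once the
                            # symmetry is folded away

cat("cor(x, y):     ", round(r_linear, 3), "\n")
cat("cor(abs(x), y):", round(r_abs, 3), "\n")
```

A correlation near zero therefore rules out a linear relationship only; it says nothing about whether the variables are independent.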
A Little Sampling Puzzle
Suppose you have 10 objects from which you take a sample of size 20 (with replacement, or you’re in trouble). What’s the probability that each object was chosen at least once? Getting an answer via simulation is pleasantly easy:
f <- function(n=10, k=20) {
  x <- 1:n
  x.sample <- sample(x, size=k, replace=TRUE)
  return(length(u...
812 sym R (313 sym/2 pcs) 2 img
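For comparison, here is a self-contained sketch of the same puzzle that pairs a simulation with the exact answer from inclusion-exclusion, P = sum over j from 0 to n of (-1)^j * choose(n, j) * ((n - j) / n)^k. The function names `all_chosen` and `p_exact` and the number of replications are choices made for this sketch, not taken from the post.

```r
# TRUE if a with-replacement sample of size k hits all n objects.
all_chosen <- function(n = 10, k = 20) {
  length(unique(sample(1:n, size = k, replace = TRUE))) == n
}

# Monte Carlo estimate (20000 replications is an arbitrary choice).
set.seed(123)
p_sim <- mean(replicate(20000, all_chosen()))

# Exact probability via inclusion-exclusion over the set of missed objects.
p_exact <- function(n = 10, k = 20) {
  j <- 0:n
  sum((-1)^j * choose(n, j) * ((n - j) / n)^k)
}

cat("Simulated:", round(p_sim, 4), "\n")
cat("Exact:    ", round(p_exact(), 4), "\n")
```

The two numbers should agree to within simulation noise, which gives a quick sanity check on both the code and the formula.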