Publications by Millsy

Joe West vs. Bruce Froemming: A Crude Umpire LHB/RHB Bias Comparison

17.12.2010

In my last two posts, I have tinkered with the ‘gam’ package to create heat maps for individual umpire strike zones. I went ahead and grabbed Joe West’s data (which contains far more pitches than Bruce Froemming’s, since Froemming’s data is only from 2007). Below, I have mapped them out with a new color scheme (those of you...

5012 sym R (4337 sym/1 pcs) 14 img

sab-R-metrics: Introduction to R

05.01.2011

In a recent post, I briefly mentioned that I may turn a majority of the focus of this blog to teaching R commands for use with sabermetric analysis. Only a few days later, Ricky Zanker began a new column at The Hardball Times doing just that. But that’s okay. Hopefully both his and mine can complement one another. If there is an...

13555 sym R (292 sym/1 pcs) 4 img

sab-R-metrics: Basics of Vectors and Data Calling

06.01.2011

Wednesday, I began a new series called “sab-R-metrics”. My hope is that it reduces the frustration that goes along with learning a new programming language and enhances others’ ability to perform their own analysis in baseball or other sports. However, these tutorials will hopefully allow you to use these skills in other areas...

11762 sym R (128 sym/1 pcs)

sab-R-metrics: Subsetting, Conditional Statements, ‘tapply()’, and VERY simple ‘for loops’

11.01.2011

In my last sab-R-metrics post, I went over some basics of calling data and creating vectors or new data from those. Here, I want to extend that to full subsets of data and go on to use some of the basic functions in R so that we can begin plotting in the next tutorial. Before I begin, I want to first give a data example that you can a...

11090 sym R (1978 sym/1 pcs)

sab-R-metrics: Beginning with Boxplots, Scatterplots, and Histograms

15.01.2011

Today I decided to begin more with visualizations and less with basic statistical analysis for sabermetrics using R. I’m not really here to teach the ins and outs of regressions and statistical tests, so once I get there, I’m hoping that those who have read this already have a decent understanding of those subjects before impleme...

8193 sym R (1922 sym/1 pcs) 10 img

sab-R-metrics: Intermediate Boxplots and Histograms

20.01.2011

Last week, I began talking about using the base graphics in R. Those graphics were pretty bland, and my hope for the next two posts is to introduce some interesting additions to the basic graphics that come from R: color, legends, lines, shapes, multiple graphs side-by-side, text, point types, and custom axes. If you have missed any of the prev...

12371 sym R (3259 sym/1 pcs) 10 img

sab-R-metrics: Intermediate Scatter Plots

25.01.2011

First off, I’ll say it’s been a whirlwind of a past few days. Thanks to David Smith at the Revolutions Blog for his kind words about the sab-R-metrics series and link back this way. Between Ed Kupfer’s posts at the APBRmetrics board, Harry Pavlidis at THT, Dave Allen at Fangraphs, and about 30 Twitterers, I’ve seen some serio...

13379 sym R (274 sym/1 pcs) 16 img

sab-R-metrics: Some Extra Visualization Customization

31.01.2011

Last post, I described a number of ways to show your data on a scatter plot. Ricky Zanker at THT has a similar post today for those looking to get some extra exposure and another take on R programming. Today, I plan to expand on this with a little more customization. First, if you’ve missed all of the previous sab-R-metrics posts...

17885 sym R (379 sym/1 pcs) 14 img

Fixing Up smoothScatter Heat Maps

02.02.2011

A while back, I posted an article using the smoothScatter function in R that builds a color representation of density for scatter plots. When I first found the function, I was extremely excited because it’s a very easy and automated way to make a heat map! Unfortunately, the more I messed with the function, the more annoying it be...

5671 sym 14 img

sab-R-metrics: Displaying Line Plots and Time Series Data

13.02.2011

It’s been a while since I’ve had the chance to add anything here, but last time I left everyone with some scatter plots and some customization tools for your graphics. This week’s tutorial will be a little briefer than the last few: I’d like to show you how to display line graphs for time series data. For this,...

7218 sym R (2216 sym/1 pcs) 10 img