Publications by Millsy
Joe West vs. Bruce Froemming: A Crude Umpire LHB/RHB Bias Comparison
In my last two posts, I have tinkered with the ‘gam’ package to create heat maps for individual umpire strike zones. I went ahead and grabbed Joe West’s data (which contains far more pitches than Bruce Froemming’s, since Froemming’s data is only from 2007). Below, I have mapped them out with a new color scheme (those of you...
5012 sym R (4337 sym/1 pcs) 14 img
sab-R-metrics: Introduction to R
In a recent post, I briefly mentioned that I may turn most of this blog’s focus to teaching R commands for use with sabermetric analysis. Only a few days later, Ricky Zanker began a new column at The Hardball Times doing just that. But that’s okay. Hopefully his column and mine can complement each other. If there is an...
13555 sym R (292 sym/1 pcs) 4 img
sab-R-metrics: Basics of Vectors and Data Calling
Wednesday, I began a new series called “sab-R-metrics”. My hope is that it reduces the frustration that goes along with learning a new programming language and enhances others’ ability to perform their own analysis in baseball or other sports. However, these tutorials will hopefully allow you to use these skills in other areas...
11762 sym R (128 sym/1 pcs)
sab-R-metrics: Subsetting, Conditional Statements, ‘tapply()’, and VERY simple ‘for loops’
In my last sab-R-metrics post, I went over some basics of calling data and creating vectors or new data from those. Here, I want to extend that to full subsets of data and go on to use some of the basic functions in R so that we can begin plotting in the next tutorial. Before I begin, I want to first give a data example that you can a...
11090 sym R (1978 sym/1 pcs)
sab-R-metrics: Beginning with Boxplots, Scatterplots, and Histograms
Today I decided to focus more on visualizations and less on basic statistical analysis for sabermetrics using R. I’m not really here to teach the ins and outs of regressions and statistical tests, so once I get there, I’m hoping that those who have read this already have a decent understanding of those subjects before impleme...
8193 sym R (1922 sym/1 pcs) 10 img
sab-R-metrics: Intermediate Boxplots and Histograms
Last week, I began talking about using the base graphics in R. Those graphics were pretty bland, and my hope for the next two posts is to introduce some interesting additions to the basic graphics that come from R: color, legends, lines, shapes, multiple graphs side-by-side, text, point types, and custom axes. If you have missed any of the prev...
12371 sym R (3259 sym/1 pcs) 10 img
sab-R-metrics: Intermediate Scatter Plots
First off, I’ll say it’s been a whirlwind of a past few days. Thanks to David Smith at the Revolutions Blog for his kind words about the sab-R-metrics series and link back this way. Between Ed Kupfer’s posts at the APBRmetrics board, Harry Pavlidis at THT, Dave Allen at Fangraphs, and about 30 Twitterers, I’ve seen some serio...
13379 sym R (274 sym/1 pcs) 16 img
sab-R-metrics: Some Extra Visualization Customization
In my last post, I described a number of ways to show your data on a scatter plot. Ricky Zanker at THT has a similar post today for those looking to get some extra exposure and another take on R programming. Today, I plan to expand on this with a little more customization. First, if you’ve missed all of the previous sab-R-metrics posts...
17885 sym R (379 sym/1 pcs) 14 img
Fixing Up smoothScatter Heat Maps
A while back, I posted an article using the smoothScatter function in R that builds a color representation of density for scatter plots. When I first found the function, I was extremely excited because it’s a very easy and automated way to make a heat map! Unfortunately, the more I messed with the function, the more annoying it be...
5671 sym 14 img
sab-R-metrics: Displaying Line Plots and Time Series Data
It’s been a while since I’ve had the chance to add anything here, but last time I left everyone with some scatter plots and some customization tools for your graphics. This week will be a little briefer than the last few tutorials: I’d like to show you how to display line graphs for time series data. For this,...
7218 sym R (2216 sym/1 pcs) 10 img