Publications by Martin Monkman
Trends in AL run scoring (using R)
I have started to explore the functionality of R, the statistical and graphics programming language. And what better data to play with than that of Major League Baseball? There have already been some good examples of using R to analyze baseball data. The most comprehensive is the ongoing series at The Prince of Slides (Brian Mills, aka Millsy), ...
1661 sym
Trends in run scoring, NL edition (more R)
Last time around I used R to plot the average runs per game for the American League, starting in 1901. Now I’ll do the same for the National League. I'll save a comparison of the two leagues for my next post. A fundamental principle of programming is that code can be repurposed for different sets of data. So much of what I’m going to descri...
1268 sym
Trends in run scoring – comparing the leagues
My previous two posts have looked at using R to create trend lines for the run scoring environments in the American and National leagues. This time around, I'll plot the two against each other to allow for some comparisons. (The code below assumes that you've read the data into your workspace and calculated the LOESS trend lines, as I did in the...
1099 sym
Comparing individual team run production
Or, The 2010 Mariners: How Bad Were They? In earlier posts, I used the statistical software R to plot the trends in league average run scoring since 1901. This was the first step to answering other questions I had on my mind: How poor was the offensive performance of the 2010 Seattle Mariners? Are they showing any signs of improvement? And how can I ...
2320 sym
Gist for previous posts
The more I use it, the more I understand the benefits and value of GitHub as a code-sharing resource. The gist found here is the R code for my posts on run scoring trends by league (found here, here, and here). I will continue to use GitHub for the code used in future posts. -30-
687 sym
Run production, one team at a time
In a previous post, I used R to process data from the Lahman database to calculate index values that compare a team's run production to the league average for that year. For the purpose of that exercise, I started the sequence at 1947, but for what follows I re-ran the code with the time period 1901-2012. The R code I used can be found at this ...
1472 sym
MLB runs allowed by team
Or, How good were the Maddux/Glavine-era Braves? In this installment of my ongoing series of posts about run scoring in Major League Baseball, I'll turn the equation around and look at runs allowed. In order to account for the changing run scoring environments, the runs allowed by individual teams are compared to the league average for that s...
1311 sym
Major League Baseball run scoring trends with R’s Lahman package
The statistical software R has an ever-expanding array of packages that provide pre-programmed functions and datasets. One such package is named Lahman, bundling the contents of the Lahman database into a quick-and-easy resource for R users. In addition to the data tables, the package resources also contain a variety of analyses and g...
2857 sym R (4141 sym/6 pcs) 6 img
Annotating select points on an X-Y plot using ggplot2
or, Is the Seattle Mariners outfield a disaster? The Backstory: Earlier this week (2013-06-10), a blog post by Dave Cameron appeared at USS Mariner under the title “Maybe It’s Time For Dustin Ackley To Play Some Outfield”. In the first paragraph, Cameron describes the Seattle Mariners outfield this season as “a complete disaster” and R...
7528 sym 4 img
Fair weather fans? (An R scatter plot matrix)
The Victoria HarbourCats are roughly halfway through their inaugural season in the West Coast League, and currently lead the league in average attendance. In a recent conversation, one of the team’s staff mentioned that after the first game in early June, the fans started to come out when the sun appeared and the weather got warmer. I...
4683 sym R (2667 sym/3 pcs) 6 img 2 tbl