Publications by Kris Eberwein
How to Add WAR Metrics to your Lahman Database
I get a lot of questions on how to calculate WAR in the Lahman database. In the past I’ve discussed ways to calculate wOBA and FIP in Lahman, but WAR has always been difficult due to the “closed-source” nature of the calculation. But there is an answer. While stumbling around on Baseball Reference one day, I found that BR makes their WAR tabl...
3569 sym R (5452 sym/1 pcs) 4 img
How to Upgrade R Without Losing Your Packages
Yup kids, it’s that time again. The new version of R was just released. In the past I’ve hesitated to upgrade my R version because I knew I would lose all of my packages during the new install, which makes me very grumpy. I found this neat little trick to save my current packages before the new install and re-load them into the new version. I...
1282 sym R (645 sym/3 pcs) 2 img
How To Forecast With Tableau And R
After working with Tableau for the last several years, I have to admit that I’m quite impressed with the statistical capabilities of the software. It’s nowhere near the analytical powerhouse that R is, but for visualization it does a pretty good job. As good as Tableau is, many of the statistical properties are “out of the box” and don’...
3677 sym 16 img
Using PL/R and PL/Python to find Medians and Quartiles in Postgres
I’ve recently been exploring options to calculate medians and quartiles in my Postgres database. If you’re familiar with quartiles you know how handy they can be. There are a few different options in the Postgres universe to accomplish this, so I figured I would give them all a whirl and see which was the friendliest (and fastest) on my CPU. T...
2515 sym R (786 sym/2 pcs) 2 img
Hacking The New Lahman Package 4.0-1 with R-Studio
The developers of the Lahman package for R have recently updated the package to include 2014 MLB stats! For those not familiar, this R package recreates Sean Lahman’s Baseball Database into a quick and handy little R package. I’ve written on the Lahman package before, and even suggested adding a few advanced statistics to the battingStats() f...
1848 sym R (2196 sym/4 pcs)
Install Shiny Server for R on Ubuntu the Right Way
Is it time to spin up a new instance of Shiny Server? This tutorial is based on a fresh install of Ubuntu Server 14.04, but I’m sure it could be tweaked to work on RHEL or CentOS as well. There’s no real secret sauce to the install but there are several “gotchas” that most people overlook. The walk-through below can get you up and run...
2002 sym R (693 sym/10 pcs) 2 img
Data Science Workbench for Ubuntu 14.04
I found myself installing the same things over and over again on my VMs, so I decided to pack all my good DSR workbench action into one giant shell script that I could run and walk away from. Below is my markdown file; you can grab the shell scripts at my GitHub page. The script takes about 30 min. to finish. It’s tested on Ubuntu Server, but I...
1063 sym
Calculate Inflation with R
I was surprised to see there weren’t more of these types of calculators in the R community. Inflation and adjusted payments seem like they would be more common. I was able to find a way to gather Consumer Price Index data using the quantmod package, but quantmod leaves you to your own devices in converting the data. So, I whipped up a formula, t...
1512 sym R (2146 sym/1 pcs) 2 img 1 tbl
Upgrade R on Windows with the installr Package
It’s that time again—time for a new R version! The latest version 3.2.3 “Wooden Christmas Tree” is a small upgrade for most, but a huge step for Windows users. Of the new features included in Wooden, half of them are Windows-specific. Several months back I wrote a tutorial on how to upgrade R on a Mac without losing your packages, so in l...
2090 sym R (140 sym/4 pcs)
How to Pimp Your .Rprofile
After you’ve been using R for a little bit, you start to notice people talking about their .Rprofile as if it’s some mythical being. Nothing magical about it, but it can be a big time-saver if you find yourself typing things like, “summary()” or, the ever-hated, “stringsAsFactors=FALSE” ad nauseam. Where is my .Rprofile? The simple ans...
1945 sym R (2467 sym/2 pcs)