Publications by R | JLaw's R Blog
Exploring Types of Subway Fares with Hierarchical Forecasting
In my prior post I used forecasting to look at the effect of COVID on the expected number of New York City subway swipes. In this post I will drill a level deeper to run forecasts for various types of subway fares to see if any particular type has recovered better or worse than the others. The goal for this post will be to create a top-level fo...
7652 sym R (6087 sym/13 pcs) 8 img 4 tbl
When Will NYC’s Subway Ridership Recover?
While writing my posts about COVID’s effect on NYC Subway ridership, the New York Times published an article called “The Pandemic Wasn’t Supposed to Hurt New York Transit This Much.” The article states: I believe the 80% target by 2026 comes from a McKinsey study. While I don’t know the details of the study, I do have some subway fare data sit...
4541 sym R (3738 sym/9 pcs) 6 img
ML for the Lazy: Can AutoML Beat My Model?
In this fourth (and hopefully final) entry in my “Icing the Kicker” series of posts, I’m going to jump back to the first post, where I used tidymodels to predict whether or not a kick attempt would be iced. This time, however, I will see whether the h2o AutoML feature and the SuperLearner package can improve the predictive performance of my initi...
9287 sym R (6154 sym/25 pcs) 8 img 5 tbl
How much has COVID cost the NYC Subway system in “lost fares”?
With things in NYC beginning to return to normal after two years of COVID, I found myself thinking about how much money was lost in Subway fares in the 2+ years when people were working from home. Seeing an opportunity to mess around with some forecasting packages, I set out to determine “how much money in lost rides has COVID cost the MTA?”. F...
6974 sym R (5260 sym/15 pcs) 8 img 1 tbl
Ain’t Nothin But A G-Computation (and TMLE) Thang: Exploring Two More Causal Inference Methods
In my last post I looked at the causal effect of icing the kicker using weighting. Those results found that icing the kicker had a non-significant effect on the success of the field goal attempt with a point estimate of -2.82% (CI: -5.88%, 0.50%). In this post I will explore two other methodologies for causal inference with observational data, G-...
8942 sym R (3491 sym/12 pcs) 2 img 3 tbl
A Racing Barplot of Top US Baby Names 1880-2018
A few months back Mrs. JLaw and I were discussing baby names (purely for academic purposes) and it got me thinking about how popular names have changed over time. It was of particular interest to me as someone who had a name that was somewhat popular for a while and has since fallen out of fashion. This also provided me an opportunity to try o...
5080 sym R (4161 sym/6 pcs) 8 img 2 tbl
What’s the Difference Between Instagram and TikTok? Using Word Embeddings to Find Out
TL;DR: Instagram – TikTok = Photos, Photographers and Selfies; TikTok – Instagram = Witchcraft and Teens. But read the whole post to find out why! Purpose: The original intent of this post was to learn to train my own Word2Vec model. However, as is a running theme, my laptop is not great and training a neural network would never work. However...
10570 sym R (6332 sym/10 pcs) 6 img 1 tbl
COVID-19’s Impact on the NYC Subway System
At 8pm on March 22nd, 2020, the “New York State on PAUSE” executive order became effective and New York City went on lockdown until June 8th, when the Phase 1 reopening began. During this time, usage of the public transit systems dropped suddenly as all non-essential services needed to close. In this analysis, I look at MTA Subway Fare data t...
14308 sym R (9852 sym/15 pcs) 8 img