Publications by Jacob Martin

STAT 5230: Lab 2 Solutions

11.02.2025

The data for Practice 2 has national track records for 55 countries in 8 different track races. The meter100, meter200, and meter400 are recorded in seconds. The meter800, meter1500, meter5000, meter10000, and Marathon are measured in minutes. Question 1: PCA with the Covariance Matrix The first set of questions will be performing PCA using th...

4358 sym 4 img

DS 1870: Module 2 Practice - Diamonds Graphs

10.02.2025

Setup knitr::opts_chunk$set(echo = F, fig.align = "center") # Load the tidyverse packages: library(tidyverse) The diamonds data ## # A tibble: 53,940 × 10 ## carat cut color clarity depth table price x y z ## <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> ## 1 0.23 Ideal ...

1130 sym R (1640 sym/3 pcs) 7 img

What are the ten most important words on each of Taylor Swift's albums?

09.01.2025

knitr::opts_chunk$set(echo = TRUE, warning = F, message = F, fig.align = "center") # Loading needed packages pacman::p_load(tidyverse, tidytext) # Reading in the taylor swift data set and removing rows with no lyrics swift <- taylor::taylor_album_songs |> # Renami...

2404 sym Python (7179 sym/10 pcs) 2 img 1 tbl

NFL: How Many Times Each Team Has Scored or Allowed More Than Thirty Points

06.01.2025

Getting and cleaning the data With the news that the Patrick Mahomes-led, 15-win Kansas City Chiefs never scored more than thirty points this season, let’s look at how often each team has scored more than 30 points in a game by season since 1999. We’ll be using the nflfastR package to get the play-by-play results for each game. ## # A tibble:...

2503 sym 4 img 3 tbl

NFL: Scoring and Allowing 30, 40, and 50 plus points

06.01.2025


17 sym 2 img

Does Zipf's Law Apply to Taylor Swift's Lyrics?

04.01.2025

Tokenizing the data tidy_taylor <- swift |> # tokenizing the lyrics of each song unnest_tokens(output = word, input = lyrics) tibble(tidy_taylor) ## # A tibble: 56,632 × 4 ## index album song_name word ## <int> <chr> <chr> <chr> ## 1 0 Taylor Swift Mary's Song (Oh My My ...

1192 sym Python (5031 sym/11 pcs) 2 img

Clustering US Counties with Education and Economic Features

02.01.2025

Loading the data and initial cleaning Cleaning the education data education <- read_xlsx("Education.xlsx", skip = 3) |> # Making the names R friendly janitor::clean_names() |> # Connecticut has missing values for 2022, we'll use the next newest year mutate( # # rucc # x2023_rurual_urban_continuum_code = if_else( # ...

17913 sym Python (26420 sym/44 pcs) 22 img 10 tbl

DS 2870: Homework 8 - Fall 2024 - key

05.12.2024

Data Description: The used cars.csv file has information about 1000 randomly sampled used sedans (4 door cars) in 2021. The variables are: manufactor: The company that makes the car model: The model of the car price: The sale price of the used car (our response variable) year: The year the car was manufactured age: The age of the car when i...

3995 sym Python (5489 sym/13 pcs) 3 img

Does Receiving the Second Half Kickoff Have an Advantage in the NFL?

22.11.2024

Does the second half kickoff have an impact on who wins an NFL game? We’ll be looking at the probability that the team receiving the kickoff after halftime wins an NFL game. We’ll be accounting for who is winning at the half. The data are the results of the last 10 NFL seasons, collected from the nflfastR package and the load_pbp() function. pbp <- ...

7586 sym Python (12378 sym/15 pcs) 6 img 4 tbl

Is it easier to kick field goals in indoor stadiums in the NFL?

21.11.2024

Introduction In the NFL, teams can score points in three ways: Safety: two points, occurs on less than 1% of possessions; Field Goal: three points, occurs on about 40% of teams’ offensive possessions; Touchdown: six points, occurs on about 20% of teams’ offensive possessions. A field goal attempt occurs when a team attempts to kick the ball through go...

13180 sym R (26504 sym/27 pcs) 8 img 7 tbl