Publications by Jacob Martin

DS 2870: Module 4 Homework Key - Fall 2024

07.10.2024

Data Description The movies data set has 44010 rows about the amount of explicit content (drugs, language, sex, nudity, and violence) found in 1467 movies released since 1958. Each movie is represented by 30 rows (1 row = movie & tag_name type combo). The relevant variables in the data set are: imdb_id: The identifier used by IMDB to uniquely ...

4345 sym Python (7454 sym/11 pcs) 1 img

DS 2870: Module 5 - Adding text to a scatter plot

04.10.2024

Set Up Your Project and Load Libraries knitr::opts_chunk$set(echo = F, fig.align = "center", warning = F, message = F, fig.height = 6, fig.width = 8) ## Set the default size of figures # knitr::opts_chunk$set(fig.width=8, fig.heigh...

3157 sym 12 img 4 tbl

DS 2870 - Homework 3 - Fall 2024 - Key

01.10.2024

Data Description The sp500 data set has the 502 companies in the Standard & Poor's 500 (S&P 500), the 500 (502) largest publicly traded companies in the US. The data set has 11 variables, with the important ones being: symbol: The 3 to 4 letter symbol used to ID the company on the stock market company: The name of the company sector: The s...

2301 sym Python (4431 sym/6 pcs) 6 img

DS 2870 - Homework 2 - Fall 2024 - key

23.09.2024

Question 1: Box Plot for Olympian Ages The data set olympics.csv (found at https://raw.githubusercontent.com/Shammalamala/DS-2870-Data-Sets/main/olympics.csv) has data on about 6000 Olympic athletes that competed in the 2024 Olympic Games in one of 10 sports: Athletics, Swimming, Rowing, Judo, Shooting, Sailing, Volleyball, Equestrian, Fencing, Box...

2327 sym Python (3493 sym/5 pcs) 5 img

DS 2870: Homework 9 - Fall 2024 - Key

10.09.2024

knitr::opts_chunk$set(echo = F, warning = F, message = F) # load packages pacman::p_load(tidyverse, class, skimr, caret, rpart, rpart.plot) # Setting the seed for the markdown RNGversion("4.1.0"); set.seed(2870) # Changing the default theme theme_set(theme_bw()) Data Description For the code ch...

4683 sym 3 img

DS 2870: Module 3 - Line graphs of Karen and Terry

27.08.2024

Line graphs A line graph is a type of graph that uses a line to play “connect the dots” with the data points represented. It’s not required, but the x-axis on most line graphs represents time in some way. We’ll start with an example for a single name picked at random (like “Jacob”) # We'll create a line graph for the popularity of the ...

4460 sym Python (3711 sym/11 pcs) 8 img

DS 2870: Module 10 Homework - Summer 2024

28.06.2024

Data Description: The used cars.csv file has information about 1000 randomly sampled used sedans (4 door cars) in 2021. The variables are: manufactor: The company that makes the car model: The model of the car price: The sale price of the used car (our response variable) year: The year the car was manufactured age: The age of the car when i...

3995 sym Python (5464 sym/13 pcs) 3 img

DS 2870: Module 8 Homework - Summer 2024 - Key

19.06.2024

Set up knitr::opts_chunk$set(echo = TRUE, fig.align = "center") # load packages pacman::p_load(tidyverse, class, skimr, caret, rpart, rpart.plot) # Changing the default theme theme_set(theme_bw()) Question 1) Spam Email The data set “Spam_Email.csv” contains columns that measure how frequently certain characters (;...

3573 sym Python (11144 sym/28 pcs) 3 img 3 tbl

DS 2870: Module 7 - Additional Practice - Savings Calculator

17.06.2024

Question 1: Balance calculator Alex plans to save $500 in a savings account that pays 4% yearly interest, compounded monthly. This means that every month, the account earns interest equal to 0.04/12 of its current value. I.e., after the first month, they’ll have: \[500 + 0*(0.04/12) = 500.00\] After two months, Alex will have the previous month...

2359 sym 4 img

DS 2870: Module 5 Homework - Summer 2024 - Key

07.06.2024

For this assignment, we’ll be working with data from the Recording Industry Association of America (RIAA) in an attempt to recreate an image similar to this one: The columns other than year measure the amount of sales for that music format in millions of dollars. Question 1) Manipulate the data Change the data to be in the proper for...

1888 sym Python (4013 sym/6 pcs) 5 img