Publications by Jacob Martin
Borel Simulations
Simulations for the cards in Borel, using 100,000 simulations per card. Card 2: Roll all the dice. Will the sum of the results be at least 45? Card 18: Keep rolling the d30 until you roll an even number. Will the sum of all rolls be greater than 25? Card 24: Roll four d6. Will exactly half the rolls produce an even number? Card 30: Roll a d6 and t...
1923 sym 20 img
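As an illustration of what one such simulation might look like, here is a minimal base R sketch for Card 24 (roll four d6; are exactly half the rolls even?). It is not the author's code: the seed and the object names (`n_sims`, `rolls`, `n_even`, `p_sim`, `p_exact`) are assumptions made for this example. Only the 100,000 simulations per card comes from the description above.

```r
# Sketch: estimate the Card 24 probability by simulation (base R only).
set.seed(2024)          # arbitrary seed, used only for reproducibility
n_sims <- 100000        # 100,000 simulations per card, as in the description

# Each row is one simulated play of the card: four d6 rolls.
rolls <- matrix(sample(1:6, size = 4 * n_sims, replace = TRUE), ncol = 4)

# Count the even results in each play.
n_even <- rowSums(rolls %% 2 == 0)

# The card succeeds when exactly half (two of the four) rolls are even.
p_sim <- mean(n_even == 2)

# Exact value for comparison: choose(4, 2) / 2^4 = 0.375
p_exact <- dbinom(2, size = 4, prob = 0.5)

c(simulated = p_sim, exact = p_exact)
```

Because each d6 roll is even with probability 1/2, the number of even rolls is Binomial(4, 1/2), so the simulated proportion should land close to the exact value of 0.375.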
DS 1870 - Module 2: Diamond Practice Solutions
Setup knitr::opts_chunk$set(echo = T, fig.align = "center") # Load the tidyverse packages: library(tidyverse) The diamonds data # We'll use the diamonds data frame, stored in ggplot2. Take a look at it: data(diamonds) tibble(diamonds) ## # A tibble: 53,940 × 10 ## carat cut color clarity depth table price ...
1139 sym R (5180 sym/11 pcs) 7 img
STAT 5230 - k-means clustering - iris data
Exploratory analysis: Any cluster analysis should start with visualizing the data. If there are two variables, we can make a scatter plot. If there are three or more variables, we make one or more biplots using the relevant PCs. Since the iris data set has 4 numeric columns we’ll use to cluster the data, we’ll use PCA to see if t...
4938 sym Python (7695 sym/30 pcs) 14 img 1 tbl
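The workflow named in the iris entry above (a PCA biplot for a first look, then k-means) could be sketched in base R as follows. This is an assumption-laden illustration, not the author's analysis: standardizing the columns, using 3 clusters, the seed, and the object names (`iris_num`, `iris_pca`, `iris_km`) are all choices made for this example.

```r
# Sketch only: PCA for a visual check, then k-means on the same four columns.
set.seed(5230)                      # arbitrary seed; k-means uses random starts

iris_num <- iris[, 1:4]             # the four numeric measurements

# PCA on standardized columns (scaling is an assumption of this sketch).
iris_pca <- prcomp(iris_num, center = TRUE, scale. = TRUE)
summary(iris_pca)                   # proportion of variance explained per PC
biplot(iris_pca)                    # scores and loadings on PC1 vs PC2

# k-means with 3 clusters (assumed here) and 20 random starts.
iris_km <- kmeans(scale(iris_num), centers = 3, nstart = 20)

# Cross-tabulate the clusters against the known species labels.
table(cluster = iris_km$cluster, species = iris$Species)
```

The cross-tabulation is only a sanity check: k-means never sees the species labels, so the cluster numbers are arbitrary and need not line up with species order.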
DS 2870 - Module 4 - by argument to form groups
Using dplyr Set Up Your Project and Load Libraries knitr::opts_chunk$set(echo = F, fig.align = "center") ## Load the tidyverse package pacman::p_load(tidyverse) ## Change the default theme to theme_bw() theme_set(theme_bw()) ## Read in the "us counties.csv" data set and save it as counties counties <- read.csv(...
3796 sym 2 img
DS 2870 - Homework 2 Solutions - Spring 2024
knitr::opts_chunk$set(echo = T, warning = F, message = F, fig.align = "center") ## Load the required package: tidyverse library(tidyverse) ## Reading in the Dr Who data from github drwho <- read.csv("https://raw.githubusercontent.com/Shammalamala/DS-2870-Data-Sets/main/d...
2685 sym R (3313 sym/6 pcs) 5 img
DS 1870 - Module 2: Adding text to bar charts
Setting up the R Markdown File knitr::opts_chunk$set(echo = TRUE) # Start by loading the tidyverse, skimr, and ggfittext packages pacman::p_load(tidyverse, skimr, ggfittext) # Next, read in the Titanic data set titanic <- read.csv("https://raw.githubusercontent.com/Shammalamala/DS-1870-Data/main/titanic.csv") # Changing class to a factor and the or...
1829 sym 5 img
STAT 5230: Spring 2024 - Homework 1 - Batting Data
MLB Batting Data: The data set mlb batting.csv has batting results for the 314 MLB players who played in at least 50 games during the 2023 season. There are 10 variables in the data. The first is the player number (row_num), which you can ignore for now. Question 1: Summary Stats. Mean vector \(\bar{\mathbf{y}}\): mlb |> #removing row...
2986 sym Python (8337 sym/25 pcs) 7 img
DS 1870: Spring 2024 - Homework 1 Solutions
Question 1: Creating a data set. For question 1, you’ll be creating a data set two different ways: (1) creating individual vectors, then combining them together; (2) creating the data set without creating the vectors previously. Part 1a: Beer names. Create a vector called names that has the following five values: “Budlight”, “Fiddlehead”, “Bl...
3116 sym 3 tbl
DS 2870: Spring 2024 - Homework 1 Solutions
Question 1: Creating a data set. For question 1, you’ll be creating a data set two different ways: (1) creating individual vectors, then combining them together; (2) creating the data set without creating the vectors previously. Part 1a: Beer names. Create a vector called names that has the following five values: “Budlight”, “Fiddlehead”, “Bl...
3319 sym 2 img 3 tbl
STAT 5230: Lab 1 - Solutions
Make sure to place this R markdown file and the sparrow.xlsx file in the same folder! Then delete this line before you submit your results. Data Description: The sparrow.xlsx data set contains the following 5 measurements on 49 sparrows: length: The total length of the sparrow; wing_span: The distance between fully extended wing tips; beak: The le...
5153 sym Python (4714 sym/18 pcs) 7 img