Publications by Justin Williams
Classification Metrics
Objective We will explore a slew of classification metrics by writing custom functions, then comparing the results to various R packages. Load Data First, let’s load the data we’ll use for this exercise: # load df <- read.csv("./data/classification-output-data.csv") %>% clean_names() # preview head(df) ## pregnant glucose diastolic skin...
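As a rough illustration of what "custom functions" for classification metrics can look like, here is a minimal sketch in base R. The toy vectors `actual` and `predicted` and all function names are assumptions made for this example; they are not taken from the notebook or its data file.

```r
# Toy labels standing in for actual and predicted classes (hypothetical data)
actual    <- c(1, 0, 1, 1, 0, 0, 1, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0)

# Confusion-matrix counts, treating 1 as the positive class
confusion_counts <- function(actual, predicted) {
  list(
    tp = sum(actual == 1 & predicted == 1),
    tn = sum(actual == 0 & predicted == 0),
    fp = sum(actual == 0 & predicted == 1),
    fn = sum(actual == 1 & predicted == 0)
  )
}

# Share of all observations classified correctly
accuracy <- function(actual, predicted) {
  cc <- confusion_counts(actual, predicted)
  (cc$tp + cc$tn) / (cc$tp + cc$tn + cc$fp + cc$fn)
}

# Share of predicted positives that are truly positive
precision <- function(actual, predicted) {
  cc <- confusion_counts(actual, predicted)
  cc$tp / (cc$tp + cc$fp)
}

# Share of true positives that were found (sensitivity)
recall <- function(actual, predicted) {
  cc <- confusion_counts(actual, predicted)
  cc$tp / (cc$tp + cc$fn)
}

# Harmonic mean of precision and recall
f1_score <- function(actual, predicted) {
  p <- precision(actual, predicted)
  r <- recall(actual, predicted)
  2 * p * r / (p + r)
}

accuracy(actual, predicted)
f1_score(actual, predicted)
```

Results from functions like these can then be compared against a package implementation on the same vectors.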
Moneyball EDA
Objective The goal of this notebook is to import the data and explore it, and more specifically to describe the size and the variables of the moneyball training data set. Data Let’s read in the training data and start the EDA process. # read in data df <- read_csv("./data/moneyball-training-data.csv") %>% clean_names() # preview head(df) ## # A tibble: 6 ×...
Final Exam Pt1
Problem 1 Gamma PDF Probability Density 1: X~Gamma. Using R, generate a random variable X that has 10,000 random Gamma pdf values. A Gamma pdf is completely described by \(n\) (a size parameter) and lambda (λ, a shape parameter). Choose any \(n\) greater than 3 and an expected value (λ) between 2 and 10 (you choose). Helpful Info # Set the seed for ...
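One possible way to draw such a sample in base R is sketched below. The problem statement leaves the parameterization open, so this sketch assumes \(n\) is passed as the `shape` argument and λ as the `rate` argument of `rgamma()`; the values `n = 5` and `lambda = 4` and the seed are arbitrary choices for illustration, not the ones used in the exam.

```r
# Assumed parameter choices (hypothetical): n > 3 and lambda between 2 and 10
n      <- 5
lambda <- 4

# Fix the seed so the draw is reproducible
set.seed(123)

# 10,000 Gamma draws; under this shape/rate reading the theoretical mean is n / lambda
x <- rgamma(10000, shape = n, rate = lambda)

# Compare the sample mean and variance with the theoretical values
c(sample_mean = mean(x), theoretical_mean = n / lambda)
c(sample_var  = var(x),  theoretical_var  = n / lambda^2)
```

If λ is instead meant as a scale parameter, the call would use `scale = lambda` and the theoretical mean becomes `n * lambda`.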
Taylor Series Approximation
This week, we’ll work out some Taylor Series expansions of popular functions: \(f(x)=\frac{1}{(1-x)}\), \(f(x)=e^x\), \(f(x)=\ln (1+x)\), and \(f(x)=x^{1/2}\). For each function, only consider its valid ranges as indicated in the notes when you are computing the Taylor Series expansion. Function 1 \(f(x)=\frac{1}{(1-x)}\) \[ \begin{aligned} & f^{\pri...
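For reference, the standard Maclaurin expansions (about \(x=0\)) of the first three functions, with the ranges on which they converge, are:

\[ \begin{aligned} \frac{1}{1-x} &= \sum_{n=0}^{\infty} x^n, & |x|<1 \\ e^x &= \sum_{n=0}^{\infty}\frac{x^n}{n!}, & x\in\mathbb{R} \\ \ln(1+x) &= \sum_{n=1}^{\infty}(-1)^{n+1}\frac{x^n}{n}, & -1<x\le 1 \end{aligned} \]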
Univariate & Multivariate Calculus
Question 1 Use integration by substitution to solve the integral below: \[\int4e^{-7x}dx\] Apply linearity: \[=4\int e^{-7x}\;dx\] Substitute \(u=-7x\), so that \(du=-7\;dx\) and \(dx=-\frac{1}{7}du\): \[\int e^{-7x}\;dx=-\frac{1}{7}\int e^{u}\;du=-\frac{1}{7}e^{u}=-\frac{1}{7}e^{-7x}\] Plug in the solved integral: \[4\int e^{-7x}\;dx=-\frac{4}{7}e^{-7x}+C\] Question 2 Biologists are treating a pond c...
Multiple Regression 2
library(tidyverse) library(janitor) Load data # load data df <- read_csv("/Users/justinwilliams/Documents/CUNY SPS/605/assignments/data/who.csv") %>% clean_names() ## Rows: 190 Columns: 10 ## ── Column specification ──────────────────────────────────────────...
Multiple Linear Regression
library(tidyverse) library(janitor) Question Using R, build a multiple regression model for data that interests you. Include in this model at least one quadratic term, one dichotomous term, and one dichotomous vs. quantitative interaction term. Interpret all coefficients. Conduct residual analysis. Was the linear model appropriate? Why or why not...
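A minimal sketch of a model formula that meets these three requirements is shown below. It uses the built-in `mtcars` data set purely as a stand-in; the choice of data set and variables is an assumption for illustration, not the data analyzed in the notebook.

```r
# Built-in data used only as a stand-in (assumption, not the notebook's data)
data(mtcars)

# mpg modeled with:
#   wt       quantitative term
#   I(wt^2)  quadratic term
#   am       dichotomous term (0 = automatic, 1 = manual)
#   wt:am    dichotomous-by-quantitative interaction
fit <- lm(mpg ~ wt + I(wt^2) + am + wt:am, data = mtcars)

# Coefficient estimates, standard errors, and fit statistics
summary(fit)

# Residuals for the residual analysis step
res <- resid(fit)
summary(res)
```

The `wt:am` coefficient measures how the slope on `wt` differs between the two `am` groups, which is the quantity the interaction term is meant to capture.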
Simple Linear Regression (SLR)
Using the “cars” dataset in R, build a linear model for stopping distance as a function of speed and replicate the analysis of your textbook chapter 3 (visualization, quality evaluation of the model, and residual analysis.) # get dataset data(cars) # plot relationship plot(cars[,"speed"],cars[,"dist"], main="Stopping Distance vs. Speed", xlab=...
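The model-fitting step described here can be sketched as follows, using the built-in `cars` data. The object names `fit` and `res` are chosen for this example; the residual checks shown are one reasonable selection, not necessarily the ones used in the notebook.

```r
# Built-in data set: speed (mph) and stopping distance (ft)
data(cars)

# Stopping distance as a linear function of speed
fit <- lm(dist ~ speed, data = cars)

# Coefficients, standard errors, R-squared, and F-statistic
summary(fit)

# Residuals: with an intercept in the model they average to zero
res <- resid(fit)
mean(res)

# Numeric summary of the residual distribution
quantile(res)
```

Plotting `fitted(fit)` against `res` and drawing a normal Q-Q plot of `res` are the usual visual follow-ups for the residual analysis.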
Markov Chains/Random Walks
Question 1 Smith is in jail and has 1 dollar; he can get out on bail if he has 8 dollars. A guard agrees to make a series of bets with him. If Smith bets A dollars, he wins A dollars with probability .4 and loses A dollars with probability .6. Find the probability that he wins 8 dollars before losing all of his money if (a) he bets 1 dollar each ti...
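For the fixed one-dollar bet in part (a), the classic gambler's ruin formula applies: starting with \(i\) dollars and aiming for \(N\), the probability of reaching \(N\) before 0 is \(\frac{1-(q/p)^i}{1-(q/p)^N}\) when \(p \neq q\). The sketch below evaluates that formula; the function name `ruin_win_prob` is hypothetical and introduced only for this example.

```r
# Probability of reaching `target` before 0 when betting 1 unit per round,
# starting from `start`, with win probability `p` on each bet
ruin_win_prob <- function(start, target, p) {
  q <- 1 - p
  # Fair game: the formula reduces to start / target
  if (isTRUE(all.equal(p, 0.5))) {
    return(start / target)
  }
  r <- q / p
  (1 - r^start) / (1 - r^target)
}

# Smith starts with 1 dollar, needs 8, and wins each bet with probability 0.4
ruin_win_prob(start = 1, target = 8, p = 0.4)
```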
Central Limit Theorem and MGF
Question 1 The price of one share of stock in the Pilsdorff Beer Company (see Exercise 8.2.12) is given by \(Y_n\) on the \(n\)th day of the year. Finn observes that the differences \(X_n = Y_{n+1} - Y_n\) appear to be independent random variables with a common distribution having mean \(\mu = 0\) and variance \(\sigma^2 = \frac{1}{4}\). If...