Publications by Harris Wohl

Tooth Length Analysis

04.12.2020

Brief Overview of the Data The description of the ToothGrowth dataset in R Documentation is as follows: The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (...

2484 sym R (2078 sym/13 pcs) 2 img

Homework 6

01.11.2020

7.4) The file agehw.dat contains data on the ages of 100 married couples sampled from the U.S. population. ### a) Formulate a semiconjugate prior distribution for the mean husband and wife ages \(\theta\) = \((\theta_h, \theta_w)^T\) and covariance matrix \(\Sigma\). agehw <- read.table("./agehw.dat", header = T) ybar <- apply(agehw, 2, mean) y...

3905 sym R (7091 sym/39 pcs) 13 img

US Mask Data Analysis

13.12.2020

Introduction Wherever I look, I can’t seem to escape the narrative of a widespread “anti-mask” movement. This informal analysis sets out to shed some light on the following questions: 1) Is the open refusal to wear masks actually widespread, or is this being overblown? 2) If this variation in mask adoption does exist, is it correlated wit...

7919 sym R (28000 sym/43 pcs) 5 img

HW 1

02.02.2021

library(ISLR) library(MASS) Conceptual Exercises 1) Which method would be better: flexible or inflexible? a) A flexible method would work better in this case since the sample size is so large. Additionally, since the number of predictors p is small, the model might still be interpretable. b) For the opposite case, an inflexible method might ...

6935 sym R (9375 sym/48 pcs) 7 img

HW 3

17.02.2021

2) 4) a) a) Since the predictor is uniformly distributed, 10 percent of available observations will be used to make each prediction. b) In the case where p = 2, 1/10 of the X1 observations will be used, and 1/10 of the X2 observations will be used to make each prediction. If we think of this criterion visually as a “box”, the area of such ...
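The shrinking "box" described in this excerpt can be sketched numerically. This is a minimal illustration, assuming independent predictors that are uniform on [0, 1] and ignoring boundary effects; the function name `fraction_used` and its `width` argument are hypothetical, not taken from the original homework.

```r
# Fraction of observations that fall within a window covering `width`
# of the range on each of p independent, uniformly distributed predictors.
# Assumption: boundary effects near 0 and 1 are ignored.
fraction_used <- function(p, width = 0.1) {
  width^p
}

# One predictor uses a tenth of the data; two predictors use a tenth of a tenth.
fractions <- sapply(c(1, 2, 100), fraction_used)
print(fractions)
```

Under these assumptions the usable fraction falls off geometrically in p, which is why nearest-neighbor style predictions have very few "near" observations when p is large.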

6401 sym R (6175 sym/49 pcs) 1 img

HW 2

12.02.2021

library(MASS) library(ISLR) library(car) Conceptual Exercises 3) a) iii If we fix IQ and GPA, the model for males is the following: salary = 50 + 20 * gpa + .07 * iq + .01 * (gpa * iq) and the model for females is the following: salary = 85 + 10 * gpa + .07 * iq + .01 * (gpa * iq) So, females have an intercept of 85 and males have an inte...
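The two equations quoted in this excerpt share the IQ and interaction terms, so the female-minus-male difference reduces to 35 - 10 * gpa. A small sketch of that comparison follows; the function names and the default `iq = 100` are illustrative assumptions, not part of the original homework.

```r
# Fitted salary equations as quoted in the excerpt.
salary_male <- function(gpa, iq) {
  50 + 20 * gpa + 0.07 * iq + 0.01 * (gpa * iq)
}
salary_female <- function(gpa, iq) {
  85 + 10 * gpa + 0.07 * iq + 0.01 * (gpa * iq)
}

# Female-minus-male gap at a fixed IQ and GPA; algebraically 35 - 10 * gpa.
# The default iq is an arbitrary illustrative value; the gap does not depend on it.
salary_gap <- function(gpa, iq = 100) {
  salary_female(gpa, iq) - salary_male(gpa, iq)
}

print(salary_gap(c(3.0, 3.5, 4.0)))
```

The gap is positive below a GPA of 3.5 and negative above it, so which group has the higher predicted salary depends on the GPA being held fixed.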

8978 sym R (29018 sym/95 pcs) 19 img

HW 7

08.04.2021

Conceptual Exercises 2 and 4) These were handwritten and turned in separately. 5) For the majority vote method, we would choose red since 6/10 estimates were greater than .5. mean(c(0.1, 0.15, 0.2, 0.2, 0.55, 0.6, 0.6, 0.65, 0.7, 0.75)) ## [1] 0.45 For the average probability approach, we would choose green since the average of the bootstrapped est...
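The two aggregation rules in this excerpt can be compared side by side. The sketch below assumes the ten values are bootstrapped estimates of the probability of the red class and uses a 0.5 cutoff; the variable names are illustrative. Note that `mean()` averages its first argument only, so the estimates need to be wrapped in `c()` to be averaged together.

```r
# Ten bootstrapped estimates, as listed in the excerpt.
# Assumption: each value estimates the probability of the red class.
p_red <- c(0.1, 0.15, 0.2, 0.2, 0.55, 0.6, 0.6, 0.65, 0.7, 0.75)

# Majority vote: each estimate votes "red" when it exceeds 0.5.
votes_red <- sum(p_red > 0.5)
majority_class <- if (votes_red > length(p_red) / 2) "red" else "green"

# Average probability: classify once, using the mean of all estimates.
avg_p <- mean(p_red)
average_class <- if (avg_p > 0.5) "red" else "green"

print(c(majority = majority_class, average = average_class))
```

The two rules disagree here because six of the ten estimates sit just above 0.5 while the remaining four sit far below it, which pulls the average under the cutoff.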

4178 sym R (10578 sym/62 pcs) 13 img

HW 5

15.03.2021

Conceptual Exercises 2) a) Lasso is less flexible relative to least squares since the least squares model contains a coefficient estimate for every predictor input into the model, whereas lasso only contains coefficient estimates for a subset of the predictors, and shrinks the coefficients based on lambda. Since it is less flexible, it will ha...

5510 sym R (14292 sym/62 pcs) 8 img

HW 4

03.03.2021

2) Chapter 5 Conceptual Exercise 4 Use bootstrapping: take n samples of paired \((X_i, Y_i)\) from the original dataset (with replacement), fit the statistical learning model on the new dataset, and predict Y using the value of X given in the problem. Repeat this process a large number of times, which should give a distribution of predictions for...
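The bootstrap procedure described in this excerpt might be sketched as follows. The simulated data, the query point `x0`, the number of resamples `B`, and the use of `lm()` as the statistical learning model are all assumptions made for illustration; the original exercise does not specify them.

```r
set.seed(1)

# Hypothetical data: a noisy linear relationship between x and y.
n <- 100
dat <- data.frame(x = rnorm(n))
dat$y <- 2 + 3 * dat$x + rnorm(n)

# Hypothetical value of X at which the prediction is wanted.
x0 <- data.frame(x = 0.5)

# Resample rows with replacement, refit, and predict at x0, B times.
B <- 1000
boot_preds <- replicate(B, {
  idx <- sample(n, size = n, replace = TRUE)
  fit <- lm(y ~ x, data = dat[idx, ])
  unname(predict(fit, newdata = x0))
})

# The spread of the bootstrap predictions estimates the
# standard deviation of the prediction at x0.
boot_sd <- sd(boot_preds)
print(boot_sd)
```

Any other model could stand in for `lm()`, as long as it is refit on each resampled dataset before predicting.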

3143 sym R (4655 sym/34 pcs) 9 img

Final Project Technical Report

05.05.2021

Introduction The goal of this analysis is to build a model that predicts house prices, and identifies the most important factors in house value. To me, this meant that the real estate company is mainly interested in a good prediction model, and would value predictive power over interpretability and inference. For this reason, my final prediction ...

7406 sym