Publications by Lucy Metz

Algorithms II Homework 2

19.06.2020

Problem 2 Carefully explain the differences between the KNN classifier and KNN regression methods. The KNN classifier assigns an observation to the class that is most common among its K nearest neighbors (the closest observations), while KNN regression predicts a numeric value by averaging the responses of those neighbors. Problem 9 This question involves the use of multiple linear regression on the Auto da...

5441 sym R (15443 sym/45 pcs) 7 img
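
The contrast drawn in the KNN answer above can be sketched in a few lines of base R. This is only an illustrative sketch, not the code from the homework: the toy data and the helper names `nearest`, `knn_classify`, and `knn_regress` are hypothetical and assume a single numeric predictor.

```r
# Toy data (made up for illustration): one predictor, a numeric
# response, and a class label.
x     <- c(1, 2, 3, 6, 7, 8)
y_num <- c(1.0, 1.5, 2.0, 6.0, 6.5, 7.0)
y_cls <- factor(c("a", "a", "a", "b", "b", "b"))

# Hypothetical helper: indices of the k training points closest to x0.
nearest <- function(x0, x, k) order(abs(x - x0))[seq_len(k)]

# KNN classification: majority vote among the k nearest neighbors.
knn_classify <- function(x0, x, y, k) {
  votes <- table(y[nearest(x0, x, k)])
  names(votes)[which.max(votes)]
}

# KNN regression: mean response of the k nearest neighbors.
knn_regress <- function(x0, x, y, k) mean(y[nearest(x0, x, k)])

pred_class <- knn_classify(2.5, x, y_cls, k = 3)
pred_value <- knn_regress(2.5, x, y_num, k = 3)
```

Both functions share the same neighbor search; only the final step differs (a vote versus a mean).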

Homework 4 - Algorithms 2

02.07.2020

Problem 3 We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. K-fold cross-validation divides the dataset into K groups, or folds, that have the same or close to the same number of observations. The first fold acts as our testing data and we train with the remaining groups. This process is repea...

7242 sym R (13210 sym/63 pcs)
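
The procedure described in the excerpt above might look roughly like the following in base R. It is a sketch under stated assumptions rather than the homework's own code: the data are simulated, the model is a simple linear regression, and it assumes the usual form of k-fold cross-validation in which each fold is held out once and the fold errors are averaged.

```r
set.seed(1)

# Simulated data (an assumption for illustration only).
n   <- 100
x   <- rnorm(n)
dat <- data.frame(x = x, y = 2 * x + rnorm(n))

k <- 5
# Randomly assign each observation to one of k folds of equal size.
fold <- sample(rep(seq_len(k), length.out = n))

# Hold out each fold in turn, fit on the rest, record the test MSE.
fold_mse <- sapply(seq_len(k), function(i) {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]
  fit   <- lm(y ~ x, data = train)
  mean((test$y - predict(fit, newdata = test))^2)
})

# The cross-validation estimate is the average of the k fold errors.
cv_mse <- mean(fold_mse)
```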

Algorithms II Homework 5

10.07.2020

Problem 2 - For parts (a) through (c), indicate which of i. through iv. is correct. Justify your answer. (a) The lasso, relative to least squares, is: iii. Less flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance. - Lasso reduces the number of variables which reduces the varia...

5520 sym R (11347 sym/84 pcs) 4 img

Homework 3 - Classification

25.06.2020

Problem 10 This question should be answered using the [Weekly](https://rdrr.io/cran/ISLR/man/Weekly.html) data set, which is part of the ISLR package. This data is similar in nature to the [Smarket](https://rdrr.io/cran/ISLR/man/Smarket.html) data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the be...

6702 sym R (35644 sym/252 pcs) 6 img

Algorithms II Homework 6

24.07.2020

Problem 6 In this exercise, you will further analyze the Wage data set considered throughout this chapter.
library(ISLR)
## Warning: package 'ISLR' was built under R version 3.6.3
summary(Wage)
##  year  age  maritl  race
##  Min. :2003  Min. :18.00  1. Never Married: 648  1. White:2480
##...

2709 sym R (18422 sym/63 pcs) 6 img

Algorithms II Homework 7

30.07.2020

Problem 3 Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of $\hat{p}_{m1}$. The x-axis should display $\hat{p}_{m1}$, ranging from 0 to 1, and the y-axis should display the value of the Gini index, classification error, and ent...

6171 sym R (15144 sym/78 pcs) 7 img
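
One way the plot requested in Problem 3 above could be produced in base R is sketched below. It is not taken from the homework itself: it assumes the standard two-class forms of the three measures, with the proportion of class 1 written as `p`, and uses the natural logarithm for entropy. The variable names and colors are illustrative choices.

```r
# Proportion of class 1 in the node, on a grid from 0 to 1.
p <- seq(0, 1, by = 0.01)

# Two-class impurity measures (class 2 has proportion 1 - p).
gini      <- 2 * p * (1 - p)
class_err <- 1 - pmax(p, 1 - p)
entropy   <- -(p * log(p) + (1 - p) * log(1 - p))
# 0 * log(0) evaluates to NaN in R; by convention the entropy is 0 there.
entropy[is.nan(entropy)] <- 0

# Draw all three curves on a single plot.
matplot(p, cbind(gini, class_err, entropy), type = "l", lty = 1,
        col = c("red", "blue", "darkgreen"),
        xlab = "Proportion of class 1", ylab = "Impurity")
legend("topright", legend = c("Gini index", "Classification error", "Entropy"),
       col = c("red", "blue", "darkgreen"), lty = 1)
```

All three measures are zero for a pure node (`p` equal to 0 or 1) and peak at `p = 0.5`.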

Algorithms II Homework 8

01.08.2020

Problem 5 We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. (a) Generate a data set with n = 500 and p = 2, s...

6492 sym R (19148 sym/113 pcs) 19 img