Publications by Madison Dorsey

Assignment #4

17.03.2020

Problem 3 We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. k-fold cross-validation is implemented by taking the set of n observations and randomly splitting it into k non-overlapping groups. Each group in turn acts as the validation set, with the remaining groups forming the training set. The test error is estimated b...

5941 sym R (6826 sym/44 pcs)
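
The k-fold procedure described in the Assignment #4 excerpt can be sketched roughly as follows. This is a minimal illustration in plain Python, not the author's R solution. The helper names (`k_fold_indices`, `cross_val_mse`, `fit_mean`, `predict_mean`) are hypothetical, and the final step, averaging the per-fold mean squared errors, is the standard textbook formulation, assumed here because the excerpt is cut off before it states the estimate.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k non-overlapping folds (assumes k <= n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Striding over the shuffled list gives folds whose sizes differ by at most one.
    return [idx[i::k] for i in range(k)]

def cross_val_mse(x, y, k, fit, predict, seed=0):
    """Average validation MSE over k folds (assumed standard estimate)."""
    folds = k_fold_indices(len(x), k, seed)
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        # Every observation outside the held-out fold is used for training.
        train = [i for i in range(len(x)) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        errors = [(y[i] - predict(model, x[i])) ** 2 for i in held_out]
        fold_mse.append(sum(errors) / len(errors))
    return sum(fold_mse) / k

# Hypothetical toy model: always predict the mean of the training responses.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x0):
    return model

if __name__ == "__main__":
    x = list(range(20))
    y = [2.0 * v + 1.0 for v in x]
    print(cross_val_mse(x, y, k=5, fit=fit_mean, predict=predict_mean))
```

Passing `fit` and `predict` as arguments keeps the splitting logic independent of the model, so the same loop could wrap any fitting routine.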

Document

07.03.2020

Problem 10 This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010. (a) Produce some numerical and graphical summaries of the...

4621 sym R (15345 sym/91 pcs) 2 img

STA 4143-001 Assignment #2

20.02.2020

Question 2 Carefully explain the differences between the KNN classifier and KNN regression methods. They are quite similar. Given a value for K and a prediction point \(x_{0}\), KNN regression first identifies the K training observations that are closest to \(x_{0}\), represented by \(N_{i}\). It then estimates \(f(x_{0})\) using the average of al...

6914 sym R (10349 sym/39 pcs) 4 img
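
The contrast raised in the Assignment #2 excerpt can be illustrated with a small sketch. It assumes one-dimensional inputs and absolute distance, and the function names are hypothetical. Both methods share the same neighbor search. Regression averages the neighbors' responses, while the classifier (in its usual formulation, which the truncated excerpt does not reach) returns the most common class among the neighbors.

```python
from collections import Counter

def nearest_neighbors(train_x, x0, k):
    """Indices of the k training points closest to x0 (1-D, absolute distance)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x0))
    return order[:k]

def knn_regress(train_x, train_y, x0, k):
    """KNN regression: average the responses of the k nearest neighbors."""
    nbrs = nearest_neighbors(train_x, x0, k)
    return sum(train_y[i] for i in nbrs) / k

def knn_classify(train_x, train_labels, x0, k):
    """KNN classification: most common label among the k nearest neighbors."""
    nbrs = nearest_neighbors(train_x, x0, k)
    return Counter(train_labels[i] for i in nbrs).most_common(1)[0][0]

if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 10.0]
    print(knn_regress(xs, [1.0, 2.0, 3.0, 10.0], x0=2.1, k=3))
    print(knn_classify(xs, ["a", "a", "b", "b"], x0=2.1, k=3))
```

The only difference between the two functions is the last line: a numeric average for a quantitative response versus a vote for a qualitative one.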

Assignment #5

22.04.2020

Problem 6 In this exercise, you will further analyze the Wage data set considered throughout this chapter. (a) Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a p...

2452 sym R (6710 sym/22 pcs) 6 img

Assignment #6

05.05.2020

Problem 3 Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of \(\hat{p}_{m1}\). The x-axis should display \(\hat{p}_{m1}\), ranging from 0 to 1, and the y-axis should display the value of the Gini index, classif...

4674 sym R (5418 sym/39 pcs) 6 img
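
For the two-class setting in the Assignment #6 excerpt, the three quantities reduce to simple functions of \(\hat{p}_{m1}\). The sketch below computes them at a single value of p; it is illustrative only and does not reproduce the author's plot. The function name `impurity` is hypothetical, and the natural logarithm is an assumed choice for the entropy, since the excerpt does not specify a base.

```python
import math

def impurity(p):
    """Return (gini, classification_error, entropy) for a two-class node.

    p is the proportion of class 1, so the class-2 proportion is 1 - p.
    """
    gini = 2 * p * (1 - p)                 # p(1-p) + (1-p)p
    class_error = 1 - max(p, 1 - p)        # one minus the majority proportion
    # Terms with zero proportion contribute nothing (limit of q*log(q) as q -> 0).
    entropy = -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)
    return gini, class_error, entropy

if __name__ == "__main__":
    for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
        print(p, impurity(p))
```

Evaluating `impurity` over a grid of values between 0 and 1 would give the three curves the problem asks for; all three are zero for a pure node and largest at p = 0.5.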

Assignment #7

12.05.2020

Problem 5 We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. (a) Generate a data set with n = 500 and p = 2, s...

7316 sym R (1923 sym/11 pcs) 3 img