Publications by Kelly Parks KZE480
STA 6543 Homework 2
Question 1. Carefully explain the differences between the KNN classifier and KNN regression methods. The primary difference is the type of output. The KNN classifier assigns a label to a data point based on the labels of the points in closest proximity to it. The other (KNN regression) is a method of de...
5895 sym R (27353 sym/72 pcs) 10 img
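The contrast drawn in the HW 2 excerpt above can be illustrated with a minimal sketch. It is not the author's code (the homework itself is in R); the function names, the one-dimensional toy data, and the use of absolute distance are all assumptions made for illustration. Both methods find the same k nearest neighbours and differ only in how they combine them: a majority vote over labels versus an average of numeric responses.

```python
from collections import Counter


def k_nearest(train_x, query, k):
    """Return the indices of the k training points closest to `query` (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return order[:k]


def knn_classify(train_x, train_labels, query, k):
    """KNN classifier: majority vote over the labels of the k nearest points."""
    idx = k_nearest(train_x, query, k)
    return Counter(train_labels[i] for i in idx).most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k):
    """KNN regression: average of the numeric responses of the k nearest points."""
    idx = k_nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / k


# Hypothetical toy data, invented for this illustration.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["low", "low", "low", "high", "high", "high"]
ys = [1.5, 2.5, 3.5, 10.5, 11.5, 12.5]

predicted_label = knn_classify(xs, labels, 2.2, k=3)
predicted_value = knn_regress(xs, ys, 2.2, k=3)
```

Here `predicted_label` is a category and `predicted_value` is a number, even though both come from the same three neighbours.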
HW3 Alg II STA 6543
Assignment #3 Chapter 4 - Questions 10, 11, 13. This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010. Produce some numerical...
7603 sym R (47889 sym/187 pcs) 12 img
HW 4 STA 6543 Kelly Parks KZE480
Q3. We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. This approach involves randomly dividing the set of observations into k groups, or folds, of approximately equal size. The first fold is treated as a validation set, and the method is fit on the remaining k − 1 folds. The mean squared er...
7463 sym R (7291 sym/53 pcs)
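The k-fold procedure outlined in the HW 4 excerpt above can be sketched as follows. This is an illustration under stated assumptions, not the author's solution: the helper names, the fixed seed, and the choice of simple linear regression as "the method" are all hypothetical. The sketch follows the standard definition of k-fold cross-validation, in which each fold serves once as the validation set and the k validation errors are averaged.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly split the indices 0..n-1 into k folds of approximately equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b1 * mean_x, b1


def k_fold_cv_mse(xs, ys, k, seed=0):
    """Average validation MSE over k folds, each held out exactly once."""
    fold_mses = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the remaining k - 1 folds.
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Evaluate on the held-out fold.
        mse = sum((ys[i] - (b0 + b1 * xs[i])) ** 2 for i in held_out) / len(held_out)
        fold_mses.append(mse)
    return sum(fold_mses) / k


# Hypothetical noise-free data: y = 2x + 1, so the CV error should be essentially zero.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
cv_error = k_fold_cv_mse(xs, ys, k=5)
```

Striding the shuffled index list (`idx[i::k]`) is one simple way to get folds whose sizes differ by at most one; other splitting schemes work equally well.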
HW 5 STA 6543 Kelly Parks KZE480
Q2. For parts (a) through (c), indicate which of i. through iv. is correct. Justify your answer. (a) The lasso, relative to least squares, is: i. More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance. ii. More flexible and hence will give improved prediction accuracy when its increas...
4316 sym R (6419 sym/38 pcs) 6 img
HW 7 STA 6543 Kelly Parks KZE480
3. Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of $\hat{p}_{m1}$. The x-axis should display $\hat{p}_{m1}$, ranging from 0 to 1, and the y-axis should display the value of the Gini index, classification error, and entropy. p ...
4460 sym R (6331 sym/52 pcs) 7 img
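The three quantities named in the HW 7 excerpt above can be computed for two classes as in the sketch below. It is not the author's R solution. The function names and the grid of probabilities are assumptions, the entropy uses the natural logarithm (also an assumption; other bases only rescale it), and the sketch tabulates values rather than drawing the plot the exercise asks for. With $\hat{p}_{m2} = 1 - \hat{p}_{m1}$, the Gini index reduces to $2\hat{p}_{m1}(1 - \hat{p}_{m1})$ and the classification error to $1 - \max(\hat{p}_{m1}, 1 - \hat{p}_{m1})$.

```python
import math


def gini(p):
    """Two-class Gini index: p(1 - p) + (1 - p)p = 2p(1 - p)."""
    return 2.0 * p * (1.0 - p)


def classification_error(p):
    """Two-class classification error: 1 - max(p, 1 - p)."""
    return 1.0 - max(p, 1.0 - p)


def entropy(p):
    """Two-class entropy with natural log; 0 * log(0) is taken as 0."""
    total = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            total -= q * math.log(q)
    return total


# Hypothetical grid over [0, 1] for p-hat_m1.
grid = [i / 10 for i in range(11)]
table = [(p, gini(p), classification_error(p), entropy(p)) for p in grid]

for p, g, e, h in table:
    print(f"p={p:.1f}  gini={g:.3f}  class_error={e:.3f}  entropy={h:.3f}")
```

All three measures are zero when a node is pure and reach their maximum at 0.5.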
HW 6 STA 6543 Kelly Parks KZE480
6. In this exercise, you will further analyze the Wage data set considered throughout this chapter. (a) Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a plot of ...
2997 sym R (6561 sym/24 pcs) 6 img
HW 8 STA 6543 Kelly Parks KZE480
5. We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. (a) Generate a data set with n = 500 and p = 2, such that...
5290 sym R (15239 sym/90 pcs) 26 img