Publications by Jimmy De La Fuente
Homework 2
Problem 2 Carefully explain the differences between the KNN classifier and KNN regression methods. The KNN classifier predicts a qualitative response: it assigns Y to the most common class among the K nearest neighbors. KNN regression predicts a quantitative response: it estimates f(x) as the average response of the K nearest neighbors. Problem 9 This question involves the use of multiple linear regression on the Aut...
5127 sym R (11906 sym/36 pcs) 3 img
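The contrast drawn in Homework 2, Problem 2 can be sketched in a few lines. This is a minimal Python illustration rather than the assignment's own R code; the one-dimensional toy data and the helper names (`k_nearest`, `knn_classify`, `knn_regress`) are assumptions made for the example. Both methods find the same K neighbors and differ only in how they combine the neighbors' responses.

```python
from collections import Counter

def k_nearest(train_x, query, k):
    """Return indices of the k training points closest to query (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return order[:k]

def knn_classify(train_x, train_labels, query, k):
    """Qualitative output: the most common class among the k nearest neighbors."""
    idx = k_nearest(train_x, query, k)
    return Counter(train_labels[i] for i in idx).most_common(1)[0][0]

def knn_regress(train_x, train_values, query, k):
    """Quantitative output: the mean response of the k nearest neighbors."""
    idx = k_nearest(train_x, query, k)
    return sum(train_values[i] for i in idx) / k

# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = ["low", "low", "low", "high", "high", "high"]
values = [1.1, 1.9, 3.2, 6.1, 6.8, 8.3]

print(knn_classify(x, labels, 2.5, k=3))  # a class label
print(knn_regress(x, values, 2.5, k=3))   # a number
```

The only line that changes between the two functions is the aggregation step: a majority vote for classification, an average for regression.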
Assignment 5
Problem 2 For parts (a) through (c), indicate which of i. through iv. is correct. Justify your answer. (a) The lasso, relative to least squares, is: i. More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance. ii. More flexible and hence will give improved prediction accuracy wh...
3434 sym R (4623 sym/40 pcs) 5 img
Assignment 3
Question 10 This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010. (a) Produce some numerical and graphical summaries of the...
5658 sym R (15401 sym/91 pcs) 8 img
Assignment 4
Problem 3 We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. K-fold cross-validation is implemented by randomly splitting the set of observations into k groups, or folds, of approximately equal size. The first fold is treated as a validation set, and the method is fit on the remaining k - 1 folds. ...
6186 sym R (7962 sym/44 pcs)
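The procedure described in Assignment 4, Problem 3 can be sketched as follows. This is a minimal Python illustration, not the assignment's R code. The function names, the `fit`/`predict` callables, and the mean-only model are assumptions for the example. Repeating the fit for every fold and averaging the per-fold mean squared errors is the standard form of k-fold cross-validation; the truncated excerpt does not confirm that the assignment states it this way.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly split indices 0..n-1 into k folds of (nearly) equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, k, fit, predict, seed=0):
    """Average validation MSE over k folds.

    Each fold is held out once while the model is fit on the other k - 1 folds.
    """
    fold_mse = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - predict(model, xs[i])) ** 2 for i in held_out]
        fold_mse.append(sum(errors) / len(errors))
    return sum(fold_mse) / k

# Hypothetical model for illustration: always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
print(cross_validate(xs, ys, k=5, fit=fit_mean, predict=predict_mean))
```

Passing `fit` and `predict` as arguments keeps the splitting logic independent of the model, so the same loop can compare several candidate models on identical folds.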
Assignment 8
Question 5 We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. Generate a data set with n=500 and p=2, such th...
4729 sym R (11167 sym/80 pcs) 26 img
Assignment 7
Question 3 Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of pm1. The x-axis should display pm1, ranging from 0 to 1, and the y-axis should display the value of the Gini index, classification error, and entrop...
4679 sym R (5503 sym/44 pcs) 6 img
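The three impurity measures named in Assignment 7, Question 3 have simple closed forms when there are only two classes. The sketch below is a Python illustration rather than the assignment's R code, and it tabulates values instead of drawing the requested plot. The function names are assumptions, and the entropy uses the natural logarithm, which is also an assumption about the intended base.

```python
import math

def gini(p):
    """Two-class Gini index: p(1 - p) + (1 - p)p = 2p(1 - p)."""
    return 2.0 * p * (1.0 - p)

def classification_error(p):
    """Two-class classification error: 1 - max(p, 1 - p)."""
    return 1.0 - max(p, 1.0 - p)

def entropy(p):
    """Two-class entropy with natural log, taking 0 * log(0) as 0."""
    total = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            total -= q * math.log(q)
    return total

# Tabulate each measure as pm1 ranges from 0 to 1.
print(f"{'pm1':>5} {'gini':>8} {'error':>8} {'entropy':>8}")
for step in range(11):
    p = step / 10
    print(f"{p:5.1f} {gini(p):8.4f} {classification_error(p):8.4f} {entropy(p):8.4f}")
```

All three measures are zero when a node is pure (pm1 equal to 0 or 1) and reach their maximum at pm1 = 0.5.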
Assignment 6
Question 6 In this exercise, you will further analyze the Wage data set considered throughout this chapter. (a) Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a ...
1805 sym R (6732 sym/21 pcs) 6 img