Publications by Bella Lin
hw8
5. We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. (a) Generate a data set with n = 500 and p = 2, such that...
4541 sym 26 img
hw7
3. Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of p̂_m1. The x-axis should display p̂_m1, ranging from 0 to 1, and the y-axis should display the value of the Gini index, classification error, and entropy. Hi...
5633 sym 7 img
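The three quantities named in the hw7 excerpt can be sketched for the two-class case as follows. This is a minimal illustration, not the author's solution: the function names and the 101-point grid are assumptions, and entropy is computed with the natural logarithm, with 0 * log(0) treated as 0.

```python
import math

def gini(p):
    """Gini index for two classes, where p is the class-1 proportion."""
    return 2 * p * (1 - p)

def classification_error(p):
    """Classification error: one minus the larger class proportion."""
    return 1 - max(p, 1 - p)

def entropy(p):
    """Entropy in nats; terms with zero proportion contribute 0."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

# Evaluate each measure on an evenly spaced grid of proportions in [0, 1].
grid = [i / 100 for i in range(101)]
curves = {
    "gini": [gini(p) for p in grid],
    "classification_error": [classification_error(p) for p in grid],
    "entropy": [entropy(p) for p in grid],
}
```

Each list in `curves` can be passed to any plotting tool with `grid` on the x-axis. All three measures are zero at p = 0 and p = 1 and peak at p = 0.5.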
hw6
6. In this exercise, you will further analyze the Wage data set considered throughout this chapter. (a) Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a plot of t...
2333 sym Python (8358 sym/15 pcs) 5 img
hw5
2. For parts (a) through (c), indicate which of i. through iv. is correct. Justify your answer. (a) The lasso, relative to least squares, is: i. More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance. ii. More flexible and hence will give improved prediction accuracy when it...
3417 sym 2 img
hw4
3. We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. K-fold cross-validation is implemented by randomly dividing the n observations into k folds of approximately equal size. Each fold is used as the validation set exactly once, while the remaining k - 1 folds are used to fit the model, so the procedure is repeated k times. Then the test ...
7043 sym R (6423 sym/39 pcs)
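The k-fold procedure described in the hw4 excerpt can be sketched in a few lines. This is a hedged illustration rather than the author's R code: `k_fold_indices`, `cross_validate`, and the toy mean-only model are hypothetical names introduced here, and mean squared error is assumed as the validation metric.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, k, fit, predict, seed=0):
    """Average validation MSE over k folds; each fold is held out once."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_mse = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the remaining k - 1 folds only.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Score on the held-out fold.
        errors = [(ys[i] - predict(model, xs[i])) ** 2 for i in held_out]
        fold_mse.append(sum(errors) / len(errors))
    return sum(fold_mse) / k

# Toy model (an assumption for illustration): always predict the training mean.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model
```

Calling `cross_validate(xs, ys, 5, fit_mean, predict_mean)` would give the 5-fold estimate of test MSE for the toy model. Any other `fit` and `predict` pair with the same signatures can be substituted.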
HW3
13. This question should be answered using the Weekly data set, which is part of the ISLR2 package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010. (a) Produce some numerical and graphical summaries of the Weekly ...
7368 sym 3 img
HW2
Carefully explain the differences between the KNN classifier and KNN regression methods. The KNN classifier predicts a class label, while KNN regression predicts a numeric value. KNN classification assigns a discrete class label to a new data point based on the majority class among its k nearest neighbors. KNN regression predicts a continuou...
5102 sym R (12251 sym/28 pcs) 4 img
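The contrast drawn in the HW2 excerpt, majority vote for classification versus a numeric prediction for regression, can be sketched as follows. This is a minimal illustration with assumptions stated up front: features are one-dimensional, distance is absolute difference, the regression prediction is taken to be the mean of the neighbors' responses, and all function names are hypothetical.

```python
from collections import Counter

def k_nearest(train_x, query, k):
    """Indices of the k training points closest to the query (1-D features)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return order[:k]

def knn_classify(train_x, train_labels, query, k):
    """Predict the majority class label among the k nearest neighbors."""
    neighbors = k_nearest(train_x, query, k)
    votes = Counter(train_labels[i] for i in neighbors)
    return votes.most_common(1)[0][0]

def knn_regress(train_x, train_values, query, k):
    """Predict the mean response of the k nearest neighbors."""
    neighbors = k_nearest(train_x, query, k)
    return sum(train_values[i] for i in neighbors) / k

# Toy data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
```

Both functions share the same neighbor search and differ only in how the neighbors' responses are combined: a vote for labels, an average for numeric values.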