Publications by Zlgee
An Introduction to Statistical Learning-Ch05-Exercise
Ch05-Exercise Zlgee 2021/6/23 Q5 In Chapter 4, we used logistic regression to predict the probability of “default” using “income” and “balance” on the “Default” data set. We will now estimate the test error of this logistic regression model using the validation set approach. Do not forget to set a random seed before beginning you...
An Introduction to Statistical Learning-Ch08-Exercise
Ch08-Exercise Zlgee 2021/6/11 Q7 In the lab, we applied random forests to the “Boston” data using “mtry = 6” and using “ntree = 25” and “ntree = 500”. Create a plot displaying the test error resulting from random forests on this data set for a more comprehensive range of values for “mtry” and “ntree”. Describe the results...
An Introduction to Statistical Learning-Ch10-Exercise
Ch10-Exercise Zlgee 2021/6/18 Q7 In the chapter, we mentioned the use of correlation-based distance and Euclidean distance as dissimilarity measures for hierarchical clustering. It turns out that these two measures are almost equivalent: if each observation has been centered to have mean zero and standard deviation one, and if we let \(r_{ij}\...
An Introduction to Statistical Learning-Ch09-Exercise
Ch09-Exercise Zlgee 2021/6/16 Q4 Generate a simulated two-class data set with 100 observations and two features in which there is a visible but non-linear separation between the two classes. Show that in this setting, a support vector machine with a polynomial kernel (with degree greater than 1) or a radial kernel will outperform a support vecto...
An Introduction to Statistical Learning-Ch10
Unsupervised Learning Zlgee 2021/6/17 10.4 Principal Components Analysis We perform PCA on the USArrests data set. states = row.names(USArrests) states ## [1] "Alabama" "Alaska" "Arizona" "Arkansas" ## [5] "California" "Colorado" "Connecticut" "Delaware" ## [9] "Florida" "Georgia" ...
An Introduction to Statistical Learning-Ch09
Support Vector Machines Zlgee 2021/6/8 9.6.1 Support Vector Classifier e1071 1. Generating the observations, which belong to two classes: set.seed(1) x = matrix(rnorm(20*2),ncol=2) y = c(rep(-1,10),rep(1,10)) x[y==1,] = x[y==1,] + 1 2. Checking whether the classes are linearly separable: plot(x,col=(3-y)) They are not. 3. Fit the support vector...
An Introduction to Statistical Learning-Ch08
Tree-Based Methods-Decision Tree Zlgee 2021/6/8 8.3.1 Fitting Classification Trees The tree library is used to construct classification and regression trees. library(tree) We first use classification trees to analyze the Carseats data set. In these data, Sales is a continuous variable, and so we begin by encoding it as a binary variable. We use th...