Publications by H. K. Tseng

Statistical Learning - Week 13

16.05.2021

Stepwise Variable Selection

    # load required packages and data (Carseats data)
    library(olsrr)
    ## Attaching package: 'olsrr'
    ## The following object is masked from 'package:datasets':
    ##     rivers
    library(ISLR)
    library(dplyr)
    ## Attaching package: 'dplyr'
    ## The following objects are masked from 'package:stats':
    ##     filter, lag ...

1844 sym R (49509 sym/179 pcs) 30 img

Statistical Learning - Week 11

03.05.2021

    ## load required packages and data
    library(dplyr)
    ## Attaching package: 'dplyr'
    ## The following objects are masked from 'package:stats':
    ##     filter, lag
    ## The following objects are masked from 'package:base':
    ##     intersect, setdiff, setequal, union
    library(ggplot2)
    library(tidyr)
    library(caret)
    ## Loading required package: lat...

604 sym R (43580 sym/187 pcs) 19 img

Support Hyperplane (SHP)

19.04.2021

Understanding separating hyperplane (or support hyperplane)

    # generate random data
    # random seed
    set.seed(123)
    # set n = 500 data points.
    n <- 500
    # Generate data frame with two uniformly distributed predictors lying between 0 and 1.
    df <- data.frame(x1 = runif(n), x2 = runif(n))
    # create a variable y whose value is -...

106 sym R (12331 sym/14 pcs) 4 img

Note: LASSO, Ridge, and Penalty explained

28.03.2022

LASSO vs. Ridge

LASSO and Ridge regression are regression methods that perform variable selection and regularization to enhance the prediction accuracy and interpretability of the statistical model. In short, they do two things. Variable selection: identify important variables in the data that explain major variation in the outcome v...
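As a rough illustration of the distinction this note introduces, the two methods are commonly written as the same least-squares objective with different penalty terms. The notation here is an assumption for the sketch and is not taken from the note itself: \(n\) observations, \(p\) predictors, outcome \(y_i\), predictor values \(x_{ij}\), coefficients \(\beta_j\), and a tuning parameter \(\lambda \ge 0\).

\[
\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2
\]

\[
\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \; \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
\]

Under this formulation the absolute-value penalty can shrink some coefficients exactly to zero, which is what gives LASSO its variable-selection behavior. The squared penalty shrinks coefficients toward zero but generally leaves them nonzero. With \(\lambda = 0\) both reduce to ordinary least squares.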

4710 sym R (9344 sym/39 pcs) 3 img

Statistical Learning: Week-7

28.03.2022

Ridge and LASSO regressions

Fit ridge and LASSO regressions, interpret coefficients and visualize their variation across the range of \(\lambda\).

    # load required packages
    install.packages("glmnet")
    ## Installing package into 'C:/Users/hktse/Documents/R/win-library/3.6'
    ## (as 'lib' is unspecified)
    ## There is a binary version available ...

431 sym R (18405 sym/75 pcs) 5 img

Statistical Learning: Week 5

14.03.2022

Binary outcome as a generalization of linear regression model with limited dependent variables

Estimating a model of binary outcome using lm() function

    library(ISLR)
    data(Default)
    names(Default)
    ## [1] "default" "student" "balance" "income"
    ## convert the outcome variable "default" to numeric: Yes = 1, No = 0
    Default$default <- ifelse(Default$...
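The idea of fitting a binary outcome with lm() can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the code from the notebook: the data frame `sim`, its simulated `default` and `balance` columns, and the coefficients used to generate them are all hypothetical stand-ins, chosen so the example runs in base R without the ISLR package. It fits a linear probability model with lm() and, for comparison, a logistic regression with glm().

```r
# Simulated stand-in for a binary-outcome data set
# (hypothetical values; this is NOT the ISLR Default data)
set.seed(1)
n <- 1000
balance <- runif(n, 0, 2500)

# assumed data-generating process: default probability follows a logistic curve in balance
p_true <- plogis(-6 + 0.004 * balance)
default <- rbinom(n, size = 1, prob = p_true)
sim <- data.frame(default = default, balance = balance)

# linear probability model: ordinary least squares on the 0/1 outcome
lpm.fit <- lm(default ~ balance, data = sim)

# logistic regression: same predictor, logit link
logit.fit <- glm(default ~ balance, data = sim, family = binomial)

# fitted values from each model on the probability scale
lpm.pred <- predict(lpm.fit)
logit.pred <- predict(logit.fit, type = "response")

range(lpm.pred)    # nothing constrains the linear fit to the interval [0, 1]
range(logit.pred)  # logistic fitted values lie between 0 and 1
```

Comparing the two ranges shows the main design trade-off: the lm() fit is easy to interpret as a change in probability per unit of the predictor, but its fitted values are not restricted to valid probabilities, whereas the logit link keeps them inside the unit interval.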

834 sym R (35287 sym/139 pcs) 7 img

Statistical Learning: Week 3

28.02.2022

Fitting a linear model

A description of variable names and measures can be seen here

    install.packages("ISLR")
    ## Installing package into 'C:/Users/hktse/Documents/R/win-library/3.6'
    ## (as 'lib' is unspecified)
    ## There is a binary version available but the source version is later:
    ##      binary source needs_compilation
    ## ISLR    1.2 ...

341 sym R (12364 sym/60 pcs) 5 img

Statistical Learning - Week 2

21.02.2022

Installing R and RStudio

R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. RStudio is an integrated development environment for R. You need to download base R in order to run RStudi...

2475 sym R (16078 sym/144 pcs) 12 img

Statistical Learning: Week-4

08.03.2022

Non-normality test

    ## load Prestige data from the "car" package
    library(car)
    ## Loading required package: carData
    lm.fit <- lm(prestige ~ education, data = Prestige)
    plot(lm.fit)
    ## Shapiro test of normality
    shapiro.test(Prestige$prestige)
    ##
    ## Shapiro-Wilk normality test
    ##
    ## data: Prestige$prestige
    ## W = 0.97198, p-value = 0.0287...

1087 sym R (31308 sym/100 pcs) 15 img

Statistical Learning: Week-6

22.03.2022

    set.seed(123)
    n = 10
    xr = seq(0, n, by=.1)
    # generate a random data from a sin function plus some random errors
    yr = sin(xr/2) + rnorm(length(xr))/2
    # combine x and y into a df for easy manipulation
    df = data.frame(x = xr, y = yr)
    # plot the data
    plot(df)
    lm.fit = lm(y ~ x, data = df)
    abline(lm.fit, col = "red")
    # If the degree of the...

641 sym R (24990 sym/90 pcs) 20 img