Publications by H. K. Tseng
Statistical Learning: Week 14
U.S. Armament Cooperation Network at a glance: 1992-2017 Structural Topic Models (STM) A gentle introduction to STM using international armament cooperation (IAC) data # load required packages and data library(stm) ## stm v1.3.6 successfully loaded. See ?stm for help. ## Papers, resources, and other materials at structuraltopicmodel.com # lo...
Statistical Learning: Week-13
Stripping raw text Stripping raw text using unnest_tokens() # load required packages # install.packages(c("tibble", "tidytext", "ggplot2", "gutenbergr", "tidyr")) library(tibble) library(tidytext) library(ggplot2) ## Warning: package 'ggplot2' was built under R version 4.3.3 library(gutenbergr) library(tidyr) # example text text <- c("A mar...
Statistical Learning: Week-13 & 14
Stripping raw text Stripping raw text using unnest_tokens() # load required packages # install.packages(c("tibble", "tidytext", "ggplot2", "gutenbergr", "tidyr")) library(tibble) library(tidytext) library(ggplot2) ## Warning: package 'ggplot2' was built under R version 4.3.3 library(gutenbergr) library(tidyr) # example text text <- c("A mar...
Statistical Learning: Week 12
Leave-One-Out cross validation (LOOCV) Model validation using LOOCV # load required packages and data (we will be using AmesHousing data) # install.packages(c("boot", "rsample", "AmesHousing")) library(ISLR) ## Warning: package 'ISLR' was built under R version 4.3.3 library(caret) ## Warning: package 'caret' was built under R version 4.3.3 ## Loa...
Statistical Learning: Week 11
\(K\)-means and hierarchical clustering for wine types Apply \(K\)-means and hierarchical clustering on wine data # load required packages and data (wine data) # install.packages(c("HDclassif", "useful", "factoextra")) library(HDclassif) ## Warning: package 'HDclassif' was built under R version 4.3.3 ## Loading required package: MASS ## Warning...
Statistical Learning: Week 10
Naïve Bayes ## load required packages and data library(dplyr) ## Warning: package 'dplyr' was built under R version 4.3.2 ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union li...
Support Vector Machine: Support Hyperplane
Understanding the separating hyperplane (or support hyperplane) # generate random data # random seed for reproducibility set.seed(123) # set n = 500 data points. n <- 500 # create a data frame with two uniformly distributed predictors lying between 0 and 1. # runif() generates n random deviates from a uniform distribution df <- data.frame(x1 = ...
Statistical Learning: Week 8
Classifying Legendary Pokémon using SVM! Using SVM techniques to classify whether a Pokémon is legendary. # install the required packages in one go install.packages(c("dplyr", "ggplot2", "tidyr", "reshape2", "caret", "skimr", "psych", "e1071", "data.table", "Matrix", "keras")) ## Installing packages into 'C:/Users/hktse/AppData/Local/R/win-librar...
Statistical Learning: Week 7
Ridge and LASSO regressions Fit ridge and LASSO regressions, interpret coefficients and visualize their variation across the range of \(\lambda\). # load the required packages and data library(glmnet) library(caret) library(plotmo) data(mtcars) names(mtcars) # as usual, check out what's inside the loaded dataframe ## [1] "mpg" "cyl" "disp"...
Note: LASSO, Ridge, and Penalty (λ) explained
LASSO vs. Ridge LASSO and Ridge regression are regression methods that perform variable selection and regularization to enhance the prediction accuracy and interpretability of the statistical model. In short, they do two things: Variable selection: identify important variables in the data that explain major variation in the outcome ...
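For reference, the two penalties named in the note above are conventionally written as follows. This is a sketch in standard textbook notation, not the note's own: the symbols \(y_i\), \(x_{ij}\), \(\beta_0\), \(\beta_j\), \(n\) and \(p\) are assumptions introduced here, and software such as glmnet may scale the squared-error term differently.

```latex
\[
\hat{\beta}^{\text{ridge}}
  = \arg\min_{\beta_0,\,\beta}
    \left\{ \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
    + \lambda \sum_{j=1}^{p} \beta_j^{2} \right\}
\]
\[
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta_0,\,\beta}
    \left\{ \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
\]
```

Both objectives reduce to ordinary least squares at \(\lambda = 0\), and larger \(\lambda\) shrinks the coefficients more strongly. The absolute-value penalty can set some \(\beta_j\) exactly to zero, which is what gives LASSO its variable-selection behavior; the squared penalty shrinks coefficients toward zero but generally does not zero them out.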