Publications by H. K. Tseng

Statistical Learning: Week 14

17.05.2024

U.S. Armament Cooperation Network at a glance: 1992-2017 Structural Topic Models (STM) A gentle introduction to STM using international armament cooperation (IAC) data # load required packages and data library(stm) ## stm v1.3.6 successfully loaded. See ?stm for help. ## Papers, resources, and other materials at structuraltopicmodel.com # lo...

220 sym R (9227 sym/31 pcs) 9 img

Statistical Learning: Week 13

13.05.2024

Stripping raw text Stripping raw text using unnest_tokens() # load required packages # install.packages(c("tibble", "tidytext", "ggplot2", "gutenbergr", "tidyr")) library(tibble) library(tidytext) library(ggplot2) ## Warning: package 'ggplot2' was built under R version 4.3.3 library(gutenbergr) library(tidyr) # example text text <- c("A mar...

466 sym R (51235 sym/124 pcs) 11 img

Statistical Learning: Week 13 & 14

10.05.2024

Stripping raw text Stripping raw text using unnest_tokens() # load required packages # install.packages(c("tibble", "tidytext", "ggplot2", "gutenbergr", "tidyr")) library(tibble) library(tidytext) library(ggplot2) ## Warning: package 'ggplot2' was built under R version 4.3.3 library(gutenbergr) library(tidyr) # example text text <- c("A mar...

466 sym R (50596 sym/122 pcs) 11 img

Statistical Learning: Week 12

02.05.2024

Leave-One-Out cross validation (LOOCV) Model validation using LOOCV # load required packages and data (we will be using AmesHousing data) # install.packages(c("boot", "rsample", "AmesHousing")) library(ISLR) ## Warning: package 'ISLR' was built under R version 4.3.3 library(caret) ## Warning: package 'caret' was built under R version 4.3.3 ## Loa...

491 sym R (21679 sym/73 pcs) 2 img

Statistical Learning: Week 11

26.04.2024

\(K\)-means and hierarchical clustering for wine types Apply \(K\)-means and hierarchical clustering on wine data # load required packages and data (wine data) # install.packages(c("HDclassif", "useful", "factoextra")) library(HDclassif) ## Warning: package 'HDclassif' was built under R version 4.3.3 ## Loading required package: MASS ## Warning...

374 sym R (18175 sym/165 pcs) 7 img

Statistical Learning: Week 10

19.04.2024

Naïve Bayes ## load required packages and data library(dplyr) ## Warning: package 'dplyr' was built under R version 4.3.2 ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union li...

656 sym R (52637 sym/191 pcs) 18 img

Support Vector Machine: Support Hyperplane

06.04.2024

Understanding separating hyperplane (or support hyperplane) # generate random data # random seed for reproducibility set.seed(123) #set n = 500 data points. n <- 500 #create a data frame with two uniformly distributed predictors lying between 0 and 1. # runif() generates n random deviates from a uniform distribution df <- data.frame(x1 = ...

108 sym R (5298 sym/18 pcs) 4 img

Statistical Learning: Week 8

06.04.2024

Classifying Legendary Pokemon using SVM! Using SVM techniques to classify whether a Pokemon is legendary. # install required packages in one swoop install.packages(c("dplyr", "ggplot2", "tidyr", "reshape2", "caret", "skimr", "psych", "e1071", "data.table", "Matrix", "keras")) ## Installing packages into 'C:/Users/hktse/AppData/Local/R/win-librar...

459 sym R (30769 sym/117 pcs) 3 img

Statistical Learning: Week 7

29.03.2024

Ridge and LASSO regressions Fit ridge and LASSO regressions, interpret coefficients and visualize their variation across the range of \(\lambda\). # load the required packages and data library(glmnet) library(caret) library(plotmo) data(mtcars) names(mtcars) # as usual, check out what's inside the loaded dataframe ## [1] "mpg" "cyl" "disp"...

435 sym R (26038 sym/66 pcs) 4 img

Note: LASSO, Ridge, and Penalty (λ) explained

29.03.2024

LASSO vs. Ridge LASSO and Ridge regression are regression methods that perform variable selection and regularization to enhance the prediction accuracy and interpretability of the statistical model. In short, they do two things: Variable selection: identify important variables in the data that explain major variation in the outcome ...

4700 sym Python (10359 sym/45 pcs) 3 img