Publications by Kate C

PCA analysis

24.01.2022

PCA Case Study Kate C 2022-01-26 Load Packages and Dataset Packages used to import and analyse the data include First step - Data Preparation: we use as.matrix to convert the features in wisc.df (columns 3:32) to a matrix and store it: wisc.data <- as.matrix(wisc.df[, 3:32]). Assign the row names of wisc.data the values currently contai...

3211 sym R (95176 sym/51 pcs) 6 img

ML_Tree-based_method

18.01.2022

Logistic Regression Kate C 2022-01-17 Load Packages and Dataset Packages used to import and analyse the data include: broom - better view of the model outputs; WVPlots - for GainCurvePlot; dplyr - data manipulation; ggplot2 - plotting; tidyr - create tidy data (each column is a variable, each row an observation, each cell a single value); mgcv - GAM models. A. Key poi...

4227 sym R (10722 sym/49 pcs) 6 img

ML_Tree-based_method_02

18.01.2022

Unsupervised learning in R Kate C 2022-01-23 Load Packages and Dataset Packages used to import and analyse the data include A. Key Points: unsupervised learning - finding structure in unlabeled data. Two goals: finding homogeneous subgroups within a larger group, which is called clustering (e.g. market segmentation); finding patterns in the features ...

5960 sym R (3727 sym/18 pcs) 2 img

ML_Non-Linear

17.01.2022

Logistic Regression Kate C 2022-01-17 Load Packages and Dataset Packages used to import and analyse the data include: broom - better view of the model outputs; WVPlots - for GainCurvePlot; dplyr - data manipulation; ggplot2 - plotting; tidyr - create tidy data (each column is a variable, each row an observation, each cell a single value); mgcv - GAM models. A. Key poi...

4227 sym R (10722 sym/49 pcs) 6 img

ML_Regression_updated

13.01.2022

ML_Regression Recap_Supervised Learning Kate C 2022-01-13 Load Packages and Dataset Packages used to import and analyse the data include: broom - elegant view of the linear regression model’s results; sigr - succinct and correct statistical summaries for reports; dplyr - for manipulating data; WVPlots - for plotting the gain curve (for model evaluation); vtr...

6414 sym R (11953 sym/89 pcs) 6 img

logistic_binary

07.01.2022

ML_Binary Predictions Kate C 2022-01-07 Load Packages and Dataset Packages used to import and analyse the data include class, dplyr, googlesheets4. class - various functions for classification, including k-nearest neighbour, learning vector quantization and self-organizing maps. googlesheets4 - used for importing the next_sign dataset built in g...

7519 sym R (13083 sym/59 pcs) 3 img

Multiple_logistic_regression_update

31.12.2021

Multiple Logistic Regression Kate C 2021-12-31 Multiple Logistic Regression Logistic regression also supports multiple explanatory variables. In this section, we will look at the case of two numeric explanatory variables, and for visualization, we will use color to denote the response. For logistic regression, there are only two possible values ...

7089 sym R (5480 sym/40 pcs) 5 img 2 tbl

Linear Regression_Many Variables

29.12.2021

Multiple Linear Regression_02 Kate C 2021-12-29 Many Numeric Explanatory Variables Load Packages and Dataset Packages include fst (for reading fst files), dplyr (data manipulation), ggplot2, broom. The dataset is the one on Taiwan’s property prices. Visualizing Many Numeric Variables Faceting should be a good choice when dealing with multip...

2965 sym R (4949 sym/14 pcs) 2 img

Linear Regression_Two Vars

28.12.2021

Multiple Linear Regression Kate C 2021-12-28 Two Numeric Explanatory Variables Load Packages and Dataset Packages include fst (for reading fst files), dplyr (data manipulation), ggplot2, broom. The dataset is the one on Taiwan’s property prices. Visualizing Three Numeric Variables 3D scatter plot - might suffer from perspective issues and diffi...

1556 sym R (1807 sym/9 pcs) 4 img

Linear Regression_One Var

26.12.2021

Regression Model Intermediate Kate C 2021-12-27 Packages and Data Packages include fst (for reading fst files), dplyr (data manipulation), ggplot2, broom. The dataset is the one on Taiwan’s property prices. One Model Per Category Filter the data set for each subcategory of the categorical variable: taiwan_0_to_15 <- taiwan_real_estate %>% filter(hou...

2936 sym R (5657 sym/28 pcs) 4 img 3 tbl