Publications by Kate C

kNN

01.01.2022

ML_Supervised Learning Classification Kate C 2022-01-02 Classification with Nearest Neighbors KNN - K Nearest Neighbors Load Packages and Dataset Packages used to import and analyse the data include class, dplyr, googlesheets4. class - various functions for classification, including k-nearest neighbor, learning vector quantization and self-or...

LM_regression review

10.01.2022

ML_Regression_Supervised Learning Kate C 2022-01-10 Load Packages and Dataset Packages used to import and analyse the data include: broom - elegant view of the linear regression model’s results; sigr - succinct and correct stat summaries for reports; dplyr - for manipulating data. A simple demo on LM. Note that LM was already covered in previous...

ML_DecisionsTrees_02

13.01.2022

ML_Decision trees Kate C 2022-01-09 Load Packages and Dataset Packages used to import and analyse the data include class, dplyr, googlesheets4. class - various functions for classification, including k-nearest neighbor, learning vector quantization and self-organizing maps. rpart - for recursive partitioning (aka divide and conquer). rpart.plot...

ML_More regressions

13.01.2022

ML Regression_More Examples Kate C 2022-01-16 Load Packages and Dataset Packages used to import and analyse the data include: googlesheets4 - for loading in data (dataset not provided by the trainer); dplyr - for data manipulation; vtreat - for cross-validation on data. Realised that the types of the columns sex and alcohol were char and not ...

Tidymodels_regression

28.01.2022

Tidymodels_01 Kate C 2022-01-28 Load Packages and Dataset Packages used to import and analyse the data include: tidymodels - a collection of R packages designed to support machine learning model development; rsample - data sampling, used to create random subsets of a dataset for different activities in the modelling process. split data into train...

kmeans clustering

08.02.2022

Clustering_k means Kate C 2022-02-08 Load Packages and Dataset lineup data - positions of soccer players; purrr - for map function; dplyr - for data manipulation. lineup <- readRDS("~/Documents/R programming/Datacamp/DC_Cluster/Data/lineup.rds") customers_spend <- readRDS("~/Documents/R programming/Datacamp/DC_Cluster/Data/ws_customers.rds") libra...

hierarchical clustering

08.02.2022

Cluster Analysis_distance between observations and Hierarchical clustering Kate C 2022-02-08 Load Packages and Dataset dplyr - for data manipulation; dummies - for converting categorical values into binary feature value representation; ggplot2 - for visualization; dendextend - make colorful dendrogram (tree diagram). library(dplyr) ## ## Attachin...

tidymodel_classification

28.01.2022

Tidymodels_classification Kate C 2022-01-28 Load Packages and Dataset 6.09 pm start time. Packages used to import and analyse the data include tidymodels. telecom_df: a dataset containing information on customers of a telecommunications company. The outcome variable is canceled_service and it records whether a customer canceled their contract with ...

tidymodel_classification

29.01.2022

Tidymodels_classification Kate C 2022-01-29 Load Packages and Dataset 6.09 pm start time. Packages used to import and analyse the data include tidymodels. telecom_df: a dataset containing information on customers of a telecommunications company. The outcome variable is canceled_service and it records whether a customer canceled their contract with ...

Tidymodels_featureEngineering

01.02.2022

...
