Publications by Kitada Smalley
DATA252: K-Means
Learning Objectives In this lesson students will … Implement the k-means algorithm from scratch Visualize the model Create a function Utilize the caret package for cross validation Choose the appropriate k Resources: R Shiny for K-means Clustering: https://shiny.rstudio.com/gallery/kmeans-example.html ISLR: https://static1.squarespace.com/static/5f...
3407 sym R (10346 sym/35 pcs) 8 img
DATA252: PCA
Learning Objectives In this lesson students will … Review/Get the big ideas for eigenvalues and eigenvectors Apply PCA for dimension reduction Apply PCA to graphics Resources: StatQuest: Principal Component Analysis (PCA), Step-by-Step https://www.youtube.com/watch?v=FgakZw6K1QQ&t=949s Machine Learning for Biostatistics: Principal components a...
4247 sym R (5666 sym/39 pcs) 18 img
DATA252: Basic Text
Learning Objectives In this lesson students will … Learn how to work with unstructured text data Utilize the tidytext package Tokenize text Remove stop words (and customize lexicon for stop words) Create visualizations for text Perform basic sentiment analysis Step 1: Load the Wine Data These data originally come from winemag.com and are hosted...
2384 sym R (12691 sym/56 pcs) 11 img
DATA252: Simple Neural Net
Learning Objectives In this lesson students will … Review LDA and use it for dimension reduction Apply a Basic Neural Net Step 1: Load the Iris Data Iris is one of the most common datasets for statistical examples! It is a rite of passage to use it in a class. data("iris") str(iris) ## 'data.frame': 150 obs. of 5 variables: ## $ Sepal.Len...
499 sym R (3141 sym/26 pcs) 1 img
DATA252: Classification and Image Analysis (Part 1)
Learning Objectives In this lesson students will … Learn the basics of image analysis Apply machine learning algorithms to classify numeric characters Linear Regression K-nearest Neighbors Logistic Regression Classification Tree Random Forest Linear Discriminant Analysis Quadratic Discriminant Analysis Compare model fit using confusion matrice...
5073 sym R (8437 sym/102 pcs) 10 img
DATA252: Classification Trees and More
Learning Objectives In this lesson students will learn … Implement and interpret a classification tree in R Learn how to prune a tree Identify and discuss loss functions for classification methods Apply tree aggregation methods (bagging, random forest, boosting) Critically evaluate tree models for their advantages and disadvantages. Data Inspirat...
3278 sym R (6224 sym/52 pcs) 8 img
PNW MAA: Intro R
Learning Objectives The goal of this course is to lay a functional foundation in the use of R, requiring no prior background in R, and illustrated mainly using topics encountered in an elementary statistics course. R’s computational platform is geared toward working with multivariate data. As such, it extends easily to many tasks encountered in d...
8918 sym R (25183 sym/121 pcs) 18 img
DATA252: Logistic
Learning Objectives In this lesson students will learn … The generalized linear model (GLM) framework How to implement simple and multiple logistic regression How to interpret the effect of coefficients in the logistic regression model Perform predictions from a logistic regression model Trade-offs when choosing a threshold Perform variable sele...
8283 sym R (16720 sym/59 pcs) 4 img
DATA252: KNN
Learning Objectives In this lesson students will learn … How to implement the K-nearest neighbors (knn) algorithm Produce stratified training and testing sets The importance of standardizing data How to tune the knn algorithm to pick the best value of the \(k\) hyperparameter BINARY CASE Example 1: Pima Indigenous People Motivation/Background: ...
6008 sym R (10878 sym/83 pcs) 11 img
DATA252: Regression Trees
Learning Objectives In this lesson students will learn how to implement… Regression trees Prune a tree Perform cross validation to choose tree complexity Citation: Examples for this lesson come from https://bookdown.org/tpinto_home/Beyond-Additivity/ Osteoporosis Facts: Osteoporosis is a bone disease that develops when bone mineral density an...
2881 sym R (1220 sym/9 pcs) 7 img