Publications by Gustavo Seifer

ISLR Ch.12 (Labs) - Unsupervised Learning

28.08.2021

An Introduction to Statistical Learning (2nd ed.) Labs An Introduction to Statistical Learning ISLR tidymodels Labs Chapter 12 Unsupervised Learning library(tidymodels) library(ISLR) library(patchwork) library(factoextra) theme_set(theme_bw()) usa <- as_tibble(USArrests, rownames = "state") Notice how the mean of each of the variables is ...

1769 sym R (13951 sym/59 pcs) 18 img

Survival Analysis

28.08.2021

Survival Analysis in R library(tidyverse) library(survival) library(ggfortify) library(survminer) theme_set(theme_bw()) var_names <- c("patient", "treatment", "time", "status", "age", "psa", "size", "gleason") prostate <- read_delim("prostate_cancer.txt", delim = " ", col_names = var_names) glimpse(prostate) Rows: 63 Columns: 8 $ patie...

125 sym R (9072 sym/49 pcs) 12 img

ISLR Ch.08 (Labs) - Tree-Based Models

24.08.2021

An Introduction to Statistical Learning (2nd ed.) Labs An Introduction to Statistical Learning ISLR tidymodels Labs Chapter 08 Tree-based Models library(tidymodels) library(ISLR) library(rpart.plot) library(vip) theme_set(theme_bw()) carseat <- tibble(Carseats) glimpse(carseat) Rows: 400 Columns: 11 $ Sales <dbl> 9.50, 11.22, 10...

1051 sym R (19212 sym/69 pcs) 9 img 1 tbl

ISLR Ch.05 (Labs) - Resampling Methods

15.08.2021

An Introduction to Statistical Learning (2nd ed.) Labs Chapter 05 Resampling Methods library(tidymodels) library(ISLR) theme_set(theme_bw()) auto <- tibble(Auto) portfolio <- tibble(Portfolio) glimpse(auto) Rows: 392 Columns: 9 $ mpg <dbl> 18, 15, 18, 16, 17, 15, 14, 14, 14, 15, 15, 14, 15, 14, 2~ $ cylinders <dbl> 8, 8, 8...

1938 sym R (9463 sym/43 pcs) 2 img

Tidymodels tuning

12.08.2021

library(tidyverse) library(tidymodels) library(modeldata) #dataset cells theme_set(theme_bw()) data(cells) cell <- cells cell # A tibble: 2,019 x 58 case class angle_ch_1 area_ch_1 avg_inten_ch_1 avg_inten_ch_2 avg_inten_ch_3 <fct> <fct> <dbl> <int> <dbl> <dbl> <dbl> 1 Test PS 143. ...

170 sym R (32398 sym/76 pcs) 5 img

Capstone Project - Week 3

07.08.2021

library(tidytext) library(tidyverse) library(stopwords) library(tm) theme_set(theme_bw()) Background The goal of this exercise is to build and evaluate your first predictive model. You will use the n-gram and backoff models you built in previous tasks to build and evaluate your predictive model. The goal is to make the model efficient and a...

1038 sym R (4316 sym/17 pcs) 4 img 1 tbl

Task 2 - Exploratory Data Analysis (week 2)

06.08.2021

Loading packages library(tidyverse) library(tidytext) library(stopwords) library(tm) theme_set(theme_bw()) Task 0 - Understanding the Problem In this capstone we will be applying data science in the area of natural language processing. As a first step toward working on this project, you should familiarize yourself with Natural Language Proc...

3160 sym R (10311 sym/42 pcs) 4 img

Slide Deck Capstone Project

08.08.2021

Coursera Data Science - Capstone Project Gustavo Seifer 08.August.2021 Capstone Project Details Background and rationale Around the world, people are spending an increasing amount of time on their mobile devices for email, social networking, banking and a whole range of other activities. The main objective of this project was to develop a text p...

2018 sym

ISLR Ch.03 (Labs) - Linear Regression

10.08.2021

library(ISLR) library(MASS) # for the Boston data set library(tidymodels) library(GGally) library(corrplot) theme_set(theme_bw()) An Introduction to Statistical Learning (2nd ed.) Labs Chapter 03 Linear Regression The Boston data set contains various statistics for 506 neighborhoods in Boston. We will build a simple linear regression mod...

4557 sym R (20354 sym/80 pcs) 3 img

ISLR Ch.04 (Labs) - Logistic Regression

14.08.2021

Packages library(ISLR) library(discrim) # for LDA and QDA library(tidymodels) library(GGally) library(corrr) theme_set(theme_bw()) An Introduction to Statistical Learning (2nd ed.) Labs Chapter 04 Classification We will be examining the Smarket data set for this lab. It contains a number of numeric variables plus a variable called Direc...

3281 sym R (16456 sym/79 pcs) 3 img