Publications by Arvind Sharma

Titanic Data - Logistic Regression and Confusion Matrix with Caret Package

26.07.2023

1 Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525725 28.1    1167425 62.4         NA   669291 35.8
## Vcells 966906  7.4    8388608 64.0      32768  1840353 14.1
cat("\f") # Clear the console
#dev.off() # ...

Confusion Matrix

26.07.2023

1 How to get a Classification Confusion Matrix?
Logistic regression is a supervised learning model used for classification. It applies when the independent variables (x) are continuous or categorical and the dependent variable (y) is categorical.
Confusion matrix: A confusion matrix categorizes the actual data w.r...
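
To make the idea concrete, here is a minimal sketch in base R of fitting a logistic regression and tabulating predicted against actual classes. The built-in mtcars data, the choice of am and wt as variables, and the 0.5 cutoff are all assumptions made for illustration; they are not taken from the original publication.

```r
# Illustrative sketch only: mtcars, the variables used, and the 0.5 cutoff
# are assumptions, not the data or threshold from the original publication.
data(mtcars)

# Fit a logistic regression: transmission type (am: 0 = automatic, 1 = manual)
# explained by car weight (wt, a continuous predictor).
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities, converted to class labels with a 0.5 cutoff.
prob <- predict(fit, type = "response")
pred <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
actual <- factor(mtcars$am, levels = c(0, 1))

# Confusion matrix: rows are predicted classes, columns are actual classes.
cm <- table(Predicted = pred, Actual = actual)
print(cm)

# Accuracy is the share of observations on the diagonal.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

The caret package provides confusionMatrix(), which reports the same table together with additional statistics; the base R table() call is used here only to keep the sketch free of extra dependencies.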

Backward Elimination

25.07.2023

1 Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525428 28.1    1166577 62.4         NA   669291 35.8
## Vcells 965458  7.4    8388608 64.0      32768  1840353 14.1
cat("\f") # Clear the console
#dev.off() # ...

Linear Regression Titanic train

24.07.2023

Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 523246 28.0    1160342   62         NA   669291 35.8
## Vcells 959432  7.4    8388608   64      32768  1840353 14.1
cat("\f") # Clear the console
#dev.off() # Cl...

Hypothesis Testing Basics

24.07.2023

Using traditional methods, it takes 109 hours to receive a basic driving license. A new license training method using Computer Aided Instruction (CAI) has been proposed. A researcher used the technique with 190 students and observed that they had a mean of 110 hours. Assume the standard deviation is known to be 6. A level of significance of 0.05 wi...
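
As a rough sketch of how the numbers in this problem feed into a one-sample z-test (the standard deviation is stated as known), the calculation could look like the following. The direction of the alternative hypothesis is not visible in the truncated statement, so both the two-sided and the upper-tailed p-values are computed; which one applies is an assumption the reader must settle from the full problem. All variable names are illustrative.

```r
# Values taken from the problem statement.
mu0   <- 109   # mean hours under the traditional method
xbar  <- 110   # sample mean observed with CAI
sigma <- 6     # known standard deviation
n     <- 190   # number of students
alpha <- 0.05  # level of significance

# Standard error of the mean and the z test statistic.
se <- sigma / sqrt(n)
z  <- (xbar - mu0) / se

# Assumption: the alternative hypothesis is cut off in the statement,
# so both common choices are shown.
p_two_sided <- 2 * pnorm(-abs(z))            # H1: mean differs from 109
p_upper     <- pnorm(z, lower.tail = FALSE)  # H1: mean is greater than 109

# Critical value for the two-sided test at the given alpha.
z_crit_two_sided <- qnorm(1 - alpha / 2)

print(c(z = z, p_two_sided = p_two_sided, p_upper = p_upper))
```

The test statistic works out to roughly 2.30, which exceeds the two-sided critical value of about 1.96.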

Normal Distribution In Class problems

22.07.2023

1 Empirical Rule
About 68% (roughly 2/3) of the data lie within 1 sd of the mean (if normally distributed)
pnorm(q = 99.6, mean = 98.6, sd = 1 ) - pnorm(q = 97.6, mean = 98.6, sd = 1 )
## [1] 0.6826895
About 95% of the data lie within 2 sd of the mean (if normally distributed)
pnorm(q = 100.6, mean = 98.6, sd = 1 ) - pno...
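
The same pnorm() difference can be computed for 1, 2, and 3 standard deviations in one step. This is a small sketch using the mean (98.6) and sd (1) from the example above; the variable names are illustrative.

```r
# Mean and sd from the example above.
mu <- 98.6
s  <- 1

# Number of standard deviations on either side of the mean.
k <- 1:3

# Probability of falling within k sd of the mean under a normal distribution.
coverage <- pnorm(mu + k * s, mean = mu, sd = s) -
  pnorm(mu - k * s, mean = mu, sd = s)
names(coverage) <- paste0("within_", k, "_sd")
print(coverage)
# Approximately 0.6827, 0.9545, 0.9973
```

Because the result depends only on the number of standard deviations, the same three probabilities are obtained for any mean and sd.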

Publish Document

21.07.2023

Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise do not use the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...

Understanding reshaping the data using melt and ggplot options

21.07.2023

Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise do not use the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...

Summary Statistics, Base R plot

20.07.2023

Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise do not use the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...

In Class Probability Exercises

20.07.2023

# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
##          used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525281 28.1    1166157 62.3         NA   669291 35.8
## Vcells 965191  7.4    8388608 64.0      32768  1840415 14.1
cat("\f") # Clear the console
1 Matrix, Table commands, M...
