Publications by Arvind Sharma
Titanic Data - Logistic Regression and Confusion Matrix with Caret Package
1 Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525725 28.1 1167425 62.4 NA 669291 35.8
## Vcells 966906 7.4 8388608 64.0 32768 1840353 14.1
cat("\f") # Clear the console
#dev.off() # ...
Confusion Matrix
1 How to Get a Classification Confusion Matrix?
Logistic regression is a supervised learning model used for classification. It applies when the independent variable (x) is either continuous or categorical and the dependent variable (y) is categorical.
Confusion matrix: Confusion matrix categorizes the actual data w.r...
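As a minimal sketch of the idea in the preview above, a confusion matrix can be built in base R by cross-tabulating predicted classes against actual classes. The `actual` and `predicted` vectors below are hypothetical toy labels made up for illustration; they are not taken from the original document, and the metrics shown are only one common way to summarize the table.

```r
# Hypothetical toy labels (an assumption for illustration): 1 = positive, 0 = negative
actual    <- factor(c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0), levels = c(0, 1))

# Cross-tabulate predictions (rows) against actual classes (columns)
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Pull out the four cells by their row and column labels
tp <- cm["1", "1"]  # predicted positive, actually positive
tn <- cm["0", "0"]  # predicted negative, actually negative
fp <- cm["1", "0"]  # predicted positive, actually negative
fn <- cm["0", "1"]  # predicted negative, actually positive

# Common summary metrics derived from the table
accuracy    <- (tp + tn) / sum(cm)
sensitivity <- tp / (tp + fn)  # true positive rate
specificity <- tn / (tn + fp)  # true negative rate
```

With a fitted logistic regression, `predicted` would typically come from thresholding the predicted probabilities (for example at 0.5). The `caret` package offers `confusionMatrix()` for the same table plus additional statistics.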
Backward Elimination
1 Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525428 28.1 1166577 62.4 NA 669291 35.8
## Vcells 965458 7.4 8388608 64.0 32768 1840353 14.1
cat("\f") # Clear the console
#dev.off() # ...
Linear Regression Titanic train
Setup
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 523246 28.0 1160342 62 NA 669291 35.8
## Vcells 959432 7.4 8388608 64 32768 1840353 14.1
cat("\f") # Clear the console
#dev.off() # Cl...
Hypothesis Testing Basics
Using traditional methods, it takes 109 hours to receive a basic driving license. A new license training method using Computer Aided Instruction (CAI) has been proposed. A researcher used the technique with 190 students and observed that they had a mean of 110 hours. Assume the standard deviation is known to be 6. A level of significance of 0.05 wi...
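The numbers in the problem above are enough to sketch a one-sample z-test in R. Because the preview is cut off before it states the hypotheses, the code below computes both a two-sided and an upper-tailed p-value; which one applies is an assumption the reader must settle from the full problem statement. The variable names are illustrative.

```r
# Values taken from the problem statement
mu0   <- 109   # mean under the null hypothesis (traditional method, hours)
xbar  <- 110   # observed sample mean (hours)
sigma <- 6     # known population standard deviation
n     <- 190   # sample size
alpha <- 0.05  # level of significance

# Standard error of the mean and the z statistic
se <- sigma / sqrt(n)
z  <- (xbar - mu0) / se

# p-values under two possible alternatives (the choice is an assumption,
# since the truncated preview does not state the alternative hypothesis)
p_two_sided <- 2 * pnorm(-abs(z))            # H1: mean differs from 109
p_upper     <- pnorm(z, lower.tail = FALSE)  # H1: mean is greater than 109

# Decision rule: reject the null hypothesis when the p-value is below alpha
reject_two_sided <- p_two_sided < alpha
reject_upper     <- p_upper < alpha
```

Here z is about 2.30, which exceeds the two-sided critical value of about 1.96 at the 0.05 level, so under either alternative shown the null hypothesis would be rejected.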
Normal Distribution In Class problems
1 Empirical Rule
About 2/3 (roughly 68%) of the data lies within 1 sd of the mean (if normally distributed)
pnorm(q = 99.6, mean = 98.6, sd = 1 ) - pnorm(q = 97.6, mean = 98.6, sd = 1 )
## [1] 0.6826895
About 95% of the data lies within 2 sd of the mean (if normally distributed)
pnorm(q = 100.6, mean = 98.6, sd = 1 ) - pno...
Publish Document
Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise avoid the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...
Understanding reshaping the data using melt and ggplot options
Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise avoid the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...
Summary Statistics, Base R plot
Read this help file for R Markdown.
1 Set Up
Clear all environments. Install and load all the libraries. Make sure you have the libraries installed.
2 Import Data
Now, I will import my data. Make sure you comment out, exclude, or otherwise avoid the View(train) command.
df <- read.csv("~/Library/CloudStorage/Dropbox/WCAS/Summer/Data Analysis/share/...
In Class Probability Exercises
# Clear the work space
rm(list = ls()) # Clear environment
gc() # Clear unused memory
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 525281 28.1 1166157 62.3 NA 669291 35.8
## Vcells 965191 7.4 8388608 64.0 32768 1840415 14.1
cat("\f") # Clear the console
1 Matrix, Table commands, M...