Publications by Gitanjali Mule

Boston Housing Prices Analysis

29.01.2021

Question 2 Explain whether each scenario is a classification or regression problem, and indicate whether we are most interested in inference or prediction. Finally, provide n and p. (a) We collect a set of data on the top 500 firms in the US. For each firm we record profit, number of employees, industry and the CEO salary. We are interested in ...

7749 sym R (64657 sym/65 pcs) 12 img

Data Analytics Application Final Project

05.12.2021

data = read.table("D:/Fall 2021/DA Application/project/marketing_campaign.csv", sep = "\t", header = TRUE)
df = data.frame(data)
head(df)
##     ID Year_Birth  Education Marital_Status Income Kidhome Teenhome Dt_Customer
## 1 5524       1957 Graduation         Single  58138       0        0  04-09-2012
## 2 2174       1954 Graduation ...

77 sym R (11445 sym/57 pcs) 14 img

ISLR Exercise 6

26.03.2022

Question 2 (a) Less flexible, and will give improved prediction accuracy when its increase in bias is less than its decrease in variance. As lambda increases, the flexibility of the fit decreases and the estimated coefficients shrink, with some becoming exactly zero. This leads to a substantial decrease in the variance of the predictions for a small increas...

1347 sym R (12894 sym/45 pcs) 6 img

Exercise 5 ISLR

14.03.2022

Question 3 (a) Description & Implementation Question: Explain how k-fold cross-validation is implemented. Answer: The data is segmented into k distinct, (usually) equal-sized ‘folds’. A model is trained on k−1 of the folds and tested on the remaining fold. This process is repeated k times, such that each of the k folds acts as the test dat...
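
The k-fold procedure described in this excerpt can be sketched in base R. This is only an illustration under stated assumptions: the built-in mtcars data, the model mpg ~ wt, k = 5, and mean squared error as the test metric are all choices made here, not taken from the original publication.

```r
# Minimal k-fold cross-validation sketch in base R.
# Assumptions (not from the original): mtcars data, model mpg ~ wt,
# k = 5, and test MSE as the error measure.
set.seed(1)
k <- 5
n <- nrow(mtcars)

# Randomly assign each observation to one of k roughly equal-sized folds.
fold_id <- sample(rep(seq_len(k), length.out = n))

fold_mse <- numeric(k)
for (i in seq_len(k)) {
  train <- mtcars[fold_id != i, ]          # the other k - 1 folds
  test  <- mtcars[fold_id == i, ]          # the held-out fold
  fit   <- lm(mpg ~ wt, data = train)      # train on k - 1 folds
  pred  <- predict(fit, newdata = test)    # predict the held-out fold
  fold_mse[i] <- mean((test$mpg - pred)^2)
}

# The k-fold CV estimate is the average of the k held-out errors.
cv_mse <- mean(fold_mse)
cv_mse
```

Each observation is used for testing exactly once and for training k - 1 times, which is why the fold labels are shuffled once up front rather than resampled inside the loop.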

3446 sym R (4344 sym/35 pcs)

ISLR Exercise 3.7

19.02.2022

Question 1 2. Carefully explain the differences between the KNN classifier and KNN regression methods. Answer: The KNN classifier is used for classification problems, while the KNN regression method is used for regression problems with a continuous response. The KNN classifier assigns a point to the class held by the majority of its k nearest neighbors, while regression esti...
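
The contrast drawn in this excerpt can be made concrete with a small base-R sketch. The helper names (knn_neighbors, knn_classify, knn_regress), the toy data, and k = 3 are hypothetical and chosen only for illustration; they do not come from the original publication.

```r
# Toy KNN sketch in base R. All names and data here are illustrative
# assumptions, not the author's code.

# Indices of the k training points closest (Euclidean) to the query x0.
knn_neighbors <- function(train_x, x0, k) {
  d <- sqrt(rowSums(sweep(train_x, 2, x0)^2))
  order(d)[seq_len(k)]
}

# Classification: majority vote among the k nearest labels
# (ties go to the first class in table order).
knn_classify <- function(train_x, labels, x0, k) {
  idx <- knn_neighbors(train_x, x0, k)
  names(which.max(table(labels[idx])))
}

# Regression: average response of the k nearest neighbours.
knn_regress <- function(train_x, y, x0, k) {
  idx <- knn_neighbors(train_x, x0, k)
  mean(y[idx])
}

train_x <- matrix(c(1, 1,  1, 2,  2, 1,
                    6, 6,  7, 6,  6, 7), ncol = 2, byrow = TRUE)
labels  <- c("A", "A", "A", "B", "B", "B")
y       <- c(1.0, 1.2, 0.8, 5.0, 5.5, 4.5)
x0      <- c(1.5, 1.5)

pred_class <- knn_classify(train_x, labels, x0, k = 3)
pred_value <- knn_regress(train_x, y, x0, k = 3)
```

Both functions share the same neighbour search and differ only in how the neighbours' responses are combined: a vote for a qualitative response, a mean for a quantitative one.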

4029 sym R (10949 sym/46 pcs) 6 img

ISLR Exercise 4

05.03.2022

Question 10
head(Weekly)
##   Year   Lag1   Lag2   Lag3   Lag4   Lag5    Volume  Today Direction
## 1 1990  0.816  1.572 -3.936 -0.229 -3.484 0.1549760 -0.270      Down
## 2 1990 -0.270  0.816  1.572 -3.936 -0.229 0.1485740 -2.576      Down
## 3 1990 -2.576 -0.270  0.816  1.572 -3.936 0.1598375  3.514        Up
## 4 1990  3.514 -2.576 -0.270 ...

982 sym R (16251 sym/108 pcs) 3 img

Assignment 7

24.04.2022

Question 3
p=seq(0,1,0.01)
gini= 2*p*(1-p)
classerror= 1-pmax(p,1-p)
crossentropy= -(p*log(p)+(1-p)*log(1-p))
plot(NA,NA,xlim=c(0,1),ylim=c(0,1),xlab='p',ylab='f')
lines(p,gini,type='l')
lines(p,classerror,col='blue')
lines(p,crossentropy,col='red')
legend(x='top',legend=c('gini','class error','cross entropy'), col=c('black','...

193 sym R (4389 sym/39 pcs) 13 img 10 tbl

Assignment 8

29.04.2022

library(caret)
## Loading required package: ggplot2
## Loading required package: lattice
library(ISLR)
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v tibble 3.1.6 v dplyr 1.0.8
## v tidyr 1.2.0 v stringr 1.4.0
## v readr 2.1.2 v forcats 0.5.1
## v purrr 0.3.4
##...

385 sym R (8371 sym/97 pcs) 6 img 15 tbl

Assignment 6

16.04.2022

custom_regression_metrics <- function (data, lev = NULL, model = NULL) {
  c(RMSE = sqrt(mean((data$obs-data$pred)^2)),
    Rsquared = summary(lm(pred ~ obs, data))$r.squared,
    MAE = mean(abs(data$obs-data$pred)),
    MSE = mean((data$obs-data$pred)^2),
    RSS = sum((data$obs-data$pred)^2))
}
ctrl <- trainControl(method = "cv", number ...

945 sym R (8474 sym/33 pcs) 4 img