Publications by Abha Jha

Customer Retention Case Study

12.12.2022

library(SMCRM) # CRM data
library(dplyr) # data wrangling
library(tidyr) # data wrangling
library(ggplot2) # plotting
library(survival) # survival
library(rpart) # DT
library(randomForestSRC) # RF
library(tidyverse)
library(tree)
library(Metrics)
library(caret)
library(car)
library(kernlab)
library(MASS)
library(performance)
librar...

15372 sym 12 img

Algorithms Homework 2

15.12.2022

library(DescTools)
library(MASS)
library(car)
Exercise 1: Analysis of Variance
The heartbpchol.csv data set contains continuous cholesterol (Cholesterol) and blood pressure status (BP_Status) (categories: High/Normal/Optimal) for living patients. For the heartbpchol.xlsx data set, consider a one-way ANOVA model to identify differences between g...

16036 sym R (10366 sym/66 pcs) 5 img

Algorithms Homework 1

15.12.2022

Homework 1
Author: Abha Jha
abc123: wnn231
install.packages("dplyr")
output: html_document: default
library(e1071)
library(fBasics)
library(tidyverse)
library(devtools)
library(dplyr)
Exercise 1 a)
MPGCombo = (c(CARS$MPG_City * 0.4))+(c(CARS$MPG_Highway * 0.6))
CARS = data.frame(CARS, MPGCombo)
boxplot(MPGCombo, main = "MPG Combined (...

5849 sym

Algorithms Homework 3

15.12.2022

Exercise 1
## 'data.frame': 3134 obs. of 4 variables:
## $ Weight     : int 132 158 156 131 136 194 179 151 174 155 ...
## $ Diastolic  : int 90 80 76 92 80 68 76 68 90 90 ...
## $ Systolic   : int 170 128 110 176 112 132 128 108 142 130 ...
## $ Cholesterol: int 250 242 281 196 196 211 225 221 188 292 ...
Fitting the linear model...

7588 sym 8 img

Algorithms Homework 4

15.12.2022

Exercise 1: The liver data set is a subset of the ILPD (Indian Liver Patient Dataset) data set. It contains the first 10 variables described on the UCI Machine Learning Repository and a LiverPatient variable (indicating whether or not the individual is a liver patient. People with active liver disease are coded as LiverPatient=1 and people withou...

14771 sym 8 img

Algorithms Midterm Exam

15.12.2022

Data Sets: You need to download the dataset birthweight.csv for Exercises 1-4. The birthweight data record live, singleton births to mothers between the ages of 18 and 45 in the United States who were classified as black or white. There are a total of 295 observations in birthweight, and the variables are: Weight: Infant birth weight (gram) Black: Categori...

9421 sym 4 img

Algorithms Final Exam

15.12.2022

Data Sets: You need to download the dataset birthweight_final.csv. The data record live, singleton births to mothers between the ages of 18 and 45 in the United States who were classified as black or white. There are a total of 400 observations in birthweight, and the variables are: Weight: Infant birth weight (gram) Weight_Gr: Categorical variable for in...

8603 sym 1 img

Dow Jones Case Study

15.12.2022

2. Data
dowjones = read.table("dow_jones_index.data", sep = ",", header = TRUE)
str(dowjones)
## 'data.frame': 750 obs. of 16 variables:
## $ quarter : int 1 1 1 1 1 1 1 1 1 1 ...
## $ stock : chr "AA" "AA" "AA" "AA" ...
## $ date : chr "1/7/2011" "1/1...

2509 sym 5 img

STA 6543 - Predictive Modeling Assignment 5

25.03.2022

Problem 2 For parts (a) through (c), indicate which of i. through iv. is correct. Justify your answer. (a) The lasso, relative to least squares, is: i. More flexible and hence will give improved prediction accuracy when its increase in bias is less than its decrease in variance. ii. More flexible and hence will give improved prediction accuracy when it...

7155 sym R (8396 sym/58 pcs) 7 img

STA 6543 - Predictive Modeling Assignment 4

10.03.2022

Problem 3 We now review k-fold cross-validation. (a) Explain how k-fold cross-validation is implemented. This approach involves randomly dividing the set of observations into k groups or folds of approximately equal size. The first fold is treated as a validation set and the method is fit on the remaining (k-1) folds. The mean squared error is co...

8413 sym R (10032 sym/48 pcs)
