Publications by Abha Jha - wnn231

STA 6543 - Predictive Modeling Assignment 3

05.03.2022

Problem 10 This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter’s lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010. (a) Produce some numerical and graphical summaries of the...

8709 sym R (22608 sym/139 pcs) 2 img

Case Study 1 - Bank Marketing

14.02.2022

Load the file
bank <- read_delim("bank-additional.csv", delim = ";", escape_double = FALSE, trim_ws = TRUE)
## Rows: 4119 Columns: 21
## -- Column specification --------------------------------------------------------
## Delimiter: ";"
## chr (11): job, marital, education, default, housing, loan, contact, month, ...

5553 sym R (41117 sym/94 pcs) 11 img

STA 6543 - Predictive Modeling Assignment 1

18.02.2022

Exercise 2: Explain whether each scenario is a classification or regression problem, and indicate whether we are most interested in inference or prediction. Finally, provide n and p. Exercise 2a: We collect a set of data on the top 500 firms in the US. For each firm we record profit, number of employees, industry and the CEO salary. We are inter...

8881 sym R (6991 sym/81 pcs) 11 img

STA 6543 - Predictive Modeling Assignment 2

19.02.2022

Problem 2 Carefully explain the differences between the KNN classifier and KNN regression methods. KNN classifier: The goal of the KNN classifier is classification, where the response variable is categorical. Given a positive integer K and a test observation x0, the KNN classifier first identifies the K points in the training data that are closes...

6881 sym R (8401 sym/40 pcs) 4 img
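
As a brief sketch of the distinction drawn in the Assignment 2 excerpt above, the two KNN estimators can be written side by side. The notation is an assumption borrowed from the standard textbook presentation and is not taken from the report itself: \(\mathcal{N}_0\) denotes the set of the K training points closest to \(x_0\), and \(I(\cdot)\) is the indicator function.

```latex
% KNN classifier: estimated probability of class j at x_0,
% i.e. the fraction of the K nearest neighbors whose label is j
\Pr(Y = j \mid X = x_0) = \frac{1}{K} \sum_{i \in \mathcal{N}_0} I(y_i = j)

% KNN regression: predicted response at x_0,
% i.e. the average response of the K nearest neighbors
\hat{f}(x_0) = \frac{1}{K} \sum_{i \in \mathcal{N}_0} y_i
```

The classifier then assigns \(x_0\) to the class with the largest estimated probability, whereas KNN regression returns the average directly as a quantitative prediction.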

Case Study 2 - Bookbinders

26.02.2022

Load Data
book_train <- read_excel("BBBC-Train.xlsx")
book_test <- read_excel("BBBC-Test.xlsx")
book_train = book_train[,-1]
book_test = book_test[,-1]
str(book_train)
## tibble [1,600 x 11] (S3: tbl_df/tbl/data.frame)
## $ Choice : num [1:1600] 1 1 1 1 1 1 1 1 1 1 ...
## $ Gender : num [1:1600] 1 1 1 1 0 1 1 0 1 1 ...
...

2141 sym R (24887 sym/79 pcs) 4 img

Predictive Modeling Final Project

12.05.2022

Business Problem A national veterans’ organization wishes to develop a predictive model to improve the cost-effectiveness of their direct marketing campaign. The organization, with its in-house database of over 13 million donors, is one of the largest direct-mail fundraisers in the United States. According to their recent mailing records, the o...

9217 sym R (19749 sym/46 pcs) 9 img

Data Analytics Application Project

09.05.2022

Load the file
audit_risk <- read_delim("audit_risk.csv", delim = ",")
## Rows: 776 Columns: 27
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (1): LOCATION_ID
## dbl (26): Sector_score, PARA_A, Score_A, Risk_A, PARA_B, Score_B, Risk_B, TO...
##
## i Use `spec()` to retrieve the f...

4328 sym R (37192 sym/105 pcs) 6 img

STA 6543 - Predictive Modeling Assignment 7

23.04.2022

Problem 3 Consider the Gini index, classification error, and entropy in a simple classification setting with two classes. Create a single plot that displays each of these quantities as a function of \(\hat{p}_{m1}\). The x-axis should display \(\hat{p}_{m1}\), ranging from 0 to 1, and the y-axis should display the value of the Gini index, classif...

5109 sym R (10196 sym/51 pcs) 5 img
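
For reference, the three quantities named in the Assignment 7 excerpt above can be sketched for the two-class case as functions of \(\hat{p}_{m1}\). This assumes the second class proportion is \(\hat{p}_{m2} = 1 - \hat{p}_{m1}\) and uses the natural logarithm; the symbols \(E\), \(G\), and \(D\) are labels chosen here for illustration and may differ from those in the report.

```latex
% Classification error: one minus the proportion of the majority class
E = 1 - \max\left(\hat{p}_{m1},\, 1 - \hat{p}_{m1}\right)

% Gini index: sum over both classes of p(1 - p), which simplifies to
G = 2\,\hat{p}_{m1}\left(1 - \hat{p}_{m1}\right)

% Entropy (taking 0 \log 0 = 0 at the endpoints)
D = -\hat{p}_{m1} \log \hat{p}_{m1} - \left(1 - \hat{p}_{m1}\right) \log\left(1 - \hat{p}_{m1}\right)
```

All three are zero at \(\hat{p}_{m1} = 0\) and \(\hat{p}_{m1} = 1\) and reach their maximum at \(\hat{p}_{m1} = 1/2\), where \(E = G = 1/2\) and \(D = \log 2\).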

STA 6543 - Predictive Modeling Assignment 6

16.04.2022

Problem 6 In this exercise, you will further analyze the Wage data set considered throughout this chapter. (a) Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a p...

2824 sym R (9167 sym/24 pcs) 5 img

STA 6543 - Predictive Modeling Assignment 8

30.04.2022

Problem 5 We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features. (a) Generate a data set with n = 500 and p = 2,...

5789 sym R (56525 sym/105 pcs) 8 img