Publications by John K. Hancock

CUNY DATA 624 Homework7 - Linear Regression

17.04.2021

Linear Regression - 6.3 (a) Start R and use these commands to load the data: library(AppliedPredictiveModeling); data(ChemicalManufacturingProcess). The matrix processPredictors contains the 57 predictors (12 describing the input biological material and 45 describing t...

16924 sym R (39379 sym/62 pcs) 6 img

CUNY DATA 624 Solo Project One

10.04.2021

Introduction: The solo project consists of two parts. Part A tasks us with forecasting the amount of cash withdrawn from 4 ATMs. Part B is forecasting residential power usage. The data provided for this project are in two Excel files. The broad outline of my approach is to examine the data, clean and prepare the data for analysis, and compa...

80475 sym R (34391 sym/302 pcs) 89 img 27 tbl

DATA624 Spring 2021 - HW 6 - ARIMA

26.03.2021

8.11.1 a. Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? We start with the definition of “white noise”: “a time series that shows no autocorrelation”. Yes. All three random time series ...

37590 sym R (14102 sym/117 pcs) 33 img

DATA 624 Spring 2021 HW4 - Data Preprocessing/Overfitting

06.03.2021

Question No. 3.1 The Glass dataset consists of 214 observations and 10 variables. help(Glass) Glass Identification Database Description A data frame with 214 observation containing examples of the chemical analysis of 7 different types of glass. The problem is to forecast the type of class on basis of the...

26695 sym R (11699 sym/30 pcs) 3 img 1 tbl

DATA624 Spring 2021 HW5 - Exponential Smoothing

13.03.2021

7.8.1 Consider the pigs series – the number of pigs slaughtered in Victoria each month. a. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\) and generate forecasts for the next four months. First, we begin by inspecting the pigs time series by looking at its information, its head and tail. We see that the time...

41130 sym R (13365 sym/77 pcs) 25 img 13 tbl

CUNY DATA624 Homework 8 - Non Linear Regression

22.04.2021

7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[y = 10\sin(\pi x_1x_2) + 20(x_3-0.5)^2 + 10x_4+5x_5+N(0,\sigma^2)\] where the x values are random variables uniformly distributed between [0, 1] (there are also 5 other non-informative v...

24631 sym R (50148 sym/128 pcs) 11 img

CUNY DATA624 HW9 - Regression Trees and Rule Based Models

28.04.2021

8.1 ## Recreate the simulated data from Exercise 7.2 library(mlbench) set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" head(simulated) ## ...

51898 sym R (386203 sym/116 pcs) 15 img

CUNY DATA 624 PROJECT TWO

09.05.2021

OVERVIEW The data science team of Salma Elshahawy, John K. Hancock, and Farhana Zahir has prepared the following technical report to address the issue of understanding ABC’s manufacturing process and its predictive factors. This report focuses on predicting the value of PH. The report consists of the following: PART 1: THE DATASETS PART 2: DATA P...

44416 sym R (44249 sym/156 pcs) 25 img 4 tbl