Publications by Zach Herold, Anthony Pagan, Betsy Rosalen

Data 624 Project 2 Technical Report

11.05.2020

Project Description The data science team at ABC Beverage has been asked to provide an analysis of our manufacturing process, the predictive factors, and a predictive model of PH in order to comply with new regulations. This report details the steps taken in our analysis, including the assumptions made, the methodology used, the models tested, th...

13136 sym R (20345 sym/7 pcs) 16 img

Data 624 Project 2 Non-Technical Report

11.05.2020

Project Description Data Analysis of the impact of ABC Beverage Manufacturing Process on pH This report contains the findings of the data analysis undertaken by the data science team, led by Zach Herold, Anthony Pagan, and Betsy Rosalen at ABC Beverage Company in order to better understand the impact of manufacturing processes on the pH level i...

11338 sym R (1232 sym/1 pcs) 12 img

Association Rules

04.05.2020

Introduction Imagine 10000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt records what went into a customer’s basket, hence the name ‘Market Basket Analysis’. That is exactly what the Groceries Data Set contains: a collection of receipts with each line r...

1892 sym R (7445 sym/5 pcs) 6 img

Regression Trees and Rule Based Models

27.04.2020

8.1 Recreate the simulated data from Exercise 7.2: library(mlbench) set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" set.seed(1234) pd<-sample(2,nrow(simulated),replace = TRUE,prob=c(.7,.3)) traindata<-s...

5773 sym R (7255 sym/13 pcs) 15 img

Team Presentation Non-Linear Regression

22.04.2020

DATA 624 - Non-Linear Regression Zach Herold, Anthony Pagan, Betsy Rosalen April 21, 2020 Linear Regression Review Linear Regression model equations can be written either directly or indirectly in the form: \[y_i = b_0 + b_1x_{i1} + b_2x_{i2} + ... + b_Px_{iP} + e_i\] Where: \(y_i\) is the outcome or response \(b_0\) is the Y-intercept \(P\)...

13060 sym R (4593 sym/15 pcs) 23 img
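
The linear form reviewed in the entry above, \(y_i = b_0 + b_1x_{i1} + ... + b_Px_{iP} + e_i\), can be illustrated with a small fit. This is only a sketch in base R, not code from the presentation: the two predictors, the coefficient values (2, 3, -1.5), the noise level, and the sample size are all made up for the example.

```r
# Minimal sketch: fit y = b0 + b1*x1 + b2*x2 + e with base R's lm().
# All numbers below are illustrative assumptions, not from the presentation.
set.seed(1)
n  <- 100
x1 <- runif(n)
x2 <- runif(n)

# Simulate a response with known coefficients and Gaussian noise e_i.
y <- 2 + 3 * x1 - 1.5 * x2 + rnorm(n, mean = 0, sd = 0.1)

# Ordinary least squares estimates of b0, b1, b2.
fit <- lm(y ~ x1 + x2)
print(coef(fit))
```

With this little noise the estimated coefficients should land close to the values used in the simulation, which makes it easy to see how each \(b_j\) maps to a term in the equation.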

Non-Linear Regression Models

20.04.2020

7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[y = 10 \sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)\] where the x values are random variables uniformly distributed on [0, 1] (there are also 5 other non-informative variable...

2413 sym R (4347 sym/9 pcs) 3 img
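
The benchmark equation quoted in the entry above can also be simulated by hand. The following is a rough sketch in base R; the sample size (200) and noise standard deviation (1) are assumed here to mirror the `mlbench.friedman1(200, sd = 1)` call quoted in the Regression Trees entry, and the column names `V1` to `V10` are hypothetical. It will not reproduce the exact numbers generated by `mlbench`, only the same data-generating process.

```r
# Hand-rolled simulation of the Friedman (1991) benchmark, base R only.
# n, sigma, and the column names are assumptions made for illustration.
set.seed(200)
n     <- 200   # number of observations
p     <- 10    # 5 informative + 5 non-informative predictors
sigma <- 1     # standard deviation of the N(0, sigma^2) noise term

# Predictors are independent Uniform(0, 1) draws.
x <- matrix(runif(n * p), nrow = n, ncol = p)
colnames(x) <- paste0("V", seq_len(p))

# Only the first five columns enter the signal.
signal <- 10 * sin(pi * x[, 1] * x[, 2]) +
  20 * (x[, 3] - 0.5)^2 +
  10 * x[, 4] +
  5 * x[, 5]
y <- signal + rnorm(n, mean = 0, sd = sigma)

simulated <- data.frame(x, y = y)
str(simulated)
```

Columns `V6` through `V10` never appear in `signal`, so a model with sensible variable importance should rank them near the bottom.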

Linear Regression

07.04.2020

6.2. Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug. (a) Start R and use these commands to load the data. The matrix fingerprints contains the 1,107 binary molecu...

3067 sym R (16579 sym/19 pcs) 7 img

SVM on Covid Sample Survey

05.04.2020

BB Week 10 Split/Subset Data covid<- read.csv("covid-19-survey-responses-sample.csv", header = TRUE) covid<-subset(covid,select=(names(covid[12:ncol(covid)-2]))) intrain<- createDataPartition(y = covid$q03_symptoms, p=.7, list=FALSE) training <- covid[intrain,] test<- covid[-intrain, ] dim(training) ## [1] 15 11 dim(test) ## [1] 5 11 anyNA...

131 sym R (7906 sym/22 pcs) 1 img

Prob-LM-KNN

02.04.2020

Probabilities You have been hired by a local electronics retailer and the above dataset has been given to you. Manager Bayes Jr.9th wants to create a spreadsheet to predict whether a customer is a likely prospect. Prior Probabilities Compute prior probabilities for the Prospect Yes/No Prior YES probability: 64% Prior NO probability: 36% Conditional ...

4025 sym R (12740 sym/17 pcs) 21 img

Project 1

30.03.2020

Project 1 Description This project consists of 3 parts - two required and one bonus and is worth 15% of your grade. The project is due at 11:59 PM on Sunday March 31. I will accept late submissions with a penalty until the meetup after that when we review some projects. Part A – ATM Forecast, ATM624Data.xlsx In part A, I want you to forecast ho...

13649 sym R (47848 sym/191 pcs) 98 img