Publications by Group 4
DATA624 Final Project
Overview This is role playing. I am your new boss. I am in charge of production at ABC Beverage and you are a team of data scientists reporting to me. My leadership has told me that new regulations are requiring us to understand our manufacturing process, the predictive factors and be able to report to them our predictive model of PH. Please use ...
3420 sym R (40528 sym/108 pcs) 16 img 2 tbl
DATA621 Assignment 5
Overview In this homework assignment, we will explore, analyze and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The response variable is the number of sample cases of wine that were purchased by wine distribution co...
3226 sym R (25751 sym/33 pcs) 6 img
DATA624 Homework 9
library(tidyverse) library(mlbench) library(caret) library(Cubist) library(gbm) library(ipred) library(party) library(partykit) library(randomForest) library(rpart) library(RWeka) library(AppliedPredictiveModeling) library(rattle) 8.1 Recreate the simulated data from Exercise 7.2: set.seed(200) simulated <- mlbench.friedman1(200, s...
6009 sym R (15891 sym/52 pcs) 9 img 2 tbl
DATA624 Homework 8
library(tidyverse) library(caret) library(plotmo) library(earth) library(kernlab) library(forecast) library(ipred) library(mlbench) library(AppliedPredictiveModeling) 7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[y = 10 sin...
3690 sym R (38830 sym/83 pcs) 16 img 2 tbl
DATA621 Assignment 4
Overview In this homework assignment, we will explore, analyze and model a data set containing approximately 8000 records, each representing a customer at an auto insurance company. Each record has two response variables. The first response variable, TARGET_FLAG, is a 1 or a 0. A “1” means that the person was in a car crash. A “0” means that th...
9349 sym R (37899 sym/82 pcs) 20 img 1 tbl
DATA621 - Blog 4: Partial Least Squares (PLS) Regression
What is Partial Least Squares (PLS) Regression? Partial Least Squares (PLS) regression is a technique that reduces the predictors to a smaller set of uncorrelated components and performs least squares regression on these components, instead of just on the original data. PLS regression is a solution for: The problem of Multicollinearity. That is,...
4678 sym R (4136 sym/24 pcs) 5 img
DATA621 Homework 3
Overview In this homework assignment, we will explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether or not the crime rate is above the median crime rate (1) or not (0). Our objective is to build a binary logistic regression model on t...
6505 sym R (19177 sym/81 pcs) 23 img 4 tbl
DATA621 - Blog 4: Linear Regression and its Cousins
Linear Regression and its Cousins Javern Wilson 4/1/2020 Introduction Models to be discussed: Ordinary Linear Regression Partial Least Squares (PLS) Penalized Models: Lasso, Ridge and Elastic Net Introduction Each model can be written in the form directly or indirectly: \(y_i\): response variable \(\beta_0\): estimated intercept \(\beta_i\)...
3607 sym 5 img
DATA624 Homework 7
library(AppliedPredictiveModeling) library(tidyverse) library(MASS) library(caret) library(pls) library(Amelia) 6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become ...
4943 sym R (9021 sym/46 pcs) 8 img
DATA608 Module 5
Javern Wilson 3/30/2020 TASK 1 Let’s reverse a word! Click the “Reverse” button to reverse the text. TASK 2 Multiples! Enter a number. ...
218 sym