Publications by PK O’Flaherty
DATA622 Homework 2
Introduction: Our task is to generate two decision tree models and a random forest model to make classification predictions, and then, using this sales article and examples of real cases where decision trees went wrong, explain how we can improve the perception of our final decision tree model. Essay: We extended the dataset used in the first a...
27667 sym R (10775 sym/27 pcs) 7 img
DATA622 Homework 2
Introduction: Our task is to generate two decision tree models and a random forest model to make classification predictions, and then, using this sales article and examples of real cases where decision trees went wrong, explain how we can improve the perception of our final decision tree model. Essay: We extended the dataset used in the first a...
17833 sym R (11569 sym/28 pcs) 3 img
DATA624 Project 2
Project Introduction: New regulations require ABC Beverage to provide a report outlining our manufacturing process, along with a predictive model of pH that includes an explanation of the predictive factors. Our data science team is tasked with developing the predictive model from the provided historical data and using that model to predict pH on test ...
16826 sym R (53487 sym/127 pcs) 17 img 1 tbl
DATA622 Homework 1
Introduction: Our task is to conduct exploratory analysis to predict an outcome from two datasets, one large and one small, and compare the results. We must also explain, in a short essay, our selection of algorithms and how they relate to the data and to what we are trying to do. Essay: After first identifying we were presented with a classi...
16840 sym R (10747 sym/24 pcs) 1 img 3 tbl
DATA624 Project 2
Project Introduction: New regulations require ABC Beverage to provide a report outlining our manufacturing process, along with a predictive model of pH that includes an explanation of the predictive factors. Our data science team is tasked with developing the predictive model from the provided historical data and using that model to predict pH on test d...
10851 sym R (25223 sym/39 pcs) 5 img 1 tbl
DATA624 Project 2
Project Introduction: New regulations require ABC Beverage to provide a report outlining our manufacturing process, along with a predictive model of pH that includes an explanation of the predictive factors. Our data science team is tasked with developing the predictive model from the provided historical data and using that model to predict pH on test d...
6397 sym R (11407 sym/29 pcs) 2 img
DATA624 Homework 10
Introduction: Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt is a representation of what went into a customer’s basket, hence the name ‘Market Basket Analysis’. That is exactly what the Groceries Data Set contains: a collection of receipts with each line rep...
7719 sym R (8713 sym/12 pcs) 1 img
DATA624 Homework 7
Exercise 6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: Part a The matrix fingerprints contains the 1,107 binary molecular predictors for the 165 compounds, w...
14647 sym R (18502 sym/42 pcs) 2 img 5 tbl
DATA624 Homework 9
Exercise 8.1 Recreate the simulated data from Exercise 7.2: In the hidden code chunk below we create the simulated data. # Check out packages library(mlbench) library(randomForest) library(caret) library(knitr) library(party) library(gbm) library(Cubist) library(rules) library(rpart) library(ggplot2) library(AppliedPredictiveModeling) library(rpart...
11791 sym R (10071 sym/30 pcs) 2 img 6 tbl
DATA624 Homework 8
Exercise 7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[ y = 10 \sin(\pi x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + N(0, \sigma^2) \] where the x values are random variables uniformly distributed between [0, 1] (there are also 5 ot...
7134 sym R (20420 sym/41 pcs) 3 img 2 tbl