Publications by Adam Douglas
DATA 624 - Homework 7
Problem 6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: a) Start R and use these commands to load the data: library(AppliedPredictiveModeling) data(permeabil...
7821 sym R (8202 sym/40 pcs) 11 img
DATA 624 - Homework 8
7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[ y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0,\sigma^2) \] where the \(x\) values are random variables uniformly distributed between \([0,1]\) (there are also 5 other no...
6246 sym R (10677 sym/38 pcs) 13 img
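The simulation equation in the Homework 8 excerpt can be sketched in base R roughly as follows. This is only an illustration: the function name `simulate_friedman1` and the column names are hypothetical, the five extra columns are assumed to be unrelated to the response (based on the "5 other" predictors the excerpt mentions), and the random draws will not match those of `mlbench.friedman1` for the same seed.

```r
# Minimal sketch of the Friedman (1991) benchmark equation in base R.
# simulate_friedman1 is a hypothetical helper, not a function from any package.
simulate_friedman1 <- function(n, sd = 1, n_extra = 5) {
  # Five predictors used in the equation plus n_extra additional ones,
  # all drawn independently from Uniform(0, 1)
  x <- matrix(runif(n * (5 + n_extra)), nrow = n)

  # y = 10 sin(pi x1 x2) + 20 (x3 - 0.5)^2 + 10 x4 + 5 x5 + N(0, sd^2)
  y <- 10 * sin(pi * x[, 1] * x[, 2]) +
    20 * (x[, 3] - 0.5)^2 +
    10 * x[, 4] +
    5 * x[, 5] +
    rnorm(n, mean = 0, sd = sd)

  out <- as.data.frame(x)
  colnames(out) <- paste0("x", seq_len(ncol(out)))
  out$y <- y
  out
}

set.seed(200)
simulated <- simulate_friedman1(200, sd = 1)
str(simulated)
```

Only `x1` through `x5` enter the formula, so a model that ranks predictors sensibly should give the remaining columns low importance.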
DATA 622 - Test 1
Get the Data We start by reading in the data from homework 1: # Read in our data rawdata <- read_csv("hw1.csv", col_types = "dff") Let’s plot the data to refresh our memory about it: # Plot rawdata %>% ggplot(aes(x=x, y=y, col=label)) + geom_point() + scale_color_manual(name="Label",values=c("BLUE"="blue","BLACK"="black")) We have one contin...
2955 sym R (1187 sym/8 pcs) 1 img
DATA 624 - Homework 9
8.1 - Recreate the simulated data from Exercise 7.2: set.seed(200) simulated <- mlbench.friedman1(200,sd=1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" (a) Fit a random forest model to all of the predictors, then estimate the variable importance scores: model1 <-...
7685 sym R (11992 sym/42 pcs) 8 img
Project 2 - Draft 1
DATA We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...
13692 sym R (6905 sym/18 pcs) 12 img
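The manual one-hot encoding described in the Project 2 excerpt might look roughly like the sketch below. The toy data frame and the source column name `brand_code` are hypothetical stand-ins (the real data comes from the project's Excel file), and the level labels "A" to "D" are assumed from the indicator names in the excerpt.

```r
# Hypothetical toy data standing in for the imported training set
train <- data.frame(brand_code = c("A", "B", "C", "D", "B"))

# Manual one-hot encoding with "A" as the implicit reference level:
# a row whose three indicators are all 0 belongs to brand A
train$brand_code_b <- as.integer(train$brand_code == "B")
train$brand_code_c <- as.integer(train$brand_code == "C")
train$brand_code_d <- as.integer(train$brand_code == "D")

print(train)
```

Dropping one level this way avoids a redundant column, since the fourth indicator is fully determined by the other three.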
DATA 624 - Project 2 - Draft 2
DATA We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...
13639 sym R (6980 sym/15 pcs) 17 img
DATA 624 - Project 2 - Draft 4
Data We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...
15581 sym R (8541 sym/23 pcs) 21 img