Publications by Adam Douglas

DATA 624 - Homework 7

01.11.2020

Problem 6.2 Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: a) Start R and use these commands to load the data: library(AppliedPredictiveModeling) data(permeabil...

7821 sym R (8202 sym/40 pcs) 11 img

DATA 624 - Homework 8

08.11.2020

7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \[ y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0,\sigma^2) \] where the \(x\) values are random variables uniformly distributed on \([0,1]\) (there are also 5 other no...

6246 sym R (10677 sym/38 pcs) 13 img
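
The Friedman (1991) equation quoted in the Homework 8 excerpt can be simulated directly in base R. The sketch below is only an illustration, not the author's code: the seed, sample size, and noise level (200, 200, and sd = 1) are borrowed from the `mlbench.friedman1(200, sd=1)` call in the Homework 9 excerpt, and the total of ten predictors is an assumption based on the truncated "5 other" remark. It is not guaranteed to reproduce the exact values that `mlbench.friedman1` returns.

```r
# Minimal sketch of the Friedman benchmark simulation in base R.
# Assumptions: n = 200, sd = 1, and 10 predictors in total
# (5 used by the formula, 5 that do not enter it).
set.seed(200)
n <- 200
p <- 10

# Predictors are independent Uniform(0, 1) draws
x <- matrix(runif(n * p), ncol = p)

# Response: only columns 1 to 5 enter the equation; noise is N(0, sd^2)
y <- 10 * sin(pi * x[, 1] * x[, 2]) +
  20 * (x[, 3] - 0.5)^2 +
  10 * x[, 4] +
  5 * x[, 5] +
  rnorm(n, mean = 0, sd = 1)

# Combine into one data frame with the response in the last column
simulated <- data.frame(x, y = y)
```

Calling `str(simulated)` should show ten predictor columns followed by `y`.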

DATA 622 - Test 1

14.11.2020

Get the Data We start by reading in the data from homework 1: # Read in our data rawdata <- read_csv("hw1.csv", col_types = "dff") Let’s plot the data to refresh our memory about it: # Plot rawdata %>% ggplot(aes(x=x, y=y, col=label)) + geom_point() + scale_color_manual(name="Label",values=c("BLUE"="blue","BLACK"="black")) We have one contin...

2955 sym R (1187 sym/8 pcs) 1 img

DATA 624 - Homework 9

19.11.2020

8.1 - Recreate the simulated data from Exercise 7.2: set.seed(200) simulated <- mlbench.friedman1(200,sd=1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" (a) Fit a random forest model to all of the predictors, then estimate the variable importance scores: model1 <-...

7685 sym R (11992 sym/42 pcs) 8 img

Project 2 - Draft 1

07.12.2020

DATA We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...

13692 sym R (6905 sym/18 pcs) 12 img
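
The manual one-hot encoding described in the Project 2 excerpts can be sketched as follows. This is a hedged illustration rather than the author's code: the toy data frame, the column name `brand_code`, and the level labels "A" to "D" are assumptions; only the three output names brand_code_b, brand_code_c and brand_code_d come from the excerpt.

```r
# Minimal sketch of manual one-hot encoding with an implicit reference level.
# Assumption: the Brand Code variable takes the values "A", "B", "C", "D".
# The data below is hypothetical toy data, not the project data set.
train <- data.frame(
  brand_code = c("A", "B", "C", "D", "B"),
  stringsAsFactors = FALSE
)

# One 0/1 indicator per non-reference level
train$brand_code_b <- as.integer(train$brand_code == "B")
train$brand_code_c <- as.integer(train$brand_code == "C")
train$brand_code_d <- as.integer(train$brand_code == "D")

# Level "A" has no column of its own: it is the case where all three are 0
is_brand_a <- with(train, brand_code_b + brand_code_c + brand_code_d == 0)
```

Leaving out the fourth indicator avoids a column that is an exact linear combination of the other three plus an intercept, which matters for linear models.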

DATA 624 - Project 2 - Draft 2

07.12.2020

DATA We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...

13639 sym R (6980 sym/15 pcs) 17 img

DATA 624 - Project 2 - Draft 4

09.12.2020

Data We start our analysis by importing the training and evaluation sets. Manual one-hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable, brand_code_a, is implicit when the preceding three variables are all equal to 0. train <- read_excel("StudentDat...

15581 sym R (8541 sym/23 pcs) 21 img