Publications by Devin Teran

Data621-Homework5

24.05.2021

Overview In this assignment, we will explore, analyze and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The response variable is the number of sample cases of wine that were purchased by wine distribution companies a...

7620 sym R (22326 sym/21 pcs) 6 img 2 tbl

Module8

22.05.2021

Exercise 1 Use the nnet package to analyze the iris data set. Use 80% of the 150 samples as the training data and the rest for validation. Discuss the results. data(iris) set.seed(123) samples <- sample(nrow(iris), nrow(iris)*0.80) train <- iris[samples,] test <- iris[-samples,] iris_nn <- nnet(Species ~ ., size = 2,data=train) ## # weights: 1...

2526 sym R (26178 sym/27 pcs) 2 img
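
The 80/20 split and network fit described in the Module8 excerpt can be sketched as follows. This is a minimal sketch, assuming the nnet package is installed. The names pred and accuracy and the trace = FALSE setting are illustrative additions, not part of the original.

```r
library(nnet)  # provides nnet(); assumed to be installed

data(iris)
set.seed(123)

# 80% of the 150 rows for training, the remaining 20% for validation
samples <- sample(nrow(iris), nrow(iris) * 0.80)
train <- iris[samples, ]
test  <- iris[-samples, ]

# Single hidden layer with 2 units, as in the excerpt;
# trace = FALSE only silences the optimisation log
iris_nn <- nnet(Species ~ ., size = 2, data = train, trace = FALSE)

# Predicted class labels for the held-out rows (hypothetical names: pred, accuracy)
pred <- predict(iris_nn, test, type = "class")

# Confusion matrix and overall accuracy on the validation set
print(table(predicted = pred, actual = test$Species))
accuracy <- mean(pred == test$Species)
print(accuracy)
```

Comparing the confusion matrix across a few seeds is one way to see how much the result depends on the random split and the random initial weights.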

data609-mod7

16.05.2021

Exercise 1 Use the svm() algorithm of the e1071 package to carry out the support vector machine for the PlantGrowth data set. Then, discuss the number of support vectors/samples. data(PlantGrowth) x <- PlantGrowth$weight y <- PlantGrowth$group model_svm <- svm(group ~ ., data = PlantGrowth) summary(model_svm) ## ## Call: ## svm(formula = group...

1125 sym R (1716 sym/10 pcs)

Module6

03.05.2021

R Markdown Exercise 1 Use a data set such as PlantGrowth in R to calculate three different distance metrics and discuss the results. I used the iris dataset to calculate the Cartesian distance, the Jaccard distance and the Manhattan distance. calc_distance <- function(x,y){ sqrt(sum((x - y)^2)) } calc_distance(iris[1],iris[2]) dist(t(iris[1:2]),metho...

3197 sym R (10219 sym/44 pcs) 10 img
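
As a rough illustration of the distance metrics named in the Module6 excerpt, the sketch below computes the Cartesian (Euclidean) and Manhattan distances between two iris columns by hand and cross-checks them against base R's dist(). The helper names euclidean_distance and manhattan_distance are hypothetical, and the choice of Sepal.Length and Sepal.Width as the two vectors is an assumption based on the excerpt's use of the first two iris columns.

```r
# Hypothetical helpers: distances between two numeric vectors of equal length
euclidean_distance <- function(x, y) sqrt(sum((x - y)^2))
manhattan_distance <- function(x, y) sum(abs(x - y))

x <- iris$Sepal.Length
y <- iris$Sepal.Width

d_euc <- euclidean_distance(x, y)
d_man <- manhattan_distance(x, y)

# dist() measures distances between the rows of a matrix,
# so the two vectors are stacked as rows before the call
ref_euc <- as.numeric(dist(rbind(x, y), method = "euclidean"))
ref_man <- as.numeric(dist(rbind(x, y), method = "manhattan"))

print(c(euclidean = d_euc, manhattan = d_man))
print(c(euclidean = ref_euc, manhattan = ref_man))
```

The Manhattan distance is never smaller than the Euclidean distance for the same pair of vectors, which gives a quick sanity check on the output.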

Module5

19.04.2021

Example 1 Carry out the logistic regression (example 22 on page 94) in R using the data. We’ll start with initial numbers, a = 1 and b = 1. x <- c(0.1,0.5,1,1.5,2,2.5) y <- c(0,0,1,1,1,0) data <- rbind(x,y) X <- as.matrix(x) a <- 1 b <- 1 P_val <- function(x,y,a,b){ return (1/(1+exp(a+b*x))) } #hypothesis/prediction P <- matrix(0,1,ncol(da...

2058 sym R (4995 sym/32 pcs) 10 img
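
For comparison with the hand-written fit in the Module5 excerpt, the sketch below evaluates a logistic curve at the starting values a = 1 and b = 1 and then fits the same six points with base R's glm(). It assumes the conventional form 1 / (1 + exp(-(a + b x))), which may differ in sign convention from the excerpt's P_val. The names logistic, p0, fit, coefs and p_hat are illustrative.

```r
x <- c(0.1, 0.5, 1, 1.5, 2, 2.5)
y <- c(0, 0, 1, 1, 1, 0)

# Conventional logistic function (assumed sign convention)
logistic <- function(x, a, b) 1 / (1 + exp(-(a + b * x)))

# Predicted probabilities at the starting guess a = 1, b = 1
p0 <- logistic(x, a = 1, b = 1)

# Maximum-likelihood fit with glm() as a reference solution
fit <- glm(y ~ x, family = binomial)
coefs <- unname(coef(fit))  # coefs[1] is the intercept a, coefs[2] is the slope b

# Probabilities implied by the fitted coefficients
p_hat <- logistic(x, coefs[1], coefs[2])

print(round(p0, 3))
print(coefs)
print(round(p_hat, 3))
```

If an iterative scheme such as the one in the excerpt converges, its estimates of a and b should approach the glm() coefficients, up to the sign convention chosen for the exponent.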

Data624-Homework2Batch

12.07.2021

Chapter 6 Exercises Exercise 6 6.3. A chemical manufacturing process for a pharmaceutical product was discussed in Sect.1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield....

30492 sym R (33092 sym/121 pcs) 16 img

Data624-Chapter6

07.07.2021

Exercise 6 6.3. A chemical manufacturing process for a pharmaceutical product was discussed in Sect.1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield. Biological predictor...

5131 sym R (3914 sym/20 pcs) 6 img 1 tbl

Data624-Week4

06.07.2021

Exercise 7.2 7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \(y = 10\sin(\pi x_{1}x_{2}) + 20(x_{3} - 0.5)^2 + 10x_{4} + 5x_{5} + N(0,\sigma^2)\) where the x values are random variables uniformly distributed between [0, 1] (there are also 5...

10214 sym R (13654 sym/99 pcs) 6 img
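
The simulation equation quoted in the Data624-Week4 excerpt can be sketched directly in base R. This is a minimal sketch under stated assumptions: the sample size n = 200, the seed, and sigma = 1 are illustrative choices, and only the five predictors that appear in the equation are generated.

```r
set.seed(200)   # arbitrary seed, for reproducibility only
n <- 200        # assumed sample size
sigma <- 1      # assumed noise standard deviation

# Five predictors, each uniformly distributed on [0, 1]
x <- matrix(runif(n * 5), ncol = 5)
colnames(x) <- paste0("x", 1:5)

# Response from the nonlinear equation, plus Gaussian noise N(0, sigma^2)
y <- 10 * sin(pi * x[, 1] * x[, 2]) +
  20 * (x[, 3] - 0.5)^2 +
  10 * x[, 4] +
  5 * x[, 5] +
  rnorm(n, mean = 0, sd = sigma)

sim_data <- data.frame(x, y = y)
print(head(sim_data))
```

Because the first three predictors enter nonlinearly, a plain linear model fitted to sim_data would be expected to underfit, which is what makes the data set useful as a benchmark for nonlinear methods.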

Data624-Project1

25.06.2021

Getting Data First we will read the data into a dataframe from the xls. data <- read.xlsx2("/Users/devinteran/MSinDS/DATA624/DATA624/Project1/Data Set for Class.xls",sheetIndex = 1,colClasses = c('integer','character','double','double','double','double','double'),stringsAsFactors=FALSE) Exploratory Data Analysis Our data represents 5 different g...

5158 sym R (8721 sym/49 pcs) 10 img 2 tbl

Data622-Homework3

12.10.2021

Homework3 Devin Teran 10/4/2021 loan_data <- read.csv('https://raw.githubusercontent.com/devinteran/DATA622/main/Loan_approval.csv') Exploratory Data Analysis Our loan data has 13 columns, 8 of which are categorical a...

6068 sym R (9085 sym/34 pcs) 13 img 18 tbl