Publications by Devin Teran
Data621-Homework5
Overview In this assignment, we will explore, analyze and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The response variable is the number of sample cases of wine that were purchased by wine distribution companies a...
7620 sym R (22326 sym/21 pcs) 6 img 2 tbl
Module8
Exercise 1 Use the nnet package to analyze the iris data set. Use 80% of the 150 samples as the training data and the rest for validation. Discuss the results. data(iris) set.seed(123) samples <- sample(nrow(iris), nrow(iris)*0.80) train <- iris[samples,] test <- iris[-samples,] iris_nn <- nnet(Species ~ ., size = 2,data=train) ## # weights: 1...
2526 sym R (26178 sym/27 pcs) 2 img
data609-mod7
Exercise 1 Use the svm() algorithm of the e1071 package to carry out the support vector machine for the PlantGrowth data set. Then, discuss the number of support vectors/samples. data(PlantGrowth) x <- PlantGrowth$weight y <- PlantGrowth$group model_svm <- svm(group ~ ., data = PlantGrowth) summary(model_svm) ## ## Call: ## svm(formula = group...
1125 sym R (1716 sym/10 pcs)
Module6
R Markdown Exercise 1 Use a data set such as the PlantGrowth in R to calculate three different distance metrics and discuss the results. I used the iris dataset to calculate the Cartesian (Euclidean) distance, the Jaccard distance and the Manhattan distance. calc_distance <- function(x,y){ sqrt(sum((x - y)^2)) } calc_distance(iris[1],iris[2]) dist(t(iris[1:2]),metho...
3197 sym R (10219 sym/44 pcs) 10 img
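The Euclidean and Manhattan distances named in the Module6 excerpt above can be sketched in base R as follows. This is a minimal illustration, not the author's code: the helper names `euclidean_distance` and `manhattan_distance` are hypothetical, and the choice of the two `iris` columns is an assumption made only for the example. The Jaccard distance is left out because it applies to binary or set data rather than to these continuous measurements.

```r
# Hypothetical helpers for illustration; these are not taken from the original post.
euclidean_distance <- function(x, y) sqrt(sum((x - y)^2))
manhattan_distance <- function(x, y) sum(abs(x - y))

# Assumption: compare two numeric columns of the built-in iris data as vectors.
x <- iris$Sepal.Length
y <- iris$Sepal.Width

d_euclid <- euclidean_distance(x, y)
d_manhat <- manhattan_distance(x, y)

# Cross-check against base R's dist(), which measures distances between rows,
# so the two vectors are stacked as the rows of a matrix.
m <- rbind(x, y)
ref_euclid <- as.numeric(dist(m, method = "euclidean"))
ref_manhat <- as.numeric(dist(m, method = "manhattan"))
```

Because `dist()` works row-wise, stacking the vectors with `rbind()` plays the same role as the `t()` call in the excerpt.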
Module5
Example 1 Carry out the logistic regression (example 22 on page 94) in R using the data. We’ll start with initial numbers, a = 1 and b = 1. x <- c(0.1,0.5,1,1.5,2,2.5) y <- c(0,0,1,1,1,0) data <- rbind(x,y) X <- as.matrix(x) a <- 1 b <- 1 P_val <- function(x,y,a,b){ return (1/(1+exp(a+b*x))) } #hypothesis/prediction P <- matrix(0,1,ncol(da...
2058 sym R (4995 sym/32 pcs) 10 img
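The prediction step in the Module5 excerpt above can be sketched as below, using the same data and the same starting values a = 1 and b = 1. This is a hedged sketch rather than the author's implementation: the function name `p_hat` is hypothetical, and the log-likelihood is a common objective for logistic regression that is assumed here, since the excerpt is cut off before any objective appears. The sign convention `1/(1 + exp(a + b*x))` is copied from the excerpt; many references write the exponent with a minus sign instead.

```r
# Data and starting values as given in the excerpt.
x <- c(0.1, 0.5, 1, 1.5, 2, 2.5)
y <- c(0, 0, 1, 1, 1, 0)
a <- 1
b <- 1

# Hypothetical helper name; the formula follows the excerpt's sign convention.
p_hat <- function(x, a, b) 1 / (1 + exp(a + b * x))

# Predicted probabilities at the starting values.
p <- p_hat(x, a, b)

# Assumed objective: the Bernoulli log-likelihood of the labels under p.
log_lik <- sum(y * log(p) + (1 - y) * log(1 - p))

# Cross-check with base R: plogis(q) is 1 / (1 + exp(-q)).
p_check <- plogis(-(a + b * x))
```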
Data624-Homework2Batch
Chapter 6 Exercises Exercise 6 6.3. A chemical manufacturing process for a pharmaceutical product was discussed in Sect.1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield....
30492 sym R (33092 sym/121 pcs) 16 img
Data624-Chapter6
Exercise 6 6.3. A chemical manufacturing process for a pharmaceutical product was discussed in Sect.1.4. In this problem, the objective is to understand the relationship between biological measurements of the raw materials (predictors), measurements of the manufacturing process (predictors), and the response of product yield. Biological predictor...
5131 sym R (3914 sym/20 pcs) 6 img 1 tbl
Data624-Week4
Exercise 7.2 7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \(y = 10\sin(\pi x_{1}x_{2}) + 20(x_{3} - 0.5)^2 + 10x_{4} + 5x_{5} + N(0,\sigma^2)\) where the x values are random variables uniformly distributed between [0, 1] (there are also 5...
10214 sym R (13654 sym/99 pcs) 6 img
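The simulation equation quoted in the Data624-Week4 excerpt above can be sketched in base R as follows. The sample size `n`, the noise level `sigma`, and the seed are assumptions chosen only for illustration, and only the five predictors that appear in the equation are generated, because the excerpt is truncated before it describes any others.

```r
set.seed(1)   # assumed seed, for reproducibility only
n <- 200      # assumed sample size
sigma <- 1    # assumed noise standard deviation

# Five predictors, each uniformly distributed on [0, 1].
x <- matrix(runif(n * 5), nrow = n, ncol = 5)
colnames(x) <- paste0("x", 1:5)

# Noise-free part of the response, term by term as in the equation.
signal <- 10 * sin(pi * x[, "x1"] * x[, "x2"]) +
  20 * (x[, "x3"] - 0.5)^2 +
  10 * x[, "x4"] +
  5 * x[, "x5"]

# Add N(0, sigma^2) noise to obtain the simulated response.
y <- signal + rnorm(n, mean = 0, sd = sigma)

sim <- data.frame(x, y = y)
```

The resulting `sim` data frame has one row per simulated sample and can be passed to any regression function that accepts a formula such as `y ~ .`.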
Data624-Project1
Getting Data First we will read the data into a dataframe from the xls. data <- read.xlsx2("/Users/devinteran/MSinDS/DATA624/DATA624/Project1/Data Set for Class.xls",sheetIndex = 1,colClasses = c('integer','character','double','double','double','double','double'),stringsAsFactors=FALSE) Exploratory Data Analysis Our data represents 5 different g...
5158 sym R (8721 sym/49 pcs) 10 img 2 tbl
Data622-Homework3
Homework3 Exploratory Data Analysis Missing Data Data Setup for Modeling Decision Trees Devin Teran 10/4/2021 loan_data <- read.csv('https://raw.githubusercontent.com/devinteran/DATA622/main/Loan_approval.csv') Exploratory Data Analysis Our loan data has 13 columns, 8 of which are categorical a...
6068 sym R (9085 sym/34 pcs) 13 img 18 tbl