Publications by Salma Elshahawy, Mael Illien, Dhairav Chhatbar
DATA621 - Assignment 5
Data 621 Homework 5 Introduction In this assignment, we are tasked with predicting the number of cases of wine sold to wine distribution companies following a sampling. The target variable, cases of wine sold, is count data and will therefore be modeled using techniques suited to counts, such as Poisson and Negative Binomial regression. data_train <- ...
10157 sym R (16841 sym/40 pcs) 8 img
DATA621 - Blog 2
Logistic Regression In logistic regression, we fit a regression curve, \(y = f(x)\), where \(y\) is a categorical variable such as True/False or 0/1. The predictors can be continuous, categorical, or a mix of both. The underlying technique is similar to linear regression. There are 3 different types of logistic regression models: Binary Logisti...
3337 sym R (1448 sym/7 pcs) 1 img
DATA621 - Blog 4
Analysis of covariance (ANCOVA) is a statistical method that accounts for third variables, called covariates, when investigating the relationship between an independent and a dependent variable. The covariate is continuous, never the key independent variable, and always observed. For example, there is a study that looked to estimate the...
2353 sym R (1194 sym/7 pcs)
DATA622: Machine Learning - Assignment 1
Let’s use the Penguin dataset for our assignment. To learn more about the dataset, please visit: https://allisonhorst.github.io/palmerpenguins/articles/intro.html For this assignment, let us use ‘species’ as our outcome, or dependent variable. Logistic Regression with Binary Outcome The penguin dataset has a ‘species’ column. Please ...
3075 sym R (10754 sym/25 pcs) 2 img 4 tbl
DATA622: Machine Learning - Assignment 2
library(skimr) library(dplyr) library(ggplot2) library(tidyverse) library(palmerpenguins) require(foreign) require(nnet) require(ggplot2) require(reshape2) library(broom) library(gmodels) library(MASS) library(psych) library(caret) library(devtools) library(ggord) library(AppliedPredictiveModeling) library(klaR) library(e1071) ...
2704 sym R (4848 sym/26 pcs) 7 img 3 tbl
DATA 622 - Machine Learning: Classification using KNN, Decision Trees, Random Forests and Gradient Boosting
Setup library(skimr) library(tidyverse) library(caret) # For featureplot, classification report library(corrplot) # For correlation matrix library(AppliedPredictiveModeling) library(mice) # For data imputation library(VIM) # For missing data visualization library(gridExtra) # For grid plots library(rpart) # For Decision Trees models libr...
24776 sym R (11986 sym/54 pcs) 25 img 6 tbl
DATA 622 Machine Learning - Image Classification
Setup library(skimr) library(tidyverse) library(gridExtra) library(readr) library(dplyr) library(caret) library(naivebayes) library(factoextra) # For PCA plots library(e1071) library(Rtsne) library(RColorBrewer) library(gbm) library(randomForest) Data Description mnist_raw <- read_csv("https://pjreddie.com/media/files/mnist_train.csv...
11286 sym R (17944 sym/41 pcs) 14 img
DATA 622 Machine Learning - HW4: Clustering, PCA and SVM
Setup library(skimr) library(tidyverse) library(caret) # For featureplot, classification report library(corrplot) # For correlation matrix and PCA contributionplots library(AppliedPredictiveModeling) library(mice) # For data imputation library(VIM) # For missing data visualization library(gridExtra) # For grid plots library(pROC) # For AU...
24798 sym R (18247 sym/50 pcs) 27 img 3 tbl
DATA 624 Predictive Analysis - Project 1: Time Series Forecasting
Introduction Data Exploration & Preparation The data Read the Excel file and get insight into the data url <- "https://github.com/jnataky/Predictive_Analytics/raw/main/Project1/Data_Set_for_Class.xls" temp.file <- paste(tempfile(), ".xls", sep = "") download.file(url, temp.file, mode = "wb") dataset <- read_excel(temp.file, sheet = 1) str(d...
11832 sym R (51703 sym/249 pcs) 120 img