Publications by Salma Elshahawy, Mael Illien, Dhairav Chhatbar

DATA621 - Assignment 5

02.12.2020

Data 621 Homework 5 Introduction In this assignment, we are tasked with predicting the number of cases of wine sold to wine distribution companies following a sampling. The target variable, cases of wine sold, is count data and will therefore be modeled using techniques suited to counts, such as Poisson and negative binomial regression. data_train <- ...

10157 sym R (16841 sym/40 pcs) 8 img

DATA621 - Blog 2

09.12.2020

Logistic Regression In logistic regression, we fit a regression curve, \(y = f(x)\), where \(y\) is a categorical variable such as True/False or 0/1. The predictors can be continuous, categorical, or a mix of both. The underlying technique is the same as linear regression. There are 3 different types of Logistic Regression models: Binary Logisti...

3337 sym R (1448 sym/7 pcs) 1 img

DATA621 - Blog 4

09.12.2020

Analysis of covariance (ANCOVA) is a statistical method that accounts for third variables, called covariates, when investigating the relationship between an independent and a dependent variable. The covariate is continuous, is never the key independent variable, and is always observed. For example, there is a study that looked to estimate the...

2353 sym R (1194 sym/7 pcs)

DATA622: Machine Learning - Assignment 1

19.02.2021

Let’s use the Penguin dataset for our assignment. To learn more about the dataset, please visit: https://allisonhorst.github.io/palmerpenguins/articles/intro.html For this assignment, let us use ‘species’ as our outcome, or dependent, variable. Logistic Regression with Binary Outcome The penguin dataset has a ‘species’ column. Please ...

3075 sym R (10754 sym/25 pcs) 2 img 4 tbl

DATA622: Machine Learning - Assignment 2

08.03.2021

library(skimr) library(dplyr) library(ggplot2) library(tidyverse) library(palmerpenguins) require(foreign) require(nnet) require(ggplot2) require(reshape2) library(broom) library(gmodels) library(MASS) library(psych) library(caret) library(devtools) library(ggord) library(AppliedPredictiveModeling) library(klaR) library(e1071) ...

2704 sym R (4848 sym/26 pcs) 7 img 3 tbl

DATA 622 - Machine Learning: Classification using KNN, Decision Trees, Random Forests and Gradient Boosting

09.04.2021

Setup library(skimr) library(tidyverse) library(caret) # For featureplot, classification report library(corrplot) # For correlation matrix library(AppliedPredictiveModeling) library(mice) # For data imputation library(VIM) # For missing data visualization library(gridExtra) # For grid plots library(rpart) # For Decision Trees models libr...

24776 sym R (11986 sym/54 pcs) 25 img 6 tbl

DATA 622 Machine Learning - Image Classification

20.05.2021

Setup library(skimr) library(tidyverse) library(gridExtra) library(readr) library(dplyr) library(caret) library(naivebayes) library(factoextra) # For PCA plots library(e1071) library(Rtsne) library(RColorBrewer) library(gbm) library(randomForest) Data Description mnist_raw <- read_csv("https://pjreddie.com/media/files/mnist_train.csv...

11286 sym R (17944 sym/41 pcs) 14 img

DATA 622 Machine Learning - HW4: Clustering, PCA and SVM

07.05.2021

Setup library(skimr) library(tidyverse) library(caret) # For featureplot, classification report library(corrplot) # For correlation matrix and PCA contributionplots library(AppliedPredictiveModeling) library(mice) # For data imputation library(VIM) # For missing data visualization library(gridExtra) # For grid plots library(pROC) # For AU...

24798 sym R (18247 sym/50 pcs) 27 img 3 tbl

DATA 624 Predictive Analysis - Project 1: Time Series Forecasting

06.07.2021

Introduction Data Exploration & Preparation The data Read the Excel file and get insight into the data url<-"https://github.com/jnataky/Predictive_Analytics/raw/main/Project1/Data_Set_for_Class.xls" temp.file <- paste(tempfile(),".xls",sep = "") download.file(url, temp.file, mode = "wb") dataset <- read_excel(temp.file, sheet = 1) str(d...

11832 sym R (51703 sym/249 pcs) 120 img