Publications by Jamal Rogers

Data Transformation in Preprocessing

20.09.2024

This markdown document presents common data transformation techniques in preprocessing as a requirement for PhD ITI 624, Application of Learning Analytics in Instructional Design. The Educational Data Set source: https://www.kaggle.com/datasets/spscientist/students-performance-in-exams ## gender race.ethnicity parental.level.of.education l...

3876 sym

Comparing Prophet and ARIMA Regression on the USM Heat Index Data

29.10.2023

Required R Libraries: Tidymodels package, Tidyverse package, Modeltime, Timetk. library(tidymodels); library(tidyverse); library(modeltime); library(timetk). The USM Heat Index Data: Data was recorded from October 06, 2020, at 9:00 PM until April 26, 2021, at 2:00 PM at 15-minute intervals. The data contains heat index (Heat), Date and Time column...

613 sym R (1991 sym/11 pcs)

Machine Learning Model for Classifying Poultry Diseases

15.09.2023

Table of contents: Data Gathering; The Data Set; Data Cleaning; Exploratory Data Analysis; Disease Distribution; Cycle Day versus Mortality; Harmful Gases versus Mortality; Microclimatic Parameters versus Mortality; Air Quality Parameters versus Mortality; Predictive Modeling using Machine Learning; Modeling Results; Training Workflow; Models Generated (...

4210 sym 13 img

Discriminant Function Analysis

14.09.2023

Table of contents: What is DFA?; Introduction: DFA as a Classifier; Use Case; Classify based on CRP; Classify based on Temp; Classify based on CRP and Temp; The Discriminant Function; Definition; Widely Used DFA Methods; The Palmer Penguins Data Set; The Data Set; Exploratory Data Analysis; Removing unwanted variables; Supervised Machine Learning Framework ...

4627 sym 13 img

Applied Data Science: Module 3 Lesson 2 Abstraction 2

08.09.2023

Table of contents: Optimizing Models via Tuning Parameters; Optimizing the hash features; Boosted Trees; Optimizing tuning parameters; Grid Search; Use the tune_*() functions to tune models; Early stopping for boosted trees. Hyperparameter Tuning. Author: Jamal Rogers. Published: September 9, 2023. Previously - Setup: library(tidymodels) library(modeld...

5327 sym 19 img

Applied Data Science: Module 3 Lesson 2 Application

08.09.2023

Table of contents: The data set; Data Budget; Recipe; Cross Validation; Model Specification; Workflow; Parallel Processing; Create Grid and Tune; Select Best Model; Finalized Workflow; Last Fit and Test; Confusion Matrix. Module 3 Lesson 2 Application. Author: Jamal Rogers. Published: September 8, 2023. Let’s load the tidyverse, tidymodels, and palmerpengu...

666 sym 1 img

Applied Data Science: Module 3 Lesson 1 Application

17.08.2023

Table of contents: The data set; Data Budget; Cross Validation; Model Specification; Workflow; Fit Resamples and collect metrics; Final Fit and Performance; Confusion Matrix. Module 3 Lesson 1 Application. Author: Jamal Rogers. Published: August 17, 2023. Let’s load the tidyverse, tidymodels, and palmerpenguins packages to begin. library(tidyverse) l...

551 sym 1 img

Applied Data Science: Module 3 Lesson 2 Abstraction 1

17.08.2023

Table of contents: What is Feature Engineering?; Data Splitting Strategy; Data Spending; Resampling Strategy; Prepare your data for modeling; A first recipe; Create indicator variables; Filter out constant columns; Normalization; Reduce correlation; Other possible steps; Minimal recipe; Measuring Performance; Using a workflow; Holdout predictions; Calibration Plo...

4665 sym 13 img

Applied Data Science: Module 3 Lesson 1 Abstraction 4

17.08.2023

Table of contents: Looking at predictions; Confusion Matrix; Metrics for model performance; How can we use the training data to compare and evaluate different models?; The whole game - status update; Random Forest 🌳🌲🌴🌵; The whole game - status update; The final fit; The whole game. Evaluating Models. Author: Jamal Rogers. Published: August 17,...

2706 sym 26 img

Applied Data Science: Module 3 Lesson 1 Abstraction 2

17.08.2023

Table of contents: Data Splitting and Spending; The initial split; Accessing the data; The training and test sets; The whole game - status update. Your Data Budget. Author: Jamal Rogers. Published: August 17, 2023. Data Splitting and Spending: For machine learning, we typically split data into training and test sets: The training set is used to estima...

1204 sym 8 img