Publications by Mohammed Rahman

Assignment 2: Experimentation & Model Training

22.03.2025

Introduction: In machine learning, experimentation refers to the systematic process of designing, executing, and analyzing different configurations to identify the settings that perform best on a given task. Experimentation is learning by doing. It involves systematically changing parameters, evaluating results with metrics, and compari...

4542 sym 1 img

Assignment 1: Exploratory Data Analysis

04.03.2025

Introduction: The data set is derived from a Portuguese bank’s direct marketing campaigns, which were conducted by phone. Determining whether a customer had subscribed to the product (a bank term deposit) often required multiple contacts with the same client. The classification goal is to forecast wheth...

9830 sym 3 img

Assignment 1: Exploratory Data Analysis

03.03.2025

Background Introduction One of the most important challenges in banking institutions’ success has always been marketing to potential customers. Not surprisingly, banks typically use digital media, social media, customer service, and strategic alliances to connect with their clientele. However, how can banks more precisely sell to a particula...

10247 sym 6 img

Predicting the pH of Beverages

13.05.2024

Introduction: As a data scientist tasked with developing a predictive model for pH regulation in manufacturing processes, the overarching objective is to leverage data-driven insights to enhance operational efficiency, ensure regulatory compliance, and optimize product quality. By harnessing advanced analytics and machine learning techniques, w...

20129 sym Python (28837 sym/49 pcs) 4 img 3 tbl

Walmart Data Analysis

05.05.2024

Introduction: For the final project, I’m using Walmart sales data from Kaggle, originally scraped from the web, to perform data and statistical analysis. Data Source: Walmart Dataset. The data contains sales of different Walmart stores from 2010-02-05 to 2012-11-01. It has columns with store number, week of sales, sales for the given store, holi...

5357 sym R (5029 sym/18 pcs) 6 img 1 tbl

Project 4 - Document Classification

24.04.2024

Overview: It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, ...

1582 sym Python (6352 sym/18 pcs) 3 img

Data-624 Homework 9

14.04.2024

Exercise 8.1: Recreate the simulated data from Exercise 7.2: library(mlbench); set.seed(200); simulated <- mlbench.friedman1(200, sd = 1); simulated <- cbind(simulated$x, simulated$y); simulated <- as.data.frame(simulated); colnames(simulated)[ncol(simulated)] <- "y" (a): Fit a random forest model to all of the predictors, then estimate the variab...

6359 sym R (10186 sym/40 pcs) 2 img 2 tbl

Week 11: Recommender Systems

07.04.2024

Introduction: Pandora is a leading music and podcast discovery platform, providing a highly-personalized listening experience to approximately 70 million users each month with its proprietary Music Genome Project® and Podcast Genome Project® technology - whether at home or on the go - through its mobile app, the web, and integration with more...

4737 sym

Data-624 Homework 8

04.04.2024

Question 7.2: Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $y = 10 \sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)$ where the x values are random variables uniformly distributed on [0, 1] (there are also 5 othe...

2477 sym R (23504 sym/47 pcs) 6 img

Data-624 Homework 7

31.03.2024

Exercise 6.2: Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: Start R and use these commands to load the data: library(AppliedPredictiveModeling) data(permeab...

5095 sym R (10070 sym/29 pcs) 4 img