Publications by Mohammed Rahman

Predicting the pH of Beverages

13.05.2024

Introduction: As a data scientist tasked with developing a predictive model for pH regulation in manufacturing processes, the overarching objective is to leverage data-driven insights to enhance operational efficiency, ensure regulatory compliance, and optimize product quality. By harnessing advanced analytics and machine learning techniques, w...

20129 sym Python (28837 sym/49 pcs) 4 img 3 tbl

Walmart Data Analysis

05.05.2024

Introduction: For the final project, I’m using Walmart sales data from Kaggle, scraped from the web, to perform data and statistical analysis. Data Source: Walmart Dataset. The data contains sales of different Walmart stores from 2010-02-05 to 2012-11-01. It has columns with store number, week of sales, sales for the given store, holi...

5357 sym R (5029 sym/18 pcs) 6 img 1 tbl

Project 4 - Document Classification

24.04.2024

Overview: It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, ...

1582 sym Python (6352 sym/18 pcs) 3 img

Data-624 Homework 9

14.04.2024

Exercise 8.1: Recreate the simulated data from Exercise 7.2: library(mlbench) set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <-as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" (a): Fit a random forest model to all of the predictors, then estimate the variab...

6359 sym R (10186 sym/40 pcs) 2 img 2 tbl

Week 11: Recommender Systems

07.04.2024

Introduction: Pandora is a leading music and podcast discovery platform, providing a highly-personalized listening experience to approximately 70 million users each month with its proprietary Music Genome Project® and Podcast Genome Project® technology - whether at home or on the go - through its mobile app, the web, and integration with more...

4737 sym

Data-624 Homework 8

04.04.2024

Question 7.2: Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \(y = 10 \sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)\) where the x values are random variables uniformly distributed between [0, 1] (there are also 5 othe...

2477 sym R (23504 sym/47 pcs) 6 img

Data-624 Homework 7

31.03.2024

Exercise 6.2: Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: Start R and use these commands to load the data: library(AppliedPredictiveModeling) data(permeab...

5095 sym R (10070 sym/29 pcs) 4 img

Data 607 - TidyVerse Create

28.03.2024

Description: In this assignment, you’ll practice collaborating around a code project with GitHub. You could consider our collective work as building out a book of examples on how to use TidyVerse functions. GitHub repository: https://github.com/pkowalchuk/SPRING2024TIDYVERSE FiveThirtyEight.com datasets. Kaggle datasets. Your task here is to Cr...

1482 sym Python (7359 sym/13 pcs)

Data-624 Project 1

24.03.2024

Part A – ATM Forecast: We are asked to forecast how much cash is taken out of 4 different ATM machines for May 2010. We are given data in a single file with variable cash provided in hundreds of dollars. Explain and demonstrate your process, techniques used and not used, and your actual forecast. # Load the dataset data <- read_excel("ATM624Data...

2095 sym Python (8397 sym/53 pcs) 18 img

Data 607 - Week 9

24.03.2024

Introduction: I searched the NY Times API website and signed up for an account. I found the Books APIs and chose the “Best Sellers History” endpoint. I retrieved the history to see the last few best sellers. ex<-GET("https://api.nytimes.com/svc/books/v3/lists/best-sellers/history.json?api-key=mP5gHH5A5oHbVq6PHAd2pAdv0BlS6s12") cat(...

534 sym 1 tbl