Publications by Junaid Ahmed Mohammed

Junaid SL-Lab8

07.04.2025

In this exercise, you will further analyze the Wage data set considered throughout this chapter. Perform polynomial regression to predict wage using age. Use cross-validation to select the optimal degree d for the polynomial. What degree was chosen, and how does this compare to the results of hypothesis testing using ANOVA? Make a plot of the...

9698 sym R (12944 sym/51 pcs) 18 img

Junaid SL-Lab7

01.04.2025

In this exercise, we will generate simulated data, and will then use this data to perform best subset selection. Use the rnorm() function to generate a predictor X of length n = 100, as well as a noise vector ϵ of length n = 100. Generate a response vector Y of length n = 100 according to the model Y = β0 + β1X + β2X² + β3X³ + ϵ, where β0, β1, β...

7800 sym 6 img
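
The simulation step described in the Junaid SL-Lab7 excerpt can be sketched in base R roughly as follows. This is a minimal illustration, not the lab's actual code: the seed, the coefficient values, and the names `beta0` through `beta3` and `sim_data` are assumptions made here, since the excerpt is cut off before any values are given. The best subset selection step is not shown.

```r
# Minimal sketch of the data-generation step from the Lab7 excerpt.
# The seed and all coefficient values below are assumptions for illustration.
set.seed(1)

n   <- 100
x   <- rnorm(n)   # predictor X of length n
eps <- rnorm(n)   # noise vector of length n

# Hypothetical coefficients (not taken from the lab)
beta0 <- 1
beta1 <- 2
beta2 <- -3
beta3 <- 0.5

# Response from the cubic model Y = beta0 + beta1*X + beta2*X^2 + beta3*X^3 + eps
y <- beta0 + beta1 * x + beta2 * x^2 + beta3 * x^3 + eps

# Collect predictor and response for later model fitting
sim_data <- data.frame(x = x, y = y)
```

Keeping `x` and `y` together in one data frame makes it straightforward to pass polynomial terms of `x` to a model-fitting function afterwards.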

Junaid_SL_lab6

24.03.2025

In Chapter 4, we used logistic regression to predict the probability of default using income and balance on the Default data set. We will now estimate the test error of this logistic regression model using the validation set approach. Do not forget to set a random seed before beginning your analysis. Fit a logistic regression model that uses i...

7207 sym R (6735 sym/42 pcs)

Junaid_SL_lab5

13.03.2025

# Load required libraries
library(MASS)
library(class)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibbl...

8619 sym R (10832 sym/65 pcs) 6 img

Junaid TakeHomeMidTerm

13.03.2025

Please hand this result in on Canvas no later than 11:59pm on Wednesday, March 12! Do not work in groups! Consider the data from R package . We will use linear regression to investigate the relationship between variables in this data set and estimated performance (variable ). Do not use published performance as a predictor of performance in th...

26404 sym R (24161 sym/63 pcs) 8 img

Statistical Learning Exercise 3

12.03.2025

5. We now examine the differences between LDA and QDA. (a) If the Bayes decision boundary is linear, do we expect LDA or QDA to perform better on the training set? On the test set? Answer: Training set: QDA will likely perform better because it is more flexible and can model complex patterns, even if they aren’t necessary for a linear decis...

7232 sym

Lab4

26.02.2025

8. Simple Linear Regression Analysis using the Auto Dataset (a) Fitting a Simple Linear Regression Model We begin by fitting a simple linear regression model to investigate the relationship between miles per gallon (mpg) and horsepower from the Auto dataset. We use lm() to fit the model and summary() to examine the results. lm_fit <- lm(mpg ~ h...

7521 sym 7 img

SL_Exercise2

25.02.2025

Question 1 Describe the null hypotheses to which the p-values given in Table 3.4 correspond. Explain what conclusions you can draw based on these p-values. Your explanation should be phrased in terms of sales, TV, radio, and newspaper, rather than in terms of the coefficients of the linear model. For intercept, the null hypothesis is \(\beta_0 ...

3830 sym R (714 sym/3 pcs)

Netflix Data Dive - Comprehensive Data Analysis

05.12.2024

Load the Netflix Dataset I’ll first load the Netflix dataset and preview it to understand its structure. # Load the Netflix dataset netflix_data <- read.csv("~/Netflix_dataset.csv") # Preview the dataset head(netflix_data) ## id title type ## 1 ts300399 Five Came Back: The Reference Films SHOW ## 2 ...

14990 sym R (38467 sym/59 pcs) 19 img

Week 13 | Model Critique and Analysis

27.11.2024

Goal 1: Business Scenario Customer or Audience: The target audience for this analysis is a real estate analytics firm operating in Ames, Iowa. The firm’s clients include: Homebuyers: Understand pricing dynamics across different neighborhoods to identify affordable homes of high quality. Investors: Make data-driven decisions on which propert...

11891 sym R (15989 sym/35 pcs) 8 img