Publications by Enid Roman

Data 624 Homework 10 Market Basket and Clusters

23.11.2024

Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction and lists the items that were purchased. A receipt is a record of what went into a customer’s basket, hence the name ‘Market Basket Analysis’. That is exactly what the Groceries Data Set contains: a collection of receipts with each line representin...

30875 sym 7 img 2 tbl

Data 698 - Master’s Research Project

19.11.2024

Predicting Migraine Types Using Symptom and Demographic Data: A Machine Learning Approach
##      Age Duration Frequency Location Character Intensity Nausea Vomit
##    <int>    <int>     <int>    <int>     <int>     <int>  <int> <int>
## 1:    30        1         5        1         1         2      1     0
## 2:    50        3         5 ...

95315 sym 27 img 8 tbl

Data 624 Homework 9 Chapter 8 Regression Trees and Rule-Based Models

17.11.2024

8.1. Recreate the simulated data from Exercise 7.2: (a) Fit a random forest model to all of the predictors, then estimate the variable importance scores:
       Overall
V1   8.6053659
V2   6.8312592
V3   0.7415349
V4   7.8833841
V5   2.2447503
V6   0.1360542
V7   0.0559509
V8  -0.0681958
V9   0.0031962
V10 -0.0547059
Did the random forest model significantl...

28042 sym R (75344 sym/19 pcs) 10 img 21 tbl

Data 624 Homework 8 Chapter 7 Non-Linear Regression

09.11.2024

7.2. Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)$, where the x values are random variables uniformly distributed between [0, 1] (there are also 5 other non-informative var...

13322 sym 4 img 1 tbl

Data 624 Homework 7 Chapter 6 Linear Regression and Its Cousins

03.11.2024

6.2. Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug. (a) Start R and use these commands to load the data:
##   permeability
## 1      12.5200
## 2 ...

38083 sym 12 img 3 tbl

Data 624 Predictive Analytics Project 1

28.10.2024

Part A – ATM Forecast
In Part A, I want you to forecast how much cash is taken out of 4 different ATMs for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business f...

100178 sym 19 img

Data 624 Homework 6 Chapter 9.11

20.10.2024

# Load required libraries
library(fpp3)
library(tsibble)
library(ggplot2)
library(tidyverse)
library(forecast)
#install.packages("latex2exp")
library(latex2exp)
#install.packages("imager")
library(imager)
1. Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. a. Explain the differences among t...

49809 sym R (18935 sym/75 pcs) 49 img

Data 624 Homework 5 Chapter 8.8

07.10.2024

# Load required libraries
library(fpp3)
library(tsibble)
library(ggplot2)
library(tidyverse)
library(forecast)
1. Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ...

29686 sym R (31604 sym/49 pcs) 13 img

Data 624 Homework 4 Chapter 3.1

29.09.2024

3.1. The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. The data can be accessed via: librar...

8301 sym R (14728 sym/23 pcs) 41 img

Data 624 Homework 3 Chapter 5.11

22.09.2024

# Load required libraries
library(fpp3)
library(tsibble)
library(ggplot2)
1. Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy)
# Filter the data for Australia
aus_data <- global_economy |> filter(Country == "Austr...

16373 sym R (12773 sym/30 pcs) 22 img