Publications by Chun San Yip

Data624 Project2

21.05.2023

Project Requirements: This is role playing. I am your new boss. I am in charge of production at ABC Beverage and you are a team of data scientists reporting to me. My leadership has told me that new regulations are requiring us to understand our manufacturing process, the predictive factors and be able to report to them our predictive model of P...

3050 sym Python (29336 sym/73 pcs) 5 img 9 tbl

Data624 HW10

15.04.2023

Market Basket and Clusters: Imagine 10,000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt is a representation of stuff that went into a customer’s basket - and therefore ‘Market Basket Analysis’. That is exactly what the Groceries Data Set contains: a collection of receipts ...

1171 sym 3 img 1 tbl

Data624 HW9

10.04.2023

Chapter 8: Regression Trees and Rule-Based Models. Do problems 8.1, 8.2, 8.3, and 8.7 in Kuhn and Johnson. Ex 8.1: Recreate the simulated data from Exercise 7.2: library(AppliedPredictiveModeling); library(mlbench); library(caret); set.seed(200); simulated <- mlbench.friedman1(200, sd = 1); simulated <- cbind(simulated$x, simulated$y); simul...

5580 sym Python (6764 sym/20 pcs) 7 img

Data624 HW8

08.04.2023

In Kuhn and Johnson do problems 7.2 and 7.5. Ex 7.2: Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $y = 10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)$, where the x values are random variables uniformly distributed between [0...

2363 sym Python (7378 sym/17 pcs) 6 img

Data624 HW7

02.04.2023

In Kuhn and Johnson do problems 6.2 and 6.3. Ex 6.2 (Permeability): Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: (a) Start R and use these commands to load...

4420 sym Python (144472 sym/55 pcs) 11 img 2 tbl

Data624 Project1

31.03.2023

In part A, I want you to forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business feeling. Explain and dem...

5228 sym Python (9684 sym/59 pcs) 19 img

Data624 HW6

21.03.2023

Do exercises 9.1, 9.2, 9.3, 9.5, 9.6, 9.7, 9.8 in Hyndman. Ex 9.1: Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. a. Explain the differences among these figures. Do they all indicate that the data are white noise? A white noise series is stationary: it does not matter when you observe it, it sho...

5641 sym Python (9561 sym/70 pcs) 28 img

Data624 HW5

15.03.2023

Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman. Ex 8.1: Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months. b. Co...

4195 sym 9 img

Data624 HW4

12.03.2023

Do problems 3.1 and 3.2 in the Kuhn and Johnson book Applied Predictive Modeling. Ex 3.1: The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eig...

3253 sym Python (11339 sym/27 pcs) 8 img

Data624 HW3

08.03.2023

Do exercises 5.1, 5.2, 5.3, 5.4 and 5.7 in the Hyndman book. Ex 5.1: Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy); Bricks (aus_production); NSW Lambs (aus_livestock); Household wealth (hh_budget); Australian takeaway f...

2335 sym Python (5000 sym/30 pcs) 18 img