Publications by Misha Kollontai

DATA624 HW3

10.02.2021

Exercise 6.2: The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? autoplot(plastics) + ggtitle('Sales of product A') + theme(plot.title = element_text(hjust = 0.5)) ...

2766 sym R (1412 sym/5 pcs) 5 img

DATA624 W1

10.02.2021

Exercise 2.1: Use the help function to explore what the series gold, woolyrnq and gas represent. Use autoplot() to plot each of these in separate plots. library(fpp2) ## Registered S3 method overwritten by 'quantmod': ## method from ## as.zoo.data.frame zoo ## -- Attaching packages ----------------------------------------------...

6496 sym R (2817 sym/34 pcs) 39 img

DATA624_HW2

10.02.2021

Exercise 3.1: For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance: usnetelec, usgdp, mcopper, enplanements. autoplot(usnetelec); elec_lab <- BoxCox.lambda(usnetelec); autoplot(BoxCox(usnetelec, elec_lab)) The change here is difficult to perceive and that isn’t very surprising since the original ...

3289 sym R (1933 sym/21 pcs) 14 img

DATA624 Homework 5

14.03.2021

Exercise 7.1: Consider the pigs series, the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. pigsdata <- pigs fc <- ses(pigsdata, h = 4) summary(fc) ## ## Forecast method: Simple exponential smoothing ## ##...

8301 sym R (11152 sym/39 pcs) 12 img

DATA624 Homework 4

07.03.2021

Exercise 3.1: The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. data(Glass) str(Glass) ## 'da...

5044 sym R (16854 sym/36 pcs) 7 img

DATA624 Homework 6

25.03.2021

Exercise 8.1: Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? All three of these images suggest that the data are white noise - there doesn’t appear to be a pattern in any of the ACF plots. The plots ...

11325 sym R (8889 sym/89 pcs) 39 img

DATA624 Project 1

10.04.2021

Part A - ATM Forecast In part A, I want you to forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business feeli...

9947 sym R (1212974 sym/92 pcs) 36 img

Homework 7

18.04.2021

Questions 6.2: Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: (a) Start R and use these commands to load the data: library(AppliedPredictiveModeling) ## Warni...

4671 sym R (10852 sym/46 pcs) 3 img

DATA624 Homework 8

25.04.2021

Questions 7.2: Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $$y = 10\sin(\pi x_{1}x_{2}) + 20(x_{3} - 0.5)^{2} + 10x_{4} + 5x_{5} + N(0, \sigma^{2})$$ where the x values are random variables uniformly distributed between [0, 1] (there are also 5...

2626 sym R (7807 sym/25 pcs) 3 img

DATA624 Homework 9

01.05.2021

Questions 8.1: Recreate the simulated data from Exercise 7.2: set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" (a) Fit a random forest model to all of the predictors, then estimate the variable importance ...

7062 sym R (12264 sym/47 pcs) 8 img