Publications by Misha Kollontai
DATA624 HW3
Exercise 6.2: The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? autoplot(plastics) + ggtitle('Sales of product A') + theme(plot.title = element_text(hjust = 0.5)) ...
2766 sym R (1412 sym/5 pcs) 5 img
DATA624 W1
Exercise 2.1: Use the help function to explore what the series gold, woolyrnq and gas represent. Use autoplot() to plot each of these in separate plots. library(fpp2) ...
6496 sym R (2817 sym/34 pcs) 39 img
DATA624_HW2
Exercise 3.1: For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance: usnetelec, usgdp, mcopper, enplanements. autoplot(usnetelec); elec_lab <- BoxCox.lambda(usnetelec); autoplot(BoxCox(usnetelec, elec_lab)) The change here is difficult to perceive and that isn’t very surprising since the original ...
3289 sym R (1933 sym/21 pcs) 14 img
DATA624 Homework 5
Exercise 7.1: Consider the pigs series, the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. pigsdata <- pigs; fc <- ses(pigsdata, h = 4); summary(fc) ## ## Forecast method: Simple exponential smoothing ## ##...
8301 sym R (11152 sym/39 pcs) 12 img
DATA624 Homework 4
Exercise 3.1: The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. data(Glass); str(Glass) ## 'da...
5044 sym R (16854 sym/36 pcs) 7 img
DATA624 Homework 6
Exercise 8.1: Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? All three of these images suggest that the data are white noise - there doesn’t appear to be a pattern in any of the ACF plots. The plots ...
11325 sym R (8889 sym/89 pcs) 39 img
DATA624 Project 1
Part A - ATM Forecast In part A, I want you to forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business feeli...
9947 sym R (1212974 sym/92 pcs) 36 img
Homework 7
Questions 6.2: Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: (a) Start R and use these commands to load the data: library(AppliedPredictiveModeling) ## Warni...
4671 sym R (10852 sym/46 pcs) 3 img
DATA624 Homework 8
Questions 7.2: Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: $$y = 10\sin(\pi x_{1}x_{2}) + 20(x_{3} - 0.5)^{2} + 10x_{4} + 5x_{5} + N(0, \sigma^{2})$$ where the x values are random variables uniformly distributed between [0, 1] (there are also 5...
2626 sym R (7807 sym/25 pcs) 3 img
DATA624 Homework 9
Questions 8.1: Recreate the simulated data from Exercise 7.2: set.seed(200); simulated <- mlbench.friedman1(200, sd = 1); simulated <- cbind(simulated$x, simulated$y); simulated <- as.data.frame(simulated); colnames(simulated)[ncol(simulated)] <- "y" (a) Fit a random forest model to all of the predictors, then estimate the variable importance ...
7062 sym R (12264 sym/47 pcs) 8 img