Publications by Adam Douglas
DATA 624 - Homework 1
1. Use the help function to explore what the series gold, woolyrnq and gas represent. The gold series describes gold prices (in US dollars) on a daily basis, from January 1, 1985 through March 31, 1989. The woolyrnq series shows production of woolen yarn (in tonnes) in Australia. The data is gathered on a quarterly basis from Mar 1965 through Se...
5736 sym R (5535 sym/44 pcs) 40 img
DATA 624 - Homework 2
1. For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance: usnetelec, usgdp, mcopper, enplanements. Dataset: usnetelec autoplot(usnetelec) + labs(title="Annual US Net Electricity Generation", subtitle = "Non-Transformed", y="KWh", x="Year") + scale_y_continuous(labe...
2461 sym R (4680 sym/25 pcs) 16 img
DATA 624 - Homework 5
7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month. a) Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. First, we will look at the series: # Plot series autoplot(pigs) + scale_y_continuous(labels=comma) + labs(title="Month...
10258 sym R (12948 sym/40 pcs) 21 img
DATA 624 - Homework 4
3.1 The UC Irvine Machine Learning Repository contains a dataset related to glass identification. The data consist of 214 glass samples labeled as one of several categories. There are nine predictors, including the refractive index and percentage of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. a) Using visualizations, explore the predictor...
4800 sym R (7787 sym/39 pcs) 30 img
DATA 624 - Homework 3
2. The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. a) Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? autoplot(plastics) + scale_y_continuous(labels=comma) + labs(title='Monthly Sales of Plastic Product "A"', ...
2174 sym R (1931 sym/11 pcs) 9 img
DATA 622 - LDA Covariance
Linear Discriminant Analysis One of the assumptions of LDA is that the predictors are Gaussian (normal) and come from a multivariate distribution with a common covariance matrix. This is an attempt to test the impact of dissimilar covariance across the classes on the overall accuracy of predictions. # A function to generate 2 random Gaussian indep...
643 sym R (2773 sym/6 pcs) 4 img
DATA 624 - Project 1
ATM Withdrawal Forecasting Problem Statement Given historical data of daily ATM withdrawals from 4 different machines, forecast the withdrawal amounts for the next month (May 2010). Data Import and Cleansing Import First, we load the data. The given data file was converted from Excel to CSV to avoid trouble with how Microsoft stores dat...
20499 sym R (23962 sym/107 pcs) 40 img
DATA 624 - Project 2 - Final
Data We start our analysis by importing the training and evaluation set. Manual one hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable brand_code_a is implicit when the preceding three variables are equal to 0. train <- read_excel("StudentDat...
15581 sym R (8541 sym/23 pcs) 21 img
DATA 624 - Project 2 - Draft 3
DATA We start our analysis by importing the training and evaluation set. Manual one hot encoding is also applied to the Brand Code variable. This results in three new variables: brand_code_b, brand_code_c and brand_code_d. A fourth variable brand_code_a is implicit when the preceding three variables are equal to 0. train <- read_excel("StudentDat...
14465 sym R (8541 sym/23 pcs) 21 img
DATA 622 - HW1
Investigate The Data Once the data is loaded, we can look at its structure and see which method we feel may best be used to classify it: summary(data) ## x y label ## Min. : 5 a:6 BLACK:22 ## 1st Qu.:19 b:6 BLUE :14 ## Median :43 c:6 ## Mean :38 d:6 ## 3rd Qu.:55 e:6 ...
6300 sym R (5655 sym/11 pcs) 1 img