Publications by Tyler Brown
Technical Project 2
Introduction This is role playing. I am your new boss. I am in charge of production at ABC Beverage and you are a team of data scientists reporting to me. My leadership has told me that new regulations are requiring us to understand our manufacturing process, the predictive factors and be able to report to them our predictive model of PH. Please...
Random Forests
8.1 library(mlbench) set.seed(200) simulated = mlbench.friedman1(200, sd=1) simulated = cbind(simulated$x, simulated$y) simulated = as.data.frame(simulated) colnames(simulated)[ncol(simulated)] = "y" Fit a random forest model to all of the predictors, then estimate the variable importance scores: library(randomForest) ## randomForest 4.7-1...
Nonlinear Modeling
7.2 Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used a nonlinear equation to create data where the x values are random variables uniformly distributed on [0, 1] (5 other non-informative variables are also created in the simulation). The package mlbench contains a function ...
Linear Reg and its Cousins
Exercise 6.2 Developing a model to predict permeability could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: Start R and use these commands to load the data: #install.packages('AppliedPredictiveModeling') library(AppliedPr...
ATM + KWH Forecasts
ATM Forecast Forecast how much cash is taken out of 4 different ATM machines for May 2010. The data is given in a single file. The variable ‘Cash’ is provided in hundreds of dollars; other than that, it is straightforward. I am being somewhat ambiguous on purpose to make this have a little more business feeling. Explain and demonstrate your...
ARIMA Modeling
9.1) Explain the differences among these figures. Do they all indicate that the data are white noise? The differences in these figures are the bounded area size, the positioning of a spike with respect to lag, and the intensity of each spike. The size of the bounded area decreases as the number of random numbers in the time series increases, whi...
ETS Models
8.1) Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months. ## Series: Count ## Model: ETS(A,N,N) ## Smoothing paramet...
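As a rough illustration of what simple exponential smoothing does in the exercise above, the sketch below implements the level update and a flat forecast in plain Python. It is not the ETS() function and not the author's code: the function names are hypothetical, the initial level is simply fixed to the first observation, and α is picked by a coarse grid search on the sum of squared one-step errors rather than by the estimation ETS() performs.

```python
def ses_fit(y, alpha, level0):
    """Run simple exponential smoothing over y.

    Returns the final level and the sum of squared one-step-ahead errors.
    """
    level = level0
    sse = 0.0
    for obs in y:
        err = obs - level            # one-step-ahead forecast error
        sse += err * err
        level = level + alpha * err  # new level = old level + alpha * error
    return level, sse


def ses_forecast(y, alpha, level0, h):
    """SES forecasts are flat: every horizon equals the final level."""
    level, _ = ses_fit(y, alpha, level0)
    return [level] * h


def best_alpha(y, level0):
    """Assumption: a coarse grid search over alpha in 0.1, 0.2, ..., 1.0."""
    grid = [i / 10 for i in range(1, 11)]
    return min(grid, key=lambda a: ses_fit(y, a, level0)[1])
```

For example, `ses_forecast(series, best_alpha(series, series[0]), series[0], 4)` would give four identical forecasts, which matches the flat forecast profile of an ETS(A,N,N) model.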
Handling Data
3.1) The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. Using visualizations, explore the pre...
DATA624 Mean, Naive, SNaive, and Drift Methods
5.1) Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy) As the data trend shows an increase with no seasonal components, the drift method would be ideal for forecasting. Bricks (aus_production) Seasonal Naive is the most appropr...
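The three benchmark methods named in the exercise above can be sketched in a few lines of plain Python. This is only an illustration of the standard definitions, not the fable functions NAIVE(), SNAIVE(), or RW(); the function names and the season-length argument `m` are hypothetical.

```python
def naive_forecast(y, h):
    """Naive: every forecast equals the last observed value."""
    return [y[-1]] * h


def seasonal_naive_forecast(y, h, m):
    """Seasonal naive: repeat the last observed season of length m."""
    last_season = y[-m:]
    return [last_season[i % m] for i in range(h)]


def drift_forecast(y, h):
    """Drift: extend the line joining the first and last observations."""
    slope = (y[-1] - y[0]) / (len(y) - 1)  # average change per period
    return [y[-1] + slope * k for k in range(1, h + 1)]
```

A trending, non-seasonal series such as a population count suits `drift_forecast`, while a series with a strong quarterly pattern suits `seasonal_naive_forecast` with `m=4`.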
DATA624 Homework 2
3.1) Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time? ## Warning: Removed 3242 rows containing missing values (`geom_line()`). ## # A tsibble: 262 x 10 [1Y] ## # Key: Country [262] ## Country Code Year ...