Publications by Douglas Barley
Data 624 HA Ch 3
Chapter 3, Hyndman and Athanasopoulos Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has this changed over time? Remove rows where records have NA values in GDP, and plot the remaining records for all countries that recorded GDP over time. Upo...
9203 sym R (5830 sym/22 pcs) 21 img
Data 624 KJ Ch 8
Chapter 8, Kuhn and Johnson, Applied Predictive Modeling Exercise 8.1 Recreate the simulated data from Exercise 7.2: library(mlbench) set.seed(200) simulated <- mlbench.friedman1(200, sd = 1) simulated <- cbind(simulated$x, simulated$y) simulated <- as.data.frame(simulated) colnames(simulated)[ncol(simulated)] <- "y" 8.1.a Fit a random for...
10751 sym R (39160 sym/55 pcs) 7 img
Data 622 HW 4
Instructions You get to decide which dataset you want to work on. The data set must be different. You can work on a problem from your work, or something you are interested in. You may also obtain a dataset from sites such as Kaggle, Data.Gov, Census Bureau, USGS or another open data portal. Select one of the methodologies studied in weeks 1-10, a...
8946 sym R (16536 sym/32 pcs) 9 img 5 tbl
Data 624 Project 2 Technical Report
Project Premise This is role playing. I am your new boss. I am in charge of production at ABC Beverage and you are a team of data scientists reporting to me. My leadership has told me that new regulations are requiring us to understand our manufacturing process, the predictive factors and be able to report to them our predictive model of PH....
8956 sym R (12494 sym/24 pcs) 6 img 9 tbl
Data 624 KJ Ch 6
Chapter 6, Kuhn and Johnson, Applied Predictive Modeling 6.2. Developing a model to predict permeability (see Sect. 1.4) could save significant resources for a pharmaceutical company, while at the same time more rapidly identifying molecules that have a sufficient permeability to become a drug: 6.2.a Start R and use these commands to load th...
5404 sym R (40221 sym/39 pcs) 5 img
Data 624 KJ Ch 7
Chapter 7, Kuhn and Johnson, Applied Predictive Modeling Exercise 7.2 (a) Create the data Friedman (1991) introduced several benchmark data sets created by simulation. One of these simulations used the following nonlinear equation to create data: \(y = 10 \sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5 + N(0, \sigma^2)\) where the x value...
4072 sym R (18365 sym/46 pcs) 2 img
Data 624 Market Basket and Clusters
Market Basket and Clusters Imagine 10000 receipts sitting on your table. Each receipt represents a transaction with items that were purchased. The receipt is a representation of stuff that went into a customer’s basket - and therefore Market Basket Analysis. That is exactly what the Groceries Data Set contains: a collection of receipts with...
8527 sym R (9902 sym/31 pcs) 5 img