Publications by Douglas Barley
DATA 607 Data Project
Kiva loans Part 1 - Introduction and the big questions Kiva.org is a crowdfunding organization that helps people around the world by providing loans for a variety of purposes ranging across areas such as education, health, women, the arts, technology and business startups. The loans are disbursed directly to the borrowers, and many of the loans ...
15338 sym R (58238 sym/71 pcs) 14 img
DATA607 - Week 10 Assignment
Re-create the base analysis The assignment is to re-create the R code from Chapter 2 of the textbook: Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. O’Reilly. https://www.tidytextmining.com/sentiment.html. Here is a re-creation of that code. The sentiments dataset Import three sentiment lexicons. The first is the ...
4736 sym R (12569 sym/53 pcs) 7 img
DATA605 - Final Project
Part 1 Using R, generate a random variable X that has 10,000 random uniform numbers from 1 to N, where N can be any number of your choosing greater than or equal to 6. Then generate a random variable Y that has 10,000 random normal numbers with a mean of \(\mu = \sigma = (N+1)/2\). set.seed(5432) n <- 10000 N <- 10 sigma <- (N + 1)/2 df <- ...
8986 sym R (31285 sym/76 pcs) 12 img
DATA 608 - Module 1
Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header = TRUE) And let's preview this...
1475 sym R (8576 sym/18 pcs) 4 img
Data 624 HA Ch 1-2
Chapters 1 and 2, Hyndman and Athanasopoulos Use the help function to explore what the series gafa_stock, PBS, vic_elec and pelt represent. The help function (?) displays info about each series in the Help window. The descriptions for each series are copied into the comments below. ?gafa_stock # Historical stock prices from 2014-2018 for Googl...
4751 sym R (9584 sym/37 pcs) 13 img
DATA 624 Project 1
Project 1 Part A – ATM Forecast Forecast how much cash is taken out of 4 different ATM machines for May 2010. The variable Cash is provided in hundreds of dollars; other than that, it is straightforward. Explain and demonstrate your process, techniques used and not used, and your actual forecast. First examine the data. It is a tibble with...
15697 sym R (16736 sym/78 pcs) 38 img
Data 624 HA Ch 9
Chapter 9, Hyndman and Athanasopoulos Q 9.1 9.1. Figure 9.32 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers. Explain the differences among these figures. Do they all indicate that the data are white noise? The primary difference among these three figures is the number of values in the series, as describe...
12526 sym R (11971 sym/76 pcs) 36 img
Data 624 HA Ch 8
Chapter 8, Hyndman and Athanasopoulos Q 8.1 3.1. Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. Fir...
10270 sym R (10426 sym/58 pcs) 15 img
Data 624 KJ Ch 3-4
Chapters 3-4, Kuhn and Johnson, Applied Predictive Modeling Q 3.1 3.1. The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, ...
7185 sym R (15874 sym/55 pcs) 9 img
Data 624 HA Ch 5
Chapter 5, Hyndman and Athanasopoulos Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy) Looking at the data there is clearly a linear trend, so the drift model would show the continuation of the trend that is inherent in t...
7490 sym R (7381 sym/41 pcs) 26 img