Publications by Group 4
DATA621 Assignment 2
Overview In this homework assignment, we will work through various classification metrics. Functions are written in R to carry out the calculations. We will also investigate functions in existing packages that produce equivalent results. Finally, we will create graphical output that can also be used to evaluate the output of classifica...
2924 sym R (4923 sym/33 pcs) 3 img 1 tbl
DATA608 Project Proposal
INTRODUCTION Around the world, terrorism has significant impacts on human rights, with devastating consequences. Terrorism can undermine or weaken governments and jeopardize national security, peace, and social and economic development. The Global Terrorism Database (GTD) defines terrorism as “The threatened or actual use of illegal fo...
3535 sym 1 tbl
DATA624 Homework 5
library(fpp2) library(tidyverse) 7.1 Consider the pigs series — the number of pigs slaughtered in Victoria each month. Use the ses() function in R to find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months. pigs_ses <- ses(pigs, h=4) pigs_ses[["model"]] ## Simple exponential smoothing ## ##...
4160 sym R (8475 sym/68 pcs) 9 img
DATA621 - Blog 2: Handling Missing Data-Imputation
Intro Missing values can be a problem when analyzing data. Most models exclude missing values, which can limit the amount of information available in the analysis. This is why we have to either remove the missing values, impute them, or model them. In this example, missing values will be imputed. library(tidyver...
1165 sym R (9733 sym/12 pcs) 2 img
DATA621 - Assignment 1
Overview In this homework assignment, we will explore, analyze and model a data set containing approximately 2200 records. This analysis attempts to predict the number of wins for the teams. Each record represents a professional baseball team from the years 1871 to 2006 inclusive. Each record has the performance of the team for the given year, wi...
12029 sym R (33210 sym/95 pcs) 37 img
DATA621 - Blog 1: Multicollinearity
What is Multicollinearity? Multicollinearity violates one of the assumptions of linear models. It occurs when two or more independent variables are highly correlated. It causes problems such as undermining the statistical significance of an independent variabl...
2329 sym R (19458 sym/21 pcs) 1 img
DATA624 Homework 3
library(tidyverse) library(fpp2) library(seasonal) 6.2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? autoplot(plastics) Yes. There seems to be an upward trend...
1446 sym R (1980 sym/10 pcs) 8 img
DATA624 Homework 2
library(tidyverse) library(fpp2) library(gridExtra) 3.1 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. usnetelec usgdp mcopper enplanements usnetelec lambda <- BoxCox.lambda(usnetelec) lambda ## [1] 0.5167714 original <- autoplot(usnetelec) + ggtitle("Original") transformed <- autop...
2005 sym R (3762 sym/35 pcs) 10 img
DATA624 Homework 1
library(fpp2) library(kableExtra) Exercise 2.1 Use the help function to explore what the series gold, woolyrnq and gas represent. #help('gold') # Daily morning gold prices in US dollars. 1 January 1985 – 31 March 1989. #help("woolyrnq") # Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994. #help('gas') # Austra...
6053 sym R (2108 sym/52 pcs) 32 img 1 tbl
DATA608 Assignment 1
library(tidyverse) library(psych) library(kableExtra) Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Dat...
1790 sym R (4661 sym/13 pcs) 3 img 2 tbl