Publications by Jack Russo, Javern Wilson, Joseph Simone, Anthony Munoz, Paul Perez

D621 - Assignment 1 - Money Ball

02.03.2020

Overview In this homework assignment, we will explore, analyze and model a data set containing approximately 2200 records. This analysis attempts to predict the number of wins for the teams. Each record represents a professional baseball team from the years 1871 to 2006 inclusive. Each record has the performance of the team for the given year, wi...

11945 sym R (20782 sym/26 pcs) 37 img

D607 - Final Project - Database Access

09.12.2019

Database Package #install.packages("sqldf") library(sqldf) ## Loading required package: gsubfn ## Loading required package: proto ## Loading required package: RSQLite Connect to Database db <- dbConnect(SQLite(), dbname = "nyc_taxi_db.sqlite") List Tables in Database dbListTables(conn = db) ## [1] "fhv_daily" "green_daily" "v_taxi_d...
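A minimal sketch of the connect-and-list pattern from this excerpt, assuming the DBI and RSQLite packages are installed. The in-memory database and the table name `demo_daily` are stand-ins invented for illustration; they are not part of nyc_taxi_db.sqlite.

```r
library(DBI)
library(RSQLite)

# Assumption: an in-memory SQLite database replaces the on-disk file
db <- dbConnect(SQLite(), dbname = ":memory:")

# Hypothetical table, created only so there is something to list and query
dbWriteTable(db, "demo_daily", data.frame(day = 1:3, trips = c(10L, 12L, 9L)))

# List the tables visible through the connection
tables <- dbListTables(conn = db)

# Run a simple aggregate query against the hypothetical table
total <- dbGetQuery(db, "SELECT SUM(trips) AS total FROM demo_daily")$total

# Release the connection when finished
dbDisconnect(db)
```

Pointing `dbname` at a file path instead of `":memory:"` would open that file, or create it if it does not yet exist.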

149 sym R (532 sym/11 pcs)

D607 - Final Project - Database Management

09.12.2019

Database Package #install.packages("sqldf") library(sqldf) ## Loading required package: gsubfn ## Loading required package: proto ## Loading required package: RSQLite Connect to Database If a database has not been created, the below code will create it, otherwise a connection to the database is made. db <- dbConnect(SQLite(), dbname = "nyc_taxi...

1522 sym R (4935 sym/17 pcs)

D607 - Tidyverse

09.12.2019

Library ## -- Attaching packages ---------------------------------------------------------------------------------------------------------------------------- tidyverse 1.3.0 -- ## v ggplot2 3.2.1 v purrr 0.3.3 ## v tibble 2.1.3 v dplyr 0.8.3 ## v tidyr 1.0.0 v stringr 1.4.0 ## v readr 1.3.1 v forcats 0.4.0 ## -- Confli...

2001 sym R (7420 sym/26 pcs)

D606 - Final Project

11.12.2019

Data Preparation # load data #install.packages(c("psych", "ggplot2", "DT")) library(psych) library(ggplot2) library(DT) library(DATA606) df <- read.csv("https://raw.githubusercontent.com/ChefPaul/R/master/d606_Project/exam_results.csv", TRUE, ",") Research question Does a parent’s level of education contribute to their child’s test outcom...

5098 sym R (7264 sym/40 pcs) 7 img

Data 605 - Final Project

20.12.2020

library(tidyr) library(kableExtra) library(Amelia) library(Matrix) library(corrplot) library(MASS) Problem 1 Using R, generate a random variable X that has 10,000 random uniform numbers from 1 to N, where N can be any number of your choosing greater than or equal to 6. Then generate a random variable Y that has 10,000 random normal numbers ...
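A short sketch of the first step of Problem 1, drawing the uniform variable X in base R. The seed and the choice N = 6 are assumptions made for illustration (the problem only requires N to be at least 6). The parameters for Y are cut off in the excerpt, so Y is not generated here.

```r
# Assumption: fixed seed so the draw is reproducible
set.seed(605)

# Assumption: N = 6, the smallest value the problem statement allows
N <- 6

# 10,000 random uniform numbers from 1 to N
X <- runif(10000, min = 1, max = N)

# Quick sanity checks on the sample
x_range <- range(X)
x_mean <- mean(X)
```

For a uniform distribution on [1, N] the theoretical mean is (1 + N) / 2, so `x_mean` should land close to 3.5 with this choice of N.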

16014 sym R (50020 sym/70 pcs) 9 img

Data 624 - Homework 3

28.02.2021

Exercise 6.9 - 2 The plastics data set consists of the monthly sales (in thousands) of product A for a plastics manufacturer for five years. a. Plot the time series of sales of product A. Can you identify seasonal fluctuations and/or a trend-cycle? autoplot(plastics) The plastics dataset over the course of 5 years shows an upward trend. Additio...

2294 sym R (1318 sym/8 pcs) 8 img

Data 624 - Homework 2

22.02.2021

Exercise 3.7 - 1 For the following series, find an appropriate Box-Cox transformation in order to stabilise the variance. usnetelec usgdp mcopper enplanements usnetelec autoplot(usnetelec) lambda <- BoxCox.lambda(usnetelec) autoplot(BoxCox(usnetelec, lambda)) The lambda for the usnetelec Box-Cox is 0.5167714. usgdp autoplot(usgdp) lambda <...
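To make the transformation in this exercise concrete, here is a base-R sketch of the standard Box-Cox formula for positive data. The helper `box_cox` and the sample series are hypothetical and written for illustration only; this is not the forecast package's `BoxCox()` implementation. The lambda value is the one reported for usnetelec in the excerpt.

```r
# Hypothetical helper: standard Box-Cox transform for strictly positive values
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (lambda == 0) {
    log(x)                      # limiting case as lambda approaches 0
  } else {
    (x^lambda - 1) / lambda     # power transform for any other lambda
  }
}

# Assumption: a made-up positive series standing in for usnetelec
series <- c(120, 135, 160, 210, 290)

# Lambda reported in the excerpt for usnetelec
lambda <- 0.5167714

transformed <- box_cox(series, lambda)
```

A lambda near 0.5 behaves much like a square-root transform: it compresses larger values more than smaller ones, which is what stabilises a variance that grows with the level of the series.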

2378 sym R (2941 sym/27 pcs) 15 img

Data 624 - Homework 1

15.02.2021

Exercise 2.10 - 1 Use the help function to explore what the series gold, woolyrnq, and gas represent. The data in gold represents daily morning gold prices in US dollars from January 1st, 1985 - March 31st, 1989. The data in woolyrnq represents quarterly production of woolen yarn in Australia from March 1965 - September 1994. The data in gas repr...

7593 sym R (932 sym/44 pcs) 32 img

Data 624 - Project 1

12.04.2021

Part A - ATM Forecast Our goal for Part A of Project 1 is to forecast how much cash is taken out of four different ATMs for May 2010. The Excel file containing all of our data has three columns: DATE, ATM, and Cash. We have to explore the data and determine the best way to forecast, with little direction. Data Collection As ...

8671 sym R (12316 sym/120 pcs) 40 img