Publications by John Mazon

DATA 622 001[30417] Homework 1

17.03.2022

DATA622 Homework 1. As the quiz that was part of the original content was discarded, here’s a new assignment: Visit the following website and explore the range of sizes of this dataset (from 100 to 5 million records): https://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/ Based on your computer’s capabilities (memo...

12234 sym R (27094 sym/24 pcs) 3 img

Week 4 Forecasting |21-Feb 27-Feb

28.02.2022

Week 4 Forecasting |21-Feb 27-Feb. Do exercises 5.1, 5.2, 5.3, 5.4 and 5.7 in the Hyndman book. 5.1 Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case: Australian Population (global_economy). From below we can see a clear indication that the population is increasing at a const...

4830 sym R (10905 sym/64 pcs) 24 img

DATA 624 - Homework #1

14.02.2022

DATA 624 - Homework #1. Please submit exercises 2.1, 2.2, 2.3, 2.4, 2.5 and 2.8 from the Hyndman online Forecasting book. Please submit both your Rpubs link as well as attach the .rmd file with your code. Question 2.1 Use the help function to explore what the series gafa_stock, PBS, vic_elec and pelt represent. a. Use autoplot() to plot some of t...

3551 sym R (8408 sym/49 pcs) 7 img

Week 3 Decomposition

21.02.2022

Week 3 Decomposition |14-Feb 20-Feb Do exercises 3.1, 3.2, 3.3, 3.4, 3.5, 3.7, 3.8 and 3.9 from the online Hyndman book. Please include your Rpubs link along with your .rmd file. QUESTIONS 3.1 Consider the GDP information in global_economy. Plot the GDP per capita for each country over time. Which country has the highest GDP per capita? How has ...

3773 sym R (8768 sym/36 pcs) 17 img

DATA698 Project Proposal

21.02.2022

DATA 698 - Project Proposal. Overview and Motivation: My DATA 698 Project will be based on NYC Crime. I wanted to know how violent the city is, whether some neighborhoods or types of people are most affected, and whether I could identify patterns in shootings over time as well as compare them to other crimes. My primary goal is to take the tools and resources l...

5546 sym 1 img

Week 5 Data Preprocessing/Overfitting |28-Feb 6-Mar

07.03.2022

3.1. The UC Irvine Machine Learning Repository contains a data set related to glass identification. The data consist of 214 glass samples labeled as one of seven class categories. There are nine predictors, including the refractive index and percentages of eight elements: Na, Mg, Al, Si, K, Ca, Ba, and Fe. The data can be accessed via: library(m...

8035 sym R (9149 sym/16 pcs) 10 img 1 tbl

Week 6 Exponential Smoothing |7-Mar 13-Mar

13.03.2022

Week 6 Exponential Smoothing |7-Mar 13-Mar. Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman. Please submit both the link to your Rpubs and the .rmd file. 8.1 Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset. Use the ETS() function to estimate the equivalent model for simple exponential smoothing...

4877 sym R (8277 sym/45 pcs) 10 img

DATA 698 Thesis - NYC Shootings Time Series Analysis

23.05.2022

DATA 698 - Project Proposal. Overview and Motivation: My DATA 698 Project will be based on NYC Crime. I wanted to know how violent the city is, whether some neighborhoods or types of people are most affected, and whether I could identify patterns in shootings over time as well as compare them to other crimes. My primary goal is to take the tools and resources l...

5154 sym R (25198 sym/55 pcs) 8 img

DATA622_FINALPROJECT

20.05.2022

Homework 4 (Final project). DATA INITIAL SUMMARY. DATA SOURCE: https://www.kaggle.com/datasets/rounakbanik/pokemon LABELS: Arts & Entertainment, Earth and Nature, Games, Video Games, Anime, Pop Culture. Data Source Activity Overview (activity stats): Views 307518; Downloads 41644; Download per view ratio 0.14; Total unique contributors 157. Descrip...

2535 sym R (30424 sym/39 pcs) 9 img

DATA624_HW9

02.05.2022

Do problems 8.1, 8.2, 8.3, and 8.7 in Kuhn and Johnson. Please submit the Rpubs link along with the .rmd file. 8.1. Recreate the my_simu data from Exercise 7.2: library(mlbench); set.seed(200); my_simu <- mlbench.friedman1(200, sd = 1); my_simu <- cbind(my_simu$x, my_simu$y); my_simu <- as.data.frame(my_simu); colnames(my_simu)[ncol(my_simu)] <...

4691 sym R (18582 sym/108 pcs) 6 img