Publications by Peter Gatica

DATA607 Data Acquisition and Management

20.03.2021

library(devtools)
library(tidyverse)
library(RCurl)
library(XML)
library(jsonlite)
library(knitr)
Sources: Favorite Books XML file, Favorite Books HTML Table file, Favorite Books HTML file, Favorite Books JSON file
Process the XML file
filename <- getURL("https://raw.githubusercontent.com/audiorunner13/Masters-Coursework/main/DATA607%20Spring%202021...

252 sym R (13646 sym/37 pcs)

DATA607 Data Acquisition and Management

07.03.2021

Load needed libraries
library(devtools)
library(tidyverse)
library(RCurl)
library(knitr)
Source the untidy data source for cleansing and transformation
filename <- getURL("https://raw.githubusercontent.com/audiorunner13/Masters-Coursework/main/DATA607%20Spring%202021/Week5/Data/airline_untidy_data.csv")
(airline_untidy <- read.delim(text=filenam...

2553 sym R (3970 sym/29 pcs)

DATA607 Data Acquisition and Management

14.03.2021

Introduction - NBA Leaders in Three-Pointers, Field Goals, and Free Throws as of March 9, 2021, through 37 games. I really enjoy sports statistics, especially baseball and basketball stats. I chose basketball for this project because in baseball we count everything, and there are just too many statistics for my level of experience with data analytics. Basketball ha...

4771 sym R (10043 sym/16 pcs) 3 img

Project 3 - Data Scientist Skills

05.04.2021

Project 3 Assignment Requirements This is a project for your entire class section to work on together, since being able to work effectively on a virtual team is a key “soft skill” for data scientists. Please note especially the requirement about making a presentation during our first meetup after the project is due. W. Edwards Deming sai...

8155 sym R (13353 sym/64 pcs) 10 img 5 tbl

DATA607 Data Acquisition and Management

13.04.2021

Tidyverse Assignment Requirements - Tidyverse CREATE Assignment (25 points)
* Clone the provided repository (1 point)
* Write a vignette using one TidyVerse package (15 points)
* Write a vignette using more than one TidyVerse package (+ 2 points)
* Make a pull request on the shared repository (1 point)
* Update the README.md file with your example ...

11993 sym R (5293 sym/22 pcs) 4 img

Project 4 - Document Classification

02.05.2021

library(tidymodels)
library(tidytext)
library(tinytex)
library(tidyverse)
library(RCurl)
library(knitr)
library(R.utils)
library(tm)
# Function to get tar file and untar
get_tar_untar <- function(tar_file_nm) {
  tarDir <- "https://spamassassin.apache.org/old/publiccorpus/"
  tar_file <- paste0(tarDir,tar_file_nm)
  download.file(tar_file,destfil...

45 sym R (5064 sym/29 pcs)

Recommender Systems Discussion

30.04.2021

Recommender Systems - Spotify’s Recommender System Analysis - https://www.spotify.com/us/home/ For this assignment I looked at Spotify, a streaming music application. It offers free, ad-supported listening or paid, ad-free listening packages. Much of the information provided in my assignment is from this blog post, which is an e...

2610 sym

Project 4 - Document Classification Version 2

02.05.2021

Project 4 Assignment Requirements It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, then predict...

4042 sym R (6938 sym/29 pcs)

DATA607 Data Acquisition and Management

17.05.2021

The Premise We want to predict the performance of Ethereum against USD using historical data. Generally, this practice is frowned upon because each new trading day changes the business cycles that produce following rates. However, for the sake of this assignment, instead of using regressors, we are going to try to create a model based on categori...

4269 sym R (6188 sym/29 pcs) 3 img