Publications by Peter Gatica
DATA607 Data Acquisition and Management
library(devtools); library(tidyverse); library(RCurl); library(XML); library(jsonlite); library(knitr) Sources: Favorite Books XML file, Favorite Books HTML Table file, Favorite Books HTML file, Favorite Books JSON file. Process the XML file: filename <- getURL("https://raw.githubusercontent.com/audiorunner13/Masters-Coursework/main/DATA607%20Spring%202021...
252 sym R (13646 sym/37 pcs)
DATA607 Data Acquisition and Management
Load needed libraries: library(devtools); library(tidyverse); library(RCurl); library(knitr) Source the untidy data source for cleansing and transformation: filename <- getURL("https://raw.githubusercontent.com/audiorunner13/Masters-Coursework/main/DATA607%20Spring%202021/Week5/Data/airline_untidy_data.csv"); (airline_untidy <- read.delim(text=filenam...
2553 sym R (3970 sym/29 pcs)
DATA607 Data Acquisition and Management
Introduction - NBA Leaders in Three-Pointers, Field Goals, and Free Throws as of March 9, 2021, through 37 games. I really enjoy sports statistics, especially baseball and basketball stats. I chose basketball for this project because in baseball we count everything, and there are just too many statistics for my level of experience with data analytics. Basketball ha...
4771 sym R (10043 sym/16 pcs) 3 img
Project 3 - Data Scientist Skills
Project 3 Assignment Requirements This is a project for your entire class section to work on together, since being able to work effectively on a virtual team is a key “soft skill” for data scientists. Please note especially the requirement about making a presentation during our first meetup after the project is due. W. Edwards Deming sai...
8155 sym R (13353 sym/64 pcs) 10 img 5 tbl
DATA607 Data Acquisition and Management
Tidyverse Assignment Requirements - Tidyverse CREATE Assignment (25 points) Clone the provided repository (1 point) * Write a vignette using one TidyVerse package (15 points) * Write a vignette using more than one TidyVerse package (+ 2 points) * Make a pull request on the shared repository (1 point) Update the README.md file with your example ...
11993 sym R (5293 sym/22 pcs) 4 img
Project 4 - Document Classification
library(tidymodels); library(tidytext); library(tinytex); library(tidyverse); library(RCurl); library(knitr); library(R.utils); library(tm) Function to get tar file and untar: get_tar_untar <- function(tar_file_nm) { tarDir <- "https://spamassassin.apache.org/old/publiccorpus/"; tar_file <- paste0(tarDir,tar_file_nm); download.file(tar_file,destfil...
45 sym R (5064 sym/29 pcs)
Recommender Systems Discussion
Recommender Systems - Spotify’s Recommender System Analysis - https://www.spotify.com/us/home/ For this assignment I looked at Spotify, a music streaming application. It offers free, ad-supported listening or paid, ad-free listening packages. Much of the information provided in my assignment is from this blog post, which is an e...
2610 sym
Project 4 - Document Classification Version 2
Project 4 Assignment Requirements It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, then predict...
4042 sym R (6938 sym/29 pcs)
DATA607 Data Acquisition and Management
The Premise We want to predict the performance of Ethereum against USD using historical data. Generally, this practice is frowned upon because each new trading day changes the business cycles that produce the subsequent rates. However, for the sake of this assignment, instead of using regressors, we are going to try to create a model based on categori...
4269 sym R (6188 sym/29 pcs) 3 img