Publications by Stefano Biguzzi
Week 1 Assignment – Loading Data into a Data Frame
Introduction to Article and Data For this assignment I chose the article “Voter Registrations Are Way, Way Down During The Pandemic”. In our current political climate it is more important than ever to get people registered to vote. The pandemic has made it very difficult for people to register in person, and as a result for the months of Marc...
2250 sym R (2004 sym/12 pcs) 1 img
DATA607: Week 2 Assignment - SQL and R
ETL Process Create Connection to Local PostgreSQL database and getting tables Creating connection con <- dbConnect( RPostgres::Postgres(), dbname = "MovieRatings", host="localhost", port="5432", user="postgres", password=params$pwd) Setting tables to dataframes Loading tblPersons and removing any white spaces #Loading person ...
10198 sym R (6016 sym/36 pcs)
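The connect-then-load step described in this entry can be sketched as follows. This is a minimal illustration, not the author's code: an in-memory SQLite database (via the RSQLite package) is swapped in for the local PostgreSQL server so the example runs without a database server, and the column names and rows of `tblPersons` are hypothetical.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the local
# PostgreSQL server. With PostgreSQL, the connection would be opened
# through RPostgres::Postgres() as in the excerpt above.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical contents for tblPersons; column names are illustrative only.
dbWriteTable(con, "tblPersons", data.frame(
  person_id = 1:3,
  person_name = c(" Ana ", "Ben  ", "  Cleo"),
  stringsAsFactors = FALSE
))

# Load the table into a data frame and strip surrounding white space
persons <- dbReadTable(con, "tblPersons")
persons$person_name <- trimws(persons$person_name)

dbDisconnect(con)
print(persons)
```

Trimming after loading keeps the database untouched, which seems a reasonable choice when the source tables are shared.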
Data 607 Assignment Week 3 - Character manipulation & Data processing
Part I: Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset, provide code that identifies the majors that contain either “DATA” or “STATISTICS” filtered_data <- major_data[ grepl("(DATA|STATISTICS)",major_data$Major), ] Filtered Majors FOD1P Major Major_Category 6212 MANAGEMENT INFORMATION S...
2253 sym R (719 sym/7 pcs)
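The `grepl()` filter from this entry can be tried on a small stand-in table. The rows below are illustrative only and are not taken from the College Majors dataset; the column names `Major` and `Major_Category` follow the excerpt.

```r
# Illustrative rows only; the real data comes from the College Majors dataset.
major_data <- data.frame(
  Major = c("APPLIED STATISTICS", "ART HISTORY", "DATA ENGINEERING", "BOTANY"),
  Major_Category = c("Example A", "Example B", "Example C", "Example D"),
  stringsAsFactors = FALSE
)

# Keep rows whose Major contains either DATA or STATISTICS.
# grepl() is case sensitive by default; pass ignore.case = TRUE to relax that.
filtered_data <- major_data[grepl("DATA|STATISTICS", major_data$Major), ]
print(filtered_data)
```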
DATA 607: Project 2 - Data Transformation
Introduction The goal of this assignment is to give you practice in preparing different datasets for downstream analysis work. Your task is to: (1) Choose any three of the “wide” datasets identified in the Week 6 Discussion items. (You may use your own dataset; please don’t use my Sample Post dataset, since that was used in your Week 5...
10968 sym R (4389 sym/28 pcs) 8 img
DATA 607: Week 5 Assignment - Tidying and Transforming Data
Introduction - Week 5 Assignment The task for this week’s assignment is to: Create a .CSV file (or optionally, a MySQL database!) that includes all of the information above. You’re encouraged to use a “wide” structure similar to how the information appears above, so that you can practice tidying and transformations as described below, Re...
4243 sym R (1057 sym/6 pcs) 4 img
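The wide-to-long tidying step this entry describes might look like the sketch below. The table, its column names, and the use of `tidyr` and `dplyr` are assumptions made for illustration; the assignment's actual data is not shown in the excerpt.

```r
library(tidyr)
library(dplyr)

# Hypothetical wide table: one row per carrier, one column per city
wide <- data.frame(
  carrier = c("A", "B"),
  city_x = c(10, 20),
  city_y = c(30, 40)
)

# Reshape to long form: one row per carrier/city pair
long <- wide %>%
  pivot_longer(cols = c(city_x, city_y), names_to = "city", values_to = "count")

# Once the data is long, grouped summaries become straightforward
totals <- long %>%
  group_by(carrier) %>%
  summarise(total = sum(count))

print(long)
print(totals)
```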
Project 1 - Chess Tournament Data
Introduction In this project, you’re given a text file with chess tournament results where the information has some structure. Your job is to create an R Markdown file that generates a .CSV file (that could for example be imported into a SQL database) with the following information for all of the players: Player’s Name Player’s State Total...
10617 sym R (4828 sym/14 pcs)
DATA 607: Week 7 Assignment - Working with XML, JSON, and HTML in R
Loading Libraries library(tidyverse) library(RCurl) library(XML) library(rjson) library(knitr) Introduction We were asked to create an XML file, a JSON file, and an HTML file. We were then asked to load them into R and create a data frame from the data. I decided to use the tidyverse and RCurl libraries for loading in data, XML for loading in t...
3269 sym R (1674 sym/12 pcs)
DATA 607: Week 9 Assignment - Web APIs
Introduction Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame. Loading Libraries The following libraries are required for running this code library(glue) library(httr) library(jsonlite) library(knitr) NY Times books API I decided to use the NY Ti...
3435 sym R (554 sym/5 pcs)
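A rough sketch of the JSON-to-data-frame step follows. Several things here are assumptions rather than details from the entry: the endpoint path and list name, the `NYT_API_KEY` environment variable, and the shape of the response. The live request is left as a comment, and a small hand-written JSON string with placeholder titles stands in for the response body so the example runs offline.

```r
library(glue)
library(jsonlite)

# Assumption: the API key is stored in an environment variable named NYT_API_KEY
api_key <- Sys.getenv("NYT_API_KEY")

# Assumed endpoint and list name for the books API
url <- glue("https://api.nytimes.com/svc/books/v3/lists/current/hardcover-fiction.json?api-key={api_key}")

# A live call with httr would look roughly like:
#   response <- httr::GET(url)
#   body <- httr::content(response, as = "text", encoding = "UTF-8")
# Hand-written stand-in for the response body (placeholder values, assumed shape):
body <- '{"results": {"books": [
  {"rank": 1, "title": "EXAMPLE TITLE ONE", "author": "Author A"},
  {"rank": 2, "title": "EXAMPLE TITLE TWO", "author": "Author B"}
]}}'

# fromJSON() simplifies an array of flat objects into a data frame
parsed <- fromJSON(body)
books <- parsed$results$books
print(books)
```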
Tidyverse CREATE Assignment
Stefano Biguzzi 2020-10-25 Assignment In this assignment we were asked to create a vignette to discuss one or more packages within the tidyverse. For this assignment I chose to look at the dplyr package, specifically the group_by(), tally(), and summarise() functions. tidyverse Intro We’ll start the assi...
2427 sym R (4375 sym/9 pcs)
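The three dplyr functions named in this entry can be illustrated with R's built-in `mtcars` dataset. The choice of dataset and of the `cyl` and `mpg` columns is an assumption made for the example, not the vignette's own data.

```r
library(dplyr)

# group_by() + tally(): count rows in each group
counts <- mtcars %>%
  group_by(cyl) %>%
  tally()

# group_by() + summarise(): compute one or more statistics per group
summary_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

print(counts)
print(summary_by_cyl)
```

`tally()` is a shorthand for `summarise(n = n())`, so the `n_cars` column in the second result matches the `n` column in the first.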
DATA 607: Final Project - Predicting Serie A Matchday 10
Loading Libraries library(tidyverse) library(rvest) library(prob) library(stargazer) library(GGally) library(nnet) library(knitr) library(kableExtra) Introduction Inspired by my love for soccer and FiveThirtyEight’s soccer prediction page, I set out to find a way to predict outcomes of Serie A games. My idea is to gather more data than ...
15452 sym R (10283 sym/26 pcs) 7 img