Publications by Michael Ippolito
CUNY-DATA607-Assignment1
Overview In their article, Why Americans Don’t Vote, Thomson-DeVeaux et al. (2020) explored why a large share of eligible voters (35 to 60 percent) don’t vote in US elections and examined the voting habits of voters broken out by various categories (age, level of education, race, gender, and income). The data collected confirmed the w...
5940 sym R (5753 sym/18 pcs) 15 img
CUNY - Data607 - Project1
Problem Statement Create an R markdown file that converts a structured text file into a .csv, containing the following: Player’s Name Player’s State Total Number of Points Player’s Pre-Rating Average Pre-Chess Rating of Opponents For the first player, the line should read: ...
2721 sym R (5379 sym/8 pcs)
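The last field in the problem statement, the average pre-rating of each player's opponents, can be sketched in base R as below. This is only an illustration under stated assumptions: the `players` table, the `opponents` list, and every value in them are made up, and the step that parses the structured text file is not shown because its layout is not given in this listing.

```r
# Hypothetical toy data: one row per player (names and ratings are invented)
players <- data.frame(
  id = 1:4,
  name = c("A", "B", "C", "D"),
  pre_rating = c(1800, 1600, 1500, 1400)
)

# Hypothetical opponent ids faced by each player, in the same order as `players`
opponents <- list(c(2, 3), c(1, 4), c(1, 4), c(2, 3))

# For each player, look up the opponents' pre-ratings and average them
players$avg_opp_rating <- vapply(
  opponents,
  function(ids) mean(players$pre_rating[match(ids, players$id)]),
  numeric(1)
)

# Write the selected columns to a temporary .csv file
outfile <- tempfile(fileext = ".csv")
write.csv(players[, c("name", "pre_rating", "avg_opp_rating")],
          outfile, row.names = FALSE)
```

Using `match()` on the player id keeps the lookup correct even if the rows are not sorted by id.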
CUNY-DATA607-Assignment2
OVERVIEW I asked 23 friends, coworkers, and family members to complete a SurveyMonkey survey that asked them to rate six recent popular movies. 1. The Outpost 2. The Invisible Man 3. The Platform (El Hoyo) 4. Mulan 5. Parasite 6. Uncut Gems Note: All images are from rottentomatoes.com. DATA Entity relationship diagram Image of an enti...
2529 sym 19 img 6 tbl
CUNY - DATA607 - Assignment3
1. Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS” #original URL on fivethirtyeight.com: #https://raw.githubusercontent.com/fivethirtyeight/...
1476 sym R (1605 sym/20 pcs)
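The match on “DATA” or “STATISTICS” described in this entry could be sketched in base R as below. The `majors` vector is a small hand-typed stand-in, not the 173-entry dataset, because the dataset URL is truncated in this listing; the author's actual code may differ.

```r
# Hypothetical sample of major names; the real dataset lists 173 majors
majors <- c(
  "COMPUTER SCIENCE",
  "STATISTICS AND DECISION SCIENCE",
  "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS",
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "NURSING"
)

# Keep only the majors whose name contains either DATA or STATISTICS
matches <- grep("DATA|STATISTICS", majors, value = TRUE)
print(matches)
```

With `value = TRUE`, `grep()` returns the matching strings rather than their positions; `grepl()` would give a logical vector suitable for subsetting a data frame instead.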
CUNY - DATA607 - Assignment 7
Assignment 7 Overview The assignment was to do the following: Create files in HTML, JSON, and XML formats containing various attributes about three books. Parse the files into separate R data frames. Compare the results. Approach My approach will be as follows: Create each file by hand and store it in my GitHub repo. Load the file into R using a...
1806 sym R (5532 sym/8 pcs) 3 tbl
CUNY - DATA607 - Project2
Project 2 Overview Discussion posts chosen for this project: 1. My own: literary agents from pw.org 2. Brad’s: New York voter registration from elections.ny.gov 3. Daniel’s: epidemics listing from wikipedia.org In each of these cases, the following steps are performed: Figure 1. Overview flowchart Literary agents The task is to scrape t...
4378 sym R (18020 sym/24 pcs) 11 img 13 tbl
CUNY - Data607 - Assignment5
Assignment 5 Overview In this assignment, the following tasks are performed: Figure 1. Overview flowchart Import Import the CSV file from GitHub into an initial format. # Download the flight info CSV from GitHub csvfile <- getURL("https://raw.githubusercontent.com/mmippolito/cuny/main/data607/assignment5/flightinfo.csv") rawdata <- read.csv(tex...
2273 sym R (7088 sym/9 pcs) 2 img 9 tbl
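The import step in this entry fetches the file contents as a string (the snippet appears to use `getURL` from the RCurl package) and then hands that string to `read.csv`. A minimal offline sketch of the second half follows. It assumes the truncated call passes the string via the `text` argument, and the column names and values in `csvtext` are invented, not taken from flightinfo.csv.

```r
# Inline stand-in for the downloaded file contents (hypothetical columns and values)
csvtext <- "airline,status,city_a,city_b\nAirOne,on time,10,20\nAirOne,delayed,1,2\n"

# Parse the CSV text directly from the string instead of from a file path
rawdata <- read.csv(text = csvtext, stringsAsFactors = FALSE)
str(rawdata)
```

Reading from a string keeps the download and the parse as separate steps, which makes it easy to inspect the raw text before it becomes a data frame.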
CUNY - Data607 - Assignment 9
Assignment 9 Overview The assignment was to do the following: Sign up for an API key on the New York Times website. Build a procedure in R to read from the API. Store the data into an R data frame. Approach My approach is as follows: Sign up for an API key, register a new application, and enable the Top Stories API. Fetch the “world” top sto...
1250 sym R (2527 sym/5 pcs) 2 tbl
CUNY - DATA607 - Assignment10
Assignment 10 - Tidytext Overview The assignment was as follows: Reproduce the example code in Chapter 2 of Text Mining with R (Silge & Robinson, 2017). Extend the code by working with a different corpus of my choice. Incorporate at least one other sentiment analysis lexicon into the analysis. Approach My approach is as follows: Recreate the cod...
4257 sym R (20139 sym/59 pcs) 21 img
CUNY - DATA607 - Final Project
Final Project - Subnet Classifier Overview One real-world problem my organization faces is that, because of its large, decentralized structure, it is often difficult to discern boundaries of responsibility, especially when it comes to tasks such as cleaning up an infected computer. Our cybersecurity operations center (CSOC) routinely sees intrus...
2515 sym R (30258 sym/27 pcs) 8 img 6 tbl