Publications by Keith Colella
CUNY SPS MSDS - DATA605 - Final
library(tidyverse) Problem 1 Using R, set a random seed equal to 1234 (i.e., set.seed(1234)). Generate a random variable X that has 10,000 continuous random uniform values between 5 and 15. Then generate a random variable Y that has 10,000 random normal values with a mean of 10 and a standard deviation of 2.89. set.seed(1234) X <- runif(n = 1...
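As a quick check on the parameters in Problem 1 (a worked aside, not part of the original write-up): the mean and standard deviation given for Y appear to be chosen to match those of X. For \(X \sim \mathrm{Uniform}(5, 15)\):

\[
E[X] = \frac{5 + 15}{2} = 10, \qquad \mathrm{SD}(X) = \frac{15 - 5}{\sqrt{12}} \approx 2.89
\]

So X and Y have approximately the same mean and spread and differ only in the shape of their distributions.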
CUNY SPS MSDS - DATA605 Week 2
library(matrixcalc) set.seed(42) Problem Set 1 Problem 1 Show that \(A^{T}A \neq AA^{T}\) in general. (Proof and demonstration.) Response 1 In general, we know that matrix multiplication is not commutative. We can look at Definition MM and Example MMNC from [1, Beezer] to confirm that \(AB \neq BA\). Moreover, we know that, in general, \(A ...
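A small counterexample makes the claim \(A^{T}A \neq AA^{T}\) concrete. The matrix below is an illustrative choice, not one taken from the original response:

\[
A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad
A^{T} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}
\]

\[
A^{T}A = \begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}, \qquad
AA^{T} = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}
\]

Both products are symmetric, but they are not equal, so the two orderings give different results for this A.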
CUNY SPS MSDS - DATA605 Week 1
library(animation) library(rgl) library(gifski) library(knitr) Assignment For this assignment, build the first letters for both your first and last name using point plots in R. Then, write R code that will left multiply (%*%) a square matrix (x) against each of the vectors of points (y). Initially, that square matrix will be the Identity mat...
Partisanship and Competitive Elections: Notebook 6
library(tidyverse) library(infer) library(tigris) library(sf) library(kableExtra) library(cowplot) set.seed(888) Abstract In this paper, I investigate the relationship between the competitiveness of electoral districts and the partisan leanings of candidates elected in those districts. The predominating narrative on partisan gerrymanderi...
Partisanship and Competitive Elections: Notebook 2
library(tidyverse) library(httr) library(jsonlite) library(fuzzyjoin) Intro The goal of this notebook is to create a list of candidates that we’ll use for further analysis. I’ll use data from the Federal Election Commission (FEC) to provide a baseline list of candidates registered for elections for the House of Representatives in the 20...
Partisanship and Competitive Elections: Notebook 5
library(tidyverse) library(kableExtra) library(tigris) library(sf) District-Level Ideology Survey This notebook collects a number of measures of district competitiveness / ideological leanings and maps them to the candidates dataframe from previous notebooks. The output of this notebook will serve as the final dataset for this project’s ul...
Partisanship and Competitive Elections: Notebook 4
library(tidyverse) library(httr) library(tidytext) library(kableExtra) library(superml) library(e1071) library(data.table) Intro This notebook collects a number of measures of partisanship / political polarization and maps them to the candidates dataframe from previous notebooks. The majority of the notebook will focus on the Twitter-base...
TidyVerse Vignette - Philly Crime Rates
We’ll walk through an analysis of crime rates in Philadelphia, highlighting various features of the tidyverse along the way. We’ll primarily make use of dplyr, ggplot2 and forcats functions. Our data will be pulled from the City of Philadelphia’s OpenDataPhilly site: https://opendataphilly.org/datasets/crime-incidents/. The site provides ...
Election Tweets 2022 - Party Prediction
library(tidyverse) library(httr) library(tidytext) library(kableExtra) library(superml) library(data.table) This analysis will aim to predict a political candidate’s party based on the language used in their tweets. I’ll use a collection of over 3 million tweets scraped from over a thousand candidates running for House seats in the 2022...
Global Baseline Estimator for Movie Recommendations
library(tidyverse) library(getPass) library(RMySQL) library(recommenderlab) Read in Data We’ll start by using the original movie ratings from the Week 2 assignment. As before, I’ll read this in from a local database. pwd <- getPass(msg = "Please enter MySQL root password: ", noblank = TRUE, forcemask = TRUE)...