Publications by Kristin Lussi
Flood Risk in NYC: A Product for Insurance Professionals
Table of contents Introduction As a Product Genesis: Class Project to Product Phase 1: POC (Completed) Phase 2: Alpha - The Working Product Future Development Raw Product Demonstration Demonstration for Insurance Professionals Schedule of Values: Mapped Overview: CUNY Property Values Property Value by Flood Zone Risk Scoring Summary of Values b...
6544 sym 3 img 1 tbl
A Comprehensive Analysis of Gender Pay Gap in the Tech Industry
Kristin Lussi and Tony Fraser, December 7th, 2023. Abstract: This research paper uses six years’ worth of annual Stack Overflow survey data to examine the gender pay gap within the technology sector. It’s important to note that the self-reported responses in this dataset are no...
10257 sym 7 img
DATA 607: Project 4
Introduction In this project, we created a model that classifies emails as spam or ham. Load Packages library(readr) library(dplyr) library(tidyr) library(tidyverse) library(wordcloud) library(tm) library(naivebayes) library(e1071) library(RTextTools) library(caret) library(quanteda) library(rsample) Load Data url <- "https://raw.githubuserconten...
379 sym R (31274 sym/13 pcs) 2 img
DATA 607 Extra Credit: Classification Performance Metrics
Load Packages library(readr) library(dplyr) library(tidyr) library(ggplot2) Load Data url <- "https://raw.githubusercontent.com/acatlin/data/master/classification_model_performance.csv" performance <- read_csv(url, show_col_types = FALSE) head(performance) ## # A tibble: 6 × 3 ## class scored.class scored.probability ## <dbl> <dbl> ...
1819 sym R (3800 sym/10 pcs) 1 img
DATA 607 Week 10 Assignment
Introduction In this assignment, we will provide the base code from Chapter 2, "Sentiment analysis with tidy data," of Text Mining with R: A Tidy Approach. Once this code is running, we will use the SentimentAnalysis package to perform sentiment analysis on Nelson Mandela’s 1996 State of the Nation speech. We retrieved this speech from the State o...
762 sym R (5626 sym/29 pcs) 7 img
DATA 607: More JSON Practice
Introduction In this extra credit assignment, we will load the Nobel Prize data using an API from nobelprize.org. We will then ask four interesting questions and answer them using the data. Load Packages library(jsonlite) library(tidyverse) library(dplyr) library(ggplot2) library(httr) library(gt) Load and Clean Data prize_url <- "http:/...
2116 sym R (5842 sym/9 pcs) 3 img 1 tbl
DATA 607 Assignment 9
Introduction In this assignment, we will load the New York Times Best Sellers list using an API key. Then, we will use the data to answer the following question: Which book category is ranked highest on the NYT Best Sellers list on average? Load Packages First, we load the necessary packages: library(jsonlite) library(tidyverse) library(dplyr) lib...
1151 sym R (3416 sym/7 pcs) 1 img
DATA 607 Week 7 Assignment
Introduction In this assignment, I will load three different file types into R and assess the differences among them. Load Packages Here, I load the packages needed to load the files into R. library(rvest) library(xml2) library(dplyr) library(XML) library(methods) library(rjson) Load .html file First, I will load the .html file ...
1379 sym R (4802 sym/13 pcs)
DATA 606 Project 2: Couple Sleeping Arrangements
Introduction In this analysis, I am working with a data set from fivethirtyeight.com, which contains survey responses from American adults who are married, in a domestic partnership, in a civil union, or cohabitating with a partner. I want to answer the following questions: What percentage of couples sleep in separate beds regularly? What are the mo...
2659 sym R (18491 sym/9 pcs) 3 img 1 tbl
DATA 607 Project 2: Doctorate Recipients
Introduction In this analysis, I am working with a dataset of research doctorate recipients and their fields of doctorate, which has been provided by the National Science Foundation. One question that I would like to answer is as follows: How has the makeup of doctorate degrees changed over the years? Load Packages library(readr) library(dplyr) li...
1973 sym R (10405 sym/9 pcs) 4 img