Publications by Leo Yi & Christopher Bloome

Dis11

05.11.2020

Intro For this discussion I wanted to take a first pass at a challenge I have been meaning to attempt: the intro Kaggle challenge on Titanic survivorship. The training data set is effectively a ship manifest for the Titanic, with an added field indicating whether each passenger survived the iceberg or not. I converted some of the fields to “dummy varia...

1329 sym R (3905 sym/6 pcs)

Document605HW14

06.12.2020

This week, we’ll work out some Taylor Series expansions of popular functions: f(x) = 1/(1−x), f(x) = e^x, and f(x) = ln(1 + x). For each function, only consider its valid ranges as indicated in the notes when you are computing the Taylor Series expansion. Please submit your assignment as an R Markdown document. f(x) = 1/(1−x) Derivatives \[ f...

1165 sym
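
For reference, the standard expansions of the three functions named in this assignment are sketched below. The expansion point x = 0 and the convergence intervals shown are assumptions taken from the usual textbook treatment; the assignment's own notes are not reproduced in this listing.

\[
\begin{aligned}
\frac{1}{1-x} &= \sum_{n=0}^{\infty} x^{n} = 1 + x + x^{2} + x^{3} + \cdots, & |x| &< 1 \\
e^{x} &= \sum_{n=0}^{\infty} \frac{x^{n}}{n!} = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \cdots, & x &\in \mathbb{R} \\
\ln(1+x) &= \sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^{n}}{n} = x - \frac{x^{2}}{2} + \frac{x^{3}}{3} - \cdots, & -1 &< x \le 1
\end{aligned}
\]

Each series follows from the general form \[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^{n} \] by evaluating the successive derivatives of f at 0.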

Data605Final

19.12.2020

Problem 1 Using R, generate a random variable X that has 10,000 random uniform numbers from 1 to N, where N can be any number of your choosing greater than or equal to 6. Then generate a random variable Y that has 10,000 random normal numbers with a mean of: \[ \mu = \sigma = \frac{(N+1)}{2} \] set.seed(642) N <- 10 N1 <- ((N+1)/2) X <- runi...

14909 sym R (23368 sym/92 pcs) 13 img

608 HW 1

31.01.2021

Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header= TRUE) And let's preview this...

1472 sym R (3868 sym/15 pcs) 3 img

621.HW1.Prep

01.03.2021

Import/Clean Import Data / View Summaries library(rvest) ## Warning: package 'rvest' was built under R version 3.6.3 ## Loading required package: xml2 ## Warning: package 'xml2' was built under R version 3.6.3 library(ggplot2) x <- "https://raw.githubusercontent.com/ChristopherBloome/621/main/moneyball-training-data.csv" TrainingData <- read.cs...

989 sym R (43404 sym/258 pcs) 87 img

621_HW4_Draft

15.04.2021

Homework 3 Homework 3 Overview Data Exploration Data Preparation Data Splitting log transformation BoxCox Transformation Build Models Model 1 - Glmulti Model 2 - Stepwise Regression and Calculated Variables Model 3 - Lasso Model Selection Rerun model on entire training set Predict Test Set / Export results Discussion Group 1 04/09/2021 Ov...

2736 sym R (30741 sym/64 pcs) 9 img 2 tbl

621.HW3.Prep

11.04.2021

library(corrplot) library(tidyverse) Intro In this homework assignment, we are asked to explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether or not the crime rate is above the median crime rate (1) or not (0). Our objective is to b...

3126 sym R (20440 sym/24 pcs) 3 img

621.HW2.Prep

16.03.2021

Note to Professor: Critical Thinking Group #1 worked independently before meeting together, sharing our work and talking through the points where we each struggled. While we leveraged collaboration and group learning, we ultimately felt it made the most sense to submit independent work. Step 1 Download the classification output data set (attache...

4474 sym R (7365 sym/53 pcs) 2 img

Master of Science Data Science Final Research Project

10.05.2021

Abstract With a full year of the pandemic behind us and vaccination rates on the rise, many workplaces are looking forward to a return to normalcy and in-person work. Employers have many mechanisms at their disposal when crafting a return-to-work scheme that balances the well-being of their employees with the friction that comes from implementing...

39835 sym R (43559 sym/54 pcs) 9 img

Thesis_Data_Review

10.05.2021

Case Data Confirmed Cases Over the last year, we have used a variety of metrics to contextualize the pandemic and answer questions like “How bad is COVID right now?” and “Are things getting better or worse?” Perhaps the most commonly considered figure is the number of new cases in a region. We will be using data of this sort as the basis for ...

15560 sym R (37619 sym/42 pcs) 8 img