Publications by Leo Yi & Christopher Bloome
Dis11
Intro For this discussion I wanted to take a first pass at a challenge I have been meaning to attempt: the introductory Kaggle challenge on Titanic survivorship. The training data set is effectively a ship manifest for the Titanic, with an added field indicating whether each passenger survived the sinking. I converted some of the fields to “dummy varia...
1329 sym R (3905 sym/6 pcs)
Document605HW14
This week, we’ll work out some Taylor Series expansions of popular functions: f(x) = 1/(1−x), f(x) = e^x, and f(x) = ln(1 + x). For each function, only consider its valid ranges as indicated in the notes when you are computing the Taylor Series expansion. Please submit your assignment as an R-Markdown document. f(x) = 1/(1−x) Derivatives \[ f...
1165 sym
Data605Final
Problem 1 Using R, generate a random variable X that has 10,000 random uniform numbers from 1 to N, where N can be any number of your choosing greater than or equal to 6. Then generate a random variable Y that has 10,000 random normal numbers with a mean of: \[ \mu = \sigma = \frac{(N+1)}{2} \] set.seed(642) N <- 10 N1 <- ((N+1)/2) X <- runi...
14909 sym R (23368 sym/92 pcs) 13 img
608 HW 1
Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header = TRUE) And let's preview this...
1472 sym R (3868 sym/15 pcs) 3 img
621.HW1.Prep
Import/Clean Import Data / View Summaries library(rvest) library(ggplot2) x <- "https://raw.githubusercontent.com/ChristopherBloome/621/main/moneyball-training-data.csv" TrainingData <- read.cs...
989 sym R (43404 sym/258 pcs) 87 img
621_HW4_Draft
Homework 3 Overview Data Exploration Data Preparation Data Splitting log transformation BoxCox Transformation Build Models Model 1 - Glmulti Model 2 - Stepwise Regression and Calculated Variables Model 3 - Lasso Model Selection Rerun model on entire training set Predict Test Set / Export results Discussion Group 1 04/09/2021 Ov...
2736 sym R (30741 sym/64 pcs) 9 img 2 tbl
621.HW3.Prep
library(corrplot) library(tidyverse) Intro In this homework assignment, we are asked to explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether or not the crime rate is above the median crime rate (1) or not (0). Our objective is to b...
3126 sym R (20440 sym/24 pcs) 3 img
621.HW2.Prep
Note to Professor: Critical Thinking Group #1 worked independently before meeting together, sharing our work and talking through the points where we each struggled. While we leveraged collaboration and group learning, we ultimately felt it made the most sense to submit independent work. Step 1 Download the classification output data set (attache...
4474 sym R (7365 sym/53 pcs) 2 img
Master of Science Data Science Final Research Project
Abstract With a full year of the pandemic behind us and vaccination rates on the rise, many workplaces are looking forward to a return to normalcy and in-person work. Employers have many mechanisms at their disposal when crafting a return-to-work scheme that balances the well-being of their employees with the friction that comes from implementing...
39835 sym R (43559 sym/54 pcs) 9 img
Thesis_Data_Review
Case Data Confirmed Cases Over the last year, we have used a variety of metrics to contextualize the pandemic and answer questions like “How bad is COVID right now?” and “Are things getting better or worse?” Perhaps the most commonly cited figure is the number of new cases in a region. We will be using data of this sort as the basis for ...
15560 sym R (37619 sym/42 pcs) 8 img