Publications by Josh Iden
Blog 1 - OLS Regression
Introduction In this blog post, we take a look at Ordinary Least Squares (OLS) regression, and see how it can be used to predict ticket sales for concerts. About Ordinary Least Squares Regression Ordinary Least Squares regression is a statistical method used to estimate the relationship between a dependent variable and one or more independent varia...
2055 sym
Blog 2 - Weighted Least Squares Regression
Introduction In this blog post, we expand on the previous post covering Ordinary Least Squares (OLS) regression, and how it can be used to predict ticket sales, and look at another type of linear regression that takes into account the variability of the errors or residuals of the data – weighted least squares regression. About Weighted Least Squ...
2406 sym
DATA608 Module 3
Assignment Solution Data Preparation First we read the data into R. df = read.csv('/Users/joshiden/Documents/Classes/CUNY SPS/Spring 2023/DATA608/DATA608/CUNY_DATA_608/module3/data/cleaned-cdc-mortality-1999-2010-2.csv') kable(head(df)) ICD.Chapter State Year Deaths Population Crude.Rate Certain infectious and parasiti...
449 sym 1 img 3 tbl
DATA 608 HW1
Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header= TRUE) And let's preview this...
1477 sym 3 img 1 tbl
DATA 607 HW2
Introduction I chose six recent films and asked five friends to rate each of the movies they had seen on a scale of 1 to 5. This project follows the collection of that information and its migration to R for further analysis. Collecting The Data The time constraints of the project required collection of the data by text message. I created a ...
1709 sym Python (2419 sym/12 pcs)
DATA 607 HW2
Introduction I chose six recent films and asked five friends to rate each of the movies they had seen on a scale of 1 to 5. This project follows the collection of that information and its migration to R for further analysis. Collecting The Data The time constraints of the project required collection of the data by text message. I created a ...
1710 sym Python (2419 sym/12 pcs)
DATA 605 HW3
DATA 605: Assignment 3 Problem Set #1 1) What is the rank of matrix \(A\)? A <- matrix(c(1,2,3,4, -1,0,1,3, 0,1,-2,1, 5,4,-2,3), 4, byrow=TRUE) print(A) ## [,1] [,2] [,3] [,4] ## [1,] 1 2 3 4 ## [2,] -1 0 1 3 ## [3,] 0 1 -2 1 ## [4,] 5 4 -2 3 ...
2937 sym
DATA 607 HW3
# load packages library(rvest) library(htmltab) library(stringr) library(dplyr) 1. Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS” # sto...
1500 sym R (1526 sym/8 pcs)
DATA 605 HW4
## Week 4, Linear Transformations & Representations ** With the provided data file, build and visualize eigenimagery that accounts for 80% of the variability. Provide full R code and discussion. ** Setting Up The Data Load packages: library(imager) library(jpeg) library(EBImage) library(recolorize) library(OpenImageR) library(stats) Read the ima...
1263 sym R (3528 sym/23 pcs) 7 img
DATA 605 HW5
Question 1 (Bayesian). A new test for multinucleoside-resistant (MNR) human immunodeficiency virus type 1 (HIV-1) variants was recently developed. The test maintains 96% sensitivity, meaning that, for those with the disease, it will correctly report “positive” for 96% of them. The test is also 98% specific, meaning that, for those without the...
7183 sym R (1853 sym/50 pcs)