Publications by Irene Jacob
DATA608_Homework3
Assignment 3 Goal: I have provided you with data about mortality from all 50 states and the District of Columbia. Please access it at https://github.com/charleyferrari/CUNY_DATA608/tree/master/module3/data You are invited to gather more data from our provider, the CDC WONDER system, at https://wonder.cdc.gov This assignment must be done in R. It m...
DATA621 Assignment 2
1. Download the classification output data set
class_df <- read.csv("https://raw.githubusercontent.com/irene908/DATA621/main/classification-output-data.csv")
head(class_df)
2. Confusion matrix
class_scored <- class_df[,c('class', 'scored.class','scored.probability')]
table(class_scored$class,class_scored$scored.class)
## ## 0 1 ...
DATA622 Homework 2
Goal: Based on the latest topics presented, bring a dataset of your choice and create a Decision Tree where you can solve a classification or regression problem and predict the outcome of a particular feature or detail of the data used. Switch variables to generate 2 decision trees and compare the results. Create a random forest for regression an...
DATA622 Homework 1
Goal: Visit the following website and explore the range of sizes of this dataset (from 100 to 5 million records). https://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/ Based on your computer’s capabilities (memory, CPU), select 2 files you can handle (recommended: one small, one large). Review the structure and conte...
DATA622 Homework 4
Goal: You get to decide which dataset you want to work on. The dataset must be different from the ones used in previous homework assignments. You can work on a problem from your job, or something you are interested in. You may also obtain a dataset from sites such as Kaggle, data.gov, Census Bureau, USGS or other open data portals. Select one of the methodo...
DATA622 Homework 3
Goal: Perform an analysis of the dataset used in Homework #2 using the SVM algorithm. Compare the results with the results from the previous homework. Based on articles https://www.hindawi.com/journals/complexity/2021/5550344/ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137961/ Search for academic content (at least 3 articles) that compare the ...