Publications by Trishita Nath

Data 621 Homework 3

09.11.2021

Overview In this homework assignment, you will explore, analyze and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether the crime rate is above the median crime rate (1) or not (0). Your objective is to build a binary logistic regression model o...

6148 sym R (8526 sym/13 pcs) 10 img 16 tbl

Data 621 Homework 2

09.10.2021

Overview In this homework assignment, you will work through various classification metrics. You will be asked to create functions in R to carry out the various calculations. You will also investigate some functions in packages that will let you obtain the equivalent results. Finally, you will create graphical output that can also be used to evalu...

4098 sym R (5164 sym/31 pcs) 2 img 3 tbl

Data 621 Homework 1

27.09.2021

Overview In this homework assignment, you will explore, analyze and model a data set containing approximately 2200 records. Each record represents a professional baseball team from the years 1871 to 2006 inclusive. Each record has the performance of the team for the given year, with all of the statistics adjusted to match the performance of a 162...

5177 sym R (4675 sym/10 pcs) 4 img 6 tbl

Data 608 Assignment 1

05.09.2021

Principles of Data Visualization and Introduction to ggplot2 I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: inc <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module1/Data/inc5000_data.csv", header = TRUE) And let's preview this...

1584 sym R (6962 sym/25 pcs) 4 img

Data 608 Final Project

13.12.2021

Data Visualization using ggplot2 I will be analyzing a dataset of the 250 most expensive football transfers from the 2000-2001 season through the 2018-2019 season. Source Loading the dataset Preview and Summary of the data ## Name Position Age Team_from League_from Team_to ## 1 Luís Figo Right Winger 27 FC Barcelo...

514 sym R (4678 sym/9 pcs) 3 img

Data 621 Blog 1

13.12.2021

Simple Linear Regression Simple linear regression models the relationship between two variables. One variable (the predictor) tells us what we can expect from the other variable (the response). The general idea of simple linear regression is to use the predictor to estimate an average value of the response. The relationship is defined a...

640 sym R (617 sym/1 pcs) 2 img

Data 622 Homework 2

03.04.2022

Homework #2 Based on the latest topics presented, bring a loan_data of your choice and create a Decision Tree where you can solve a classification or regression problem and predict the outcome of a particular feature or detail of the data used. Switch variables to generate 2 decision trees and compare the results. Create a random forest for regre...

1977 sym R (8954 sym/7 pcs) 4 img 2 tbl

Data 622 Homework 1

24.03.2022

Homework 1…New As the quiz that was part of the original content was discarded, here’s a new assignment: Visit the following website and explore the range of sizes of this dataset (from 100 to 5 million records). https://eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/ Based on your computer’s capabilities (memory...

1918 sym R (10692 sym/10 pcs) 4 img

Data 622 Final Project 2022

24.05.2022

Overview You get to decide which dataset you want to work on. The data set must be different from the ones used in previous homeworks. You can work on a problem from your job, or something you are interested in. You may also obtain a dataset from sites such as Kaggle, Data.Gov, Census Bureau, USGS or other open data portals. Select one of the meth...

7672 sym R (4416 sym/13 pcs) 8 img 1 tbl