Publications by Trishita Nath
Data 607 Final Project
Introduction For this final project, I will be working alone. I am going to analyze a dataset of real estate data as reported by Zestimate. Zestimate was created to give consumers as much information as possible about homes and the housing market, marking the first time consumers had access to this type of home value info...
1767 sym R (22077 sym/48 pcs) 7 img
Data 607 Final Project Proposal
Introduction For this final project, I will be working alone. I am going to analyze a dataset of real estate data as reported by Zestimate. Zestimate was created to give consumers as much information as possible about homes and the housing market, marking the first time consumers had access to this type of home value info...
1351 sym
Data 605 Homework 13
Use integration by substitution to solve the integral below. \[ \int 4e^{-7x}\,dx \] Solution: Let \(u = -7x\), so \(du = -7\,dx\) and \(dx = -\frac{du}{7}\). Since 4 is a constant, it can be factored out of the integral; substituting for \(dx\) (the reverse chain rule) gives \[ -\frac{4}{7}\int e^{u}\,du = -\frac{4}{7}e^{u} + C = -\frac{4}{7}e^{-7x} + C \] Biologists are treating a pond...
4996 sym R (394 sym/5 pcs) 1 img
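The antiderivative in the Data 605 Homework 13 excerpt above can be sanity-checked numerically. The sketch below is not part of the original homework: the interval [0, 1] and the names `integrand` and `antiderivative` are assumptions chosen only for illustration.

```r
# Integrand from the excerpt: 4 * e^(-7x)
integrand <- function(x) 4 * exp(-7 * x)

# Antiderivative found by substitution (constant of integration omitted)
antiderivative <- function(x) -4 / 7 * exp(-7 * x)

# Compare base R's numerical integration with F(1) - F(0);
# the interval [0, 1] is an arbitrary choice for this check
numeric_value <- integrate(integrand, lower = 0, upper = 1)$value
exact_value <- antiderivative(1) - antiderivative(0)

all.equal(numeric_value, exact_value)
```

If the two values agree to within numerical tolerance, the substitution result is consistent with the original integrand on that interval.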
Data 605 Homework 12
Introduction The attached who_data.csv dataset contains real-world data from 2008. The variables are as follows. Country: name of the country; LifeExp: average life expectancy for the country in years; InfantSurvival: proportion of those surviving to one year or more; Under5Survival: proportion of those surviving to five years or more; TBFree: prop...
4492 sym R (5301 sym/21 pcs) 8 img
Data 607 Project 4
Introduction It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, then predict the class of new docu...
916 sym R (6028 sym/44 pcs)
Data 607 Discussion 11 - Recommender Systems
Introduction Your task is to analyze an existing recommender system that you find interesting. You should: Perform a Scenario Design analysis as described below. Consider whether it makes sense for your selected recommender system to perform scenario design twice, once for the organization (e.g. Amazon.com) and once for the organization’s cus...
4088 sym
Data 605 Homework 14
Introduction This week, we’ll work out some Taylor series expansions of popular functions: \(f(x) = \frac{1}{1-x}\), \(f(x) = e^x\), and \(f(x) = \ln(1 + x)\). For each function, consider only its valid range, as indicated in the notes, when computing the Taylor series expansion. Please submit your assignment as an R Markdown document. Func...
1927 sym
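For the three functions named in the Data 605 Homework 14 excerpt above, the standard Maclaurin series can be compared against R's built-in functions. This is a minimal sketch, not the author's solution: the function names, the evaluation point `x = 0.5`, and the number of terms are assumptions, and the ranges in the comments are the standard convergence intervals rather than the course notes, which are not shown here.

```r
# Partial sums of the standard Maclaurin series (expansion about 0)

# 1 / (1 - x) = sum_{k >= 0} x^k, converges for |x| < 1
taylor_geometric <- function(x, n) sum(x^(0:n))

# e^x = sum_{k >= 0} x^k / k!, converges for all real x
taylor_exp <- function(x, n) sum(x^(0:n) / factorial(0:n))

# ln(1 + x) = sum_{k >= 1} (-1)^(k + 1) x^k / k, converges for -1 < x <= 1
taylor_log1p <- function(x, n) {
  k <- 1:n
  sum((-1)^(k + 1) * x^k / k)
}

x <- 0.5  # assumed evaluation point inside all three convergence ranges

c(geometric = taylor_geometric(x, 60) - 1 / (1 - x),
  exponential = taylor_exp(x, 20) - exp(x),
  logarithm = taylor_log1p(x, 60) - log(1 + x))
```

Each difference should be close to zero; evaluating nearer the edge of a convergence range would need many more terms for the same accuracy.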
Data 605 Homework 15
Question 1 Find the equation of the regression line for the given points. Round any final values to the nearest hundredth, if necessary. (5.6, 8.8), (6.3, 12.4), (7, 14.8), (7.7, 18.2), (8.4, 20.8) Solution: x_values <- c(5.6, 6.3, 7, 7.7, 8.4); y_values <- c(8.8, 12.4, 14.8, 18.2, 20.8); reg_line <- lm(y_values ~ x_values); reg_line ...
3780 sym R (581 sym/5 pcs) 1 img
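The regression in the Data 605 Homework 15 excerpt above can be cross-checked with the closed-form least-squares formulas. The sketch below reuses the excerpt's `x_values` and `y_values`; the names `slope`, `intercept`, and `fit` are introduced here for illustration and are not from the original.

```r
x_values <- c(5.6, 6.3, 7, 7.7, 8.4)
y_values <- c(8.8, 12.4, 14.8, 18.2, 20.8)

# Closed-form simple linear regression:
# slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)
x_dev <- x_values - mean(x_values)
y_dev <- y_values - mean(y_values)
slope <- sum(x_dev * y_dev) / sum(x_dev^2)
intercept <- mean(y_values) - slope * mean(x_values)

# Fit the same model with lm() and compare coefficients
fit <- lm(y_values ~ x_values)
all.equal(unname(coef(fit)), c(intercept, slope))

round(c(intercept = intercept, slope = slope), 2)
```

Rounded to the nearest hundredth, both approaches give the line y = -14.80 + 4.26x.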
Data 621 Homework 5
Overview In this homework assignment, you will explore, analyze and model a data set containing information on approximately 12,000 commercially available wines. The variables are mostly related to the chemical properties of the wine being sold. The response variable is the number of sample cases of wine that were purchased by wine distribution c...
4567 sym R (49028 sym/94 pcs) 44 img 3 tbl
Data 621 Homework 4
Overview In this homework assignment, you will explore, analyze and model a dataset containing approximately 8000 records, each representing a customer at an auto insurance company. Each record has two response variables. The first response variable, TARGET_FLAG, is a 1 or a 0. A “1” means that the person was in a car crash. A zero means that the pe...
1093 sym R (22269 sym/17 pcs) 9 img 4 tbl