Publications by Diana Plunkett
Document
Introduction We will explore, analyze, and model a data set containing information on crime for various neighborhoods of a major city. Each record has 12 predictor variables and a response variable indicating whether the crime rate is above the median crime rate (1) or not (0). The model will be a binary logistic regression model on the train...
24974 sym R (27748 sym/27 pcs) 15 img 3 tbl
Binary
Introduction We will explore, analyze, and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether the crime rate is above the median crime rate (1) or not (0). The model will be a binary logistic regression model on the training data set to predict whe...
1473 sym 2 img 3 tbl
crime rt2
Introduction We will explore, analyze, and model a data set containing information on crime for various neighborhoods of a major city. Each record has a response variable indicating whether the crime rate is above the median crime rate (1) or not (0). The model will be a binary logistic regression model on the training data set to predict whe...
1477 sym R (2734 sym/5 pcs) 2 img 3 tbl
$ball
DATA LOAD Read in the Moneyball training data df <- read.csv('/Users/dianaplunkett/CUNY/621 Business Analytics and Data Mining/Homework 1/moneyball-training-data.csv') DATA EXPLORATION (& a little prep) Confirming exact number of rows and columns dim(df) ## [1] 2276 17 Getting the full summary, will refer back to this as I go summary(df) ## ...
6486 sym R (14046 sym/44 pcs) 8 img
605 Problem 2
Problem 2 You are to register for Kaggle.com (free) and compete in the House Prices: Advanced Regression Techniques competition. https://www.kaggle.com/c/house-prices-advanced-regression-techniques . I want you to do the following. Before doing any of the specific requirements, I will explore the data and do some tidying. df <- read.csv('/Users/dian...
8794 sym Python (57540 sym/134 pcs) 10 img
605 Problem 1
/Users/dianaplunkett/Downloads/house-prices-prediction-using-tfdf.ipynb Problem 1 Problem 1.1 Probability Density 1: X~Gamma. Using R, generate a random variable X that has 10,000 random Gamma pdf values. A Gamma pdf is completely described by n (a size parameter) and lambda (λ, a shape parameter). Choose any n greater than 3 and an expected value (...
5691 sym
605 Problem 2 Draft
library(tidyr) library(ggplot2) library(Matrix) ## ## Attaching package: 'Matrix' ## The following objects are masked from 'package:tidyr': ## ## expand, pack, unpack library(corrplot) ## corrplot 0.92 loaded library(MASS) Problem 2 You are to register for Kaggle.com (free) and compete in the House Prices: Advanced Regression Techniques comp...
4167 sym R (26553 sym/60 pcs) 8 img
605 Assignment 14
Taylor Series Work out some Taylor Series expansions of popular functions. For each function, only consider its valid ranges as indicated in the notes when you are computing the Taylor Series expansion. References: https://www.3blue1brown.com/lessons/taylor-series https://byjus.com/maths/taylor-series/ Recall that the Taylor Series is a method for ...
2478 sym
605 Assignment 13
1 Use integration by substitution to solve the integral below. \(\int 4e^{-7x} dx\) \(u = -7x, du = -7dx\) \(\int 4e^u \frac{du}{-7} \Rightarrow -\frac{4}{7}e^{-7x} + C\) 2 Biologists are treating a pond contaminated with bacteria. The level of contamination is changing at a rate of \(\frac{dN}{dt} = - \frac{3150}{t^4} - 220\) bacteria per cubic cen...
4453 sym 2 img
605 HW12
The attached who.csv dataset contains real-world data from 2008. The variables included follow. Country: name of the country LifeExp: average life expectancy for the country in years InfantSurvival: proportion of those surviving to one year or more Under5Survival: proportion of those surviving to five years or more TBFree: proportion of the popula...
4930 sym R (4208 sym/21 pcs) 3 img