Publications by Ken Popkin

Data607 Project 1

18.02.2020

Setup library(dplyr) library(stringr) Load tournament data data <- read.table("tournamentinfo.txt", sep = "\t", skip = 4) head(data,4) ## V1 ## 1 1 | GARY HUA |6.0 |W 39|W 21|W 18|W 14|W 7|D 12|D 4| ## 2 ON | 15445...

362 sym R (5034 sym/20 pcs)

Data606 Week 4 Homework

18.02.2020

Problem 1 Area under the curve, Part I. (4.1, p. 142) What percent of a standard normal distribution \(N(\mu=0, \sigma=1)\) is found in each region? Be sure to draw a graph. cat('z < -1.35 is', pnorm(-1.35)) ## z < -1.35 is 0.08850799 DATA606::normalPlot(mean = 0, sd = 1, bounds = c(-1.35,4), tails = TRUE) cat('z > 1.48 is', pnorm(1.48, lower.tail = FALSE)) ## z >...

6547 sym R (2950 sym/55 pcs) 7 img
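As a minimal sketch of the tail-area calculation in the Week 4 homework excerpt above: `pnorm()` returns the lower-tail probability by default, so an upper-tail region such as z > 1.48 needs `lower.tail = FALSE` (or one minus the lower tail). The variable names are illustrative and not taken from the original notebook.

```r
# Areas under the standard normal curve N(mean = 0, sd = 1).
# pnorm(q) gives P(Z < q), the area to the LEFT of q.
p_below <- pnorm(-1.35)                     # P(Z < -1.35)

# For the area to the RIGHT of a cutoff, ask for the upper tail.
p_above <- pnorm(1.48, lower.tail = FALSE)  # P(Z > 1.48)

cat("z < -1.35 is", p_below, "\n")
cat("z > 1.48 is", p_above, "\n")
```

The two tails of any cutoff sum to one, so `1 - pnorm(1.48)` gives the same value as the second call.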

Data606 Week 4 Lab

18.02.2020

In this lab we’ll investigate the probability distribution that is most central to statistics: the normal distribution. If we are confident that our data are nearly normal, that opens the door to many powerful statistical methods. Here we’ll use the graphical tools of R to assess the normality of our data and also learn how to generate random...

9287 sym R (3954 sym/35 pcs) 14 img

Data607 Week3Assignment

12.02.2020

library('dplyr') library(stringr) Problem 1 Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset, provide code that identifies the majors that contain either “DATA” or “STATISTICS” majors <- read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv') tail(majors,1) #...

1697 sym R (2085 sym/17 pcs)
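One possible way to approach Problem 1 of the Week 3 assignment above is a regular expression with alternation. The sketch below uses a small hand-typed data frame as a stand-in for the 173-row majors list (an assumption, so that nothing is downloaded), and base R's `grepl()` rather than whatever the original notebook used.

```r
# Illustrative stand-in for the Major column of the majors list;
# the real dataset has 173 rows and is not loaded here.
majors <- data.frame(
  Major = c("COMPUTER PROGRAMMING AND DATA PROCESSING",
            "STATISTICS AND DECISION SCIENCE",
            "GENERAL AGRICULTURE",
            "MANAGEMENT INFORMATION SYSTEMS AND STATISTICS"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE for every element that matches the pattern;
# "DATA|STATISTICS" matches either word anywhere in the string.
hits <- majors$Major[grepl("DATA|STATISTICS", majors$Major)]
print(hits)
```

With stringr loaded, `str_detect(majors$Major, "DATA|STATISTICS")` would produce the same logical vector.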

Class Presentation for Data 606

11.02.2020

library(png) library(grid) img <- readPNG("C:/Users/user/Documents/00_Applications_DataScience/CUNY/DATA606/Presentation/Problem_217.png") grid.raster(img) Question 1: Would the mean or the median best represent what we might think of as a typical income for the 42 patrons at this coffee shop? What does this say about the robustness of the ...

969 sym R (159 sym/1 pcs) 1 img
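The mean-versus-median question in the presentation excerpt above can be illustrated with a few made-up numbers. The incomes below are hypothetical and are not the data from the original problem; the point is only to show how one extreme value moves the mean far more than the median.

```r
# Hypothetical incomes (not the coffee shop data from the problem).
incomes <- c(60000, 62000, 63000, 65000, 66000, 68000, 70000)

# Add a single very large income and compare the two summaries.
with_outlier <- c(incomes, 5000000)

cat("mean before:  ", mean(incomes), "\n")
cat("median before:", median(incomes), "\n")
cat("mean after:   ", mean(with_outlier), "\n")
cat("median after: ", median(with_outlier), "\n")
```

In this sketch the median shifts by only a few hundred while the mean increases roughly tenfold, which is what robustness to outliers means in practice.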

Data607 Week2 Assignment

06.02.2020

knitr::opts_chunk$set(echo = TRUE) options(warn=-1) This notebook retrieves data from the movie surveys database and uses the information retrieved to answer the following questions: Surveys: 1. How many surveys have been completed? Participants in Surveys 2. How many participants were in each survey? 3. What is the min, max, and average age of ...

985 sym R (2300 sym/24 pcs) 2 img

Data606 Week 3 Lab

09.02.2020

Hot Hands Basketball players who make several baskets in succession are described as having a hot hand. Fans and players have long believed in the hot hand phenomenon, which refutes the assumption that each shot is independent of the next. However, a 1985 paper by Gilovich, Vallone, and Tversky collected evidence that contradicted this belief and...

11435 sym R (3872 sym/27 pcs) 3 img
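To make the independence assumption in the Week 3 lab excerpt above concrete, here is a rough sketch that simulates an independent shooter and measures hit streaks. The 45% hit probability, the 133 shots, and the seed are illustrative assumptions, and the streak calculation uses base R's `rle()` instead of any helper function from the lab itself.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

outcomes <- c("H", "M")  # hit or miss

# Each shot is drawn independently with the same hit probability.
sim_shots <- sample(outcomes, size = 133, replace = TRUE,
                    prob = c(0.45, 0.55))

# Run-length encoding groups consecutive identical outcomes;
# keeping only the "H" runs gives the length of every hit streak.
runs <- rle(sim_shots)
hit_streaks <- runs$lengths[runs$values == "H"]

print(table(hit_streaks))
```

Comparing a streak table like this with the one from a real player is one way to judge whether the real streaks look longer than independence alone would produce. This version only counts streaks of at least one hit, so its counts may differ from a helper that also records zero-length streaks.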

Data607 Assignment 5

24.02.2020

Assignment Summary The data for this assignment is loaded in a wide format, which makes it difficult to begin exploratory work right away. To address this, the data will be reformatted from wide to long format. The reformatted data will then be used for exploratory work on the following questions. Information about each ai...

2660 sym R (3881 sym/14 pcs) 3 img
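The wide-to-long reshaping described in the Assignment 5 summary above might look roughly like the following. The data frame and its column names (`group`, `x`, `y`) are hypothetical, and the sketch uses base R only; the original notebook may well have used a tidyr function instead.

```r
# Hypothetical wide table: one row per group, one column per measure.
wide <- data.frame(
  group = c("g1", "g2"),
  x = c(1, 2),
  y = c(3, 4),
  stringsAsFactors = FALSE
)

# Every column except the identifier becomes a (variable, value) pair.
value_cols <- setdiff(names(wide), "group")

long <- data.frame(
  group    = rep(wide$group, times = length(value_cols)),
  variable = rep(value_cols, each = nrow(wide)),
  value    = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

print(long)
```

In long format each row holds a single observation, which tends to make grouping and plotting easier. `tidyr::pivot_longer()` or the older `tidyr::gather()` performs the same transformation in one call.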

Data606 Week 7 Homework

07.03.2020

Problem 1 Working backwards, Part II. (5.24, p. 203) A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and...

6417 sym R (1854 sym/12 pcs) 4 img
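The working-backwards logic in Problem 1 of the Week 7 homework above can be sketched as follows, using only the numbers given in the excerpt (interval (65, 77), 90% confidence, n = 25). Because the population standard deviation is unknown, the sketch assumes a t critical value with n - 1 degrees of freedom. The final step, backing out a sample standard deviation, is an assumption about what the truncated problem statement goes on to ask.

```r
lower <- 65
upper <- 77
n <- 25

# A confidence interval is symmetric around the sample mean.
sample_mean <- (lower + upper) / 2       # midpoint of the interval
margin_of_error <- (upper - lower) / 2   # half the interval width

# 90% confidence leaves 5% in each tail; sd unknown, so use t.
t_star <- qt(0.95, df = n - 1)

# margin_of_error = t_star * s / sqrt(n), solved for s.
std_error <- margin_of_error / t_star
sample_sd <- std_error * sqrt(n)

cat("sample mean:", sample_mean, "\n")
cat("margin of error:", margin_of_error, "\n")
cat("implied sample sd:", sample_sd, "\n")
```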

Data606 Lab 7

07.03.2020

North Carolina births In 2004, the state of North Carolina released a large data set containing information on births recorded in this state. This data set is useful to researchers studying the relation between habits and practices of expectant mothers and the birth of their children. We will work with a random sample of observations from this da...

6142 sym R (5898 sym/32 pcs) 9 img 1 tbl