Publications by Irene Jacob
DATA 607_Assignment_1
Introduction This R Markdown document works with the data available at the following link: https://projects.fivethirtyeight.com/2020-general-data/presidential_national_toplines_2020.csv This data gives the forecast for the 2020 elections. Data D <- read.csv('https://projects.fivethirtyeight.com/2020-general-data/pres...
566 sym R (60072 sym/4 pcs)
DATA 607_Assignment_2
ASSIGNMENT 2 Choose six recent popular movies. Ask at least five people that you know (friends, family, classmates, imaginary friends if necessary) to rate each of these movies that they have seen on a scale of 1 to 5. Take the results (observations) and store them in a SQL database of your choosing. Load the information from the SQL database int...
930 sym R (2634 sym/21 pcs) 8 img
Data606_Lab 3
Hot Hands Basketball players who make several baskets in succession are described as having a hot hand. Fans and players have long believed in the hot hand phenomenon, which refutes the assumption that each shot is independent of the next. However, a 1985 paper by Gilovich, Vallone, and Tversky collected evidence that contradicted this belief and...
9922 sym R (4185 sym/32 pcs) 7 img
Data606_Homework 1
Smoking habits of UK residents. A survey was conducted to study the smoking habits of UK residents. Below is a data matrix displaying a portion of the data collected in this survey. Note that “£” stands for British Pounds Sterling, “cig” stands for cigarettes, and “N/A” refers to a missing component of the data. What does each row o...
6074 sym
Data606_Homework 2
Stats scores Below are the final exam scores of twenty introductory statistics students. 57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94 Create a box plot of the distribution of these scores. The five number summary provided below may be useful. scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82,...
5516 sym R (773 sym/7 pcs) 2 img
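A minimal sketch of the box plot exercise above, using the twenty scores listed in the exercise text. The choice of `fivenum()` (Tukey's hinges) for the five-number summary is an assumption, since the original code is truncated; `quantile()` with its default settings can return slightly different quartiles.

```r
# Final exam scores, copied from the exercise text above.
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
            79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max.
five <- fivenum(scores)
print(five)

# Horizontal box plot of the same scores.
boxplot(scores, horizontal = TRUE, xlab = "Final exam score")
```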
Data606_Lab 2
Exercise 1 The three histograms below show that as the bin width increases, less detail of the distribution remains visible. Of the three, the histogram with a bin width of 15 shows the most detail. ggplot(data = nycflights, aes(dep_delay)) + geom_histogram() ## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`. ggplot(da...
1411 sym R (2843 sym/20 pcs) 10 img
Data606_Lab 1
library(tidyverse) ## Warning: package 'tibble' was built under R version 4.0.3 library(openintro) Exercise 1 Command to extract the count of girls baptized arbuthnot$girls ## [1] 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 4784 5332 5200 4910 4617 ## [16] 3997 3919 3395 3536 3181 2746 2722 2840 2908 2959 3179 3349 3382 3289 3013 ## [31...
978 sym R (1263 sym/13 pcs) 4 img
Data606_Homework 6
2010 Healthcare Law On June 28, 2012 the U.S. Supreme Court upheld the much debated 2010 healthcare law, declaring it constitutional. A Gallup poll released the day after this decision indicates that 46% of 1,012 Americans agree with this decision. At a 95% confidence level, this sample has a 3% margin of error. Based on this information, determi...
6385 sym R (962 sym/20 pcs)
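The margin of error quoted in the Gallup exercise above can be checked with the usual normal-approximation formula for a sample proportion. This is a sketch under that assumption, not the author's own solution; the variable names are illustrative.

```r
# Values taken from the exercise text.
p_hat <- 0.46   # sample proportion agreeing with the decision
n     <- 1012   # sample size

# Critical value for a 95% confidence level.
z_star <- qnorm(0.975)

# Margin of error and the resulting confidence interval.
me <- z_star * sqrt(p_hat * (1 - p_hat) / n)
ci <- c(lower = p_hat - me, upper = p_hat + me)

print(round(me, 3))
print(round(ci, 3))
```

The computed margin of error comes out close to the 3% stated in the exercise.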
Data606_Homework 7
Working backwards, Part II A 90% confidence interval for a population mean is (65, 77). The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviat...
5421 sym R (1924 sym/19 pcs) 4 img
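The "working backwards" calculation above can be sketched as follows, assuming a t-interval with n - 1 degrees of freedom (the population standard deviation is unknown). The variable names are illustrative and not taken from the original document.

```r
# Values taken from the exercise text.
lower <- 65
upper <- 77
n     <- 25
conf_level <- 0.90

# The sample mean is the midpoint of the interval;
# the margin of error is half its width.
x_bar <- (lower + upper) / 2
me    <- (upper - lower) / 2

# Critical t value for a two-sided 90% interval with 24 degrees of freedom.
t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)

# Margin of error = t_star * s / sqrt(n), solved for s.
s <- me * sqrt(n) / t_star

print(c(mean = x_bar, margin = me, sd = round(s, 2)))
```

This gives a sample mean of 71, a margin of error of 6, and a sample standard deviation of roughly 17.5.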
DATA 607_Assignment_5
Tidying and Transforming Data 1. A .csv file of the data was created and is loaded below. While loading, the empty fields are set to “NA”. The first column is named Airline and the second column is named Status. flight <- read.table("https://raw.githubusercontent.com/irene908/DATA-607/master/Assignment%205_Delays.csv", header=TRUE, sep=","...
1156 sym R (3376 sym/17 pcs) 2 img