Publications by Vincent Bianco

607 - Homework 3

24.02.2020

1. Initialize the stringr library and load majors-list.csv. library(stringr) major_list <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv",header=TRUE,sep=",") Use str_subset to show only the strings with “DATA” or “STATISTICS” in the majors list. str_subset(major_list$Major, "(DATA|STATI...

1139 sym R (569 sym/6 pcs)
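
The excerpt from 607 - Homework 3 is cut off partway through its regular expression, so the following is only a minimal sketch of how str_subset filters on an alternation pattern. The vector `majors` is a hypothetical stand-in for major_list$Major, and the pattern "(DATA|STATISTICS)" is an assumption based on the two words named in the excerpt, not the author's exact expression.

```r
library(stringr)

# Hypothetical sample of major names, standing in for major_list$Major
majors <- c("COMPUTER SCIENCE",
            "STATISTICS AND DECISION SCIENCE",
            "DATA PROCESSING",
            "ART HISTORY")

# Keep only the strings that match either alternative in the pattern
matches <- str_subset(majors, "(DATA|STATISTICS)")
print(matches)
```

Because str_subset returns the matching strings themselves rather than their positions, the result can be read directly as the filtered list of majors.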

607 - Project 1

24.02.2020

Introduction: The purpose of this project is to create a methodology for reading specific fields of data about chess players from a structured list of chess tournament results. The end result will be a file containing each player's name, state, number of points, pre-rating, and the average pre-rating of the opponents they faced. First, we start by i...

3878 sym R (27001 sym/24 pcs)
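
The last field described in the Project 1 excerpt, the average pre-rating of the opponents each player faced, can be sketched as follows. This is not the author's parsing code: the tournament file format is not shown in the excerpt, so the data frame `players`, the list `opponents`, and every value in them are hypothetical placeholders.

```r
# Hypothetical players; ids, names, and ratings are placeholders
players <- data.frame(
  id         = 1:4,
  name       = c("P1", "P2", "P3", "P4"),
  pre_rating = c(1500, 1600, 1700, 1800)
)

# Hypothetical opponent ids faced by each player, in the same row order
opponents <- list(c(2, 3, 4), c(1), c(1, 4), c(1, 3))

# Look up each opponent's pre-rating by id, then average per player
players$avg_opp_pre_rating <- sapply(opponents, function(ids) {
  mean(players$pre_rating[match(ids, players$id)])
})

print(players)
```

Looking opponents up with match() rather than indexing by position keeps the sketch correct even if player ids are not consecutive row numbers.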

606 Homework 4

24.02.2020

Area under the curve, Part I. (4.1, p. 142) What percent of a standard normal distribution \(N(\mu=0, \sigma=1)\) is found in each region? Be sure to draw a graph. (a) \(Z < -1.35\) P(Z < -1.35) = 0.0885 (b) \(Z > 1.48\) P(Z > 1.48) = 0.0694 (c) \(-0.4 < Z < 1.5\) P(-0.4 < Z < 1.5) = 0.589 (d) \(|Z| > 2\) P(|Z| > 2) = P(Z > 2) + P(Z < -2) = 2P(...

6931 sym R (828 sym/13 pcs) 9 img
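
The probabilities quoted in the 606 Homework 4 excerpt can be checked with base R's pnorm, which returns the standard normal cumulative distribution by default. This is a sketch of one way to reproduce them, not necessarily how the author computed them. The variable names are illustrative.

```r
# (a) P(Z < -1.35): lower tail
p_a <- pnorm(-1.35)

# (b) P(Z > 1.48): upper tail
p_b <- pnorm(1.48, lower.tail = FALSE)

# (c) P(-0.4 < Z < 1.5): difference of two lower tails
p_c <- pnorm(1.5) - pnorm(-0.4)

# (d) P(|Z| > 2): two equal tails, by symmetry of the normal curve
p_d <- 2 * pnorm(2, lower.tail = FALSE)

print(round(c(a = p_a, b = p_b, c = p_c, d = p_d), 4))
```

Using lower.tail = FALSE for the upper-tail cases avoids writing 1 - pnorm(...) by hand.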

DATA 607 Assignment 1

02.02.2020

Overview: The data in this assignment is pulled from the FiveThirtyEight GitHub repository corresponding to the article “Why Some Tennis Matches Take Forever”. In this article, Carl Bialik discusses the duration of tennis matches, with a particular focus on the time of each individual point as well as the time taken between consecutive points...

2197 sym R (3761 sym/10 pcs)

606 Homework 2

10.02.2020

Stats scores. (2.33, p. 78) Below are the final exam scores of twenty introductory statistics students. 57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94 Create a box plot of the distribution of these scores. The five-number summary provided below may be useful. scores <- data.frame('score' = c(57, 66, 69, 71, 72, 73...

6257 sym R (352 sym/8 pcs) 5 img
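
For the twenty exam scores listed in the 606 Homework 2 excerpt, a five-number summary and box plot can be sketched in base R as below. The excerpt's own code is truncated, so this is an illustrative alternative that uses a plain numeric vector instead of the author's data frame. Note that fivenum reports Tukey's hinges, which can differ slightly from the quartiles other functions report.

```r
# The twenty final exam scores listed in the exercise
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
            79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Minimum, lower hinge, median, upper hinge, maximum
five_num <- fivenum(scores)
print(five_num)

# Horizontal box plot of the score distribution
boxplot(scores, horizontal = TRUE, xlab = "Final exam score")
```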

DATA 607 HW 5

02.03.2020

Introduction: The purpose of this assignment is to tidy, transform, and analyze a given table of flights from two different airlines to 5 different cities. This data shows the number of flights that arrive on time or are delayed at a given destination. We will use tidyr to reshape the data into an easier format for analysis, then we will use ggpl...

1258 sym R (1455 sym/12 pcs) 1 img
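
The reshaping step described in the DATA 607 HW 5 excerpt can be sketched with tidyr's pivot_longer and pivot_wider. The actual flight table is not shown in the excerpt, so the airline labels, city column names, and counts below are hypothetical placeholders, and only two cities are used to keep the example short. The author may have used different tidyr functions.

```r
library(tidyr)

# Hypothetical wide table; airlines, cities, and counts are placeholders
flights_wide <- data.frame(
  airline = c("A", "A", "B", "B"),
  status  = c("on time", "delayed", "on time", "delayed"),
  city_1  = c(10, 2, 20, 5),
  city_2  = c(30, 3, 40, 8)
)

# One row per airline, status, and city
flights_long <- pivot_longer(
  flights_wide,
  cols      = c(city_1, city_2),
  names_to  = "city",
  values_to = "flights"
)

# Spread status back into columns so a delay rate can be computed per row
flights_rate <- pivot_wider(
  flights_long,
  names_from  = status,
  values_from = flights
)
flights_rate$delay_rate <- flights_rate$delayed /
  (flights_rate$delayed + flights_rate$`on time`)

print(flights_rate)
```

With one row per airline and city, delay rates can be compared across airlines directly or passed to a plotting function.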