Publications by Shane Hylton

DATA 606 HW 4

27.09.2021

Question 1
Area under the curve, Part I. (4.1, p. 142) What percent of a standard normal distribution \(N(\mu=0, \sigma=1)\) is found in each region? Be sure to draw a graph.
\(Z < -1.35\): The percentage is 8.85%.
\(Z > 1.48\): The percentage is 6.94%.
\(-0.4 < Z < 1.5\): The percentage is 58.9%.
\(|Z| > 2\): The percentage is 4.55%.
## Loading requi...

7789 sym R (2177 sym/49 pcs) 7 img
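The four percentages in the DATA 606 HW 4 excerpt can be checked against the standard normal CDF. The following is a minimal sketch using base R's `pnorm()`; the variable names are illustrative and not taken from the original homework.

```r
# Standard normal probabilities for the four regions, via the CDF pnorm().
p_a <- pnorm(-1.35)               # P(Z < -1.35)
p_b <- 1 - pnorm(1.48)            # P(Z > 1.48)
p_c <- pnorm(1.5) - pnorm(-0.4)   # P(-0.4 < Z < 1.5)
p_d <- 2 * pnorm(-2)              # P(|Z| > 2), using symmetry of the normal curve

# Express as percentages rounded to two decimals.
pct <- round(100 * c(a = p_a, b = p_b, c = p_c, d = p_d), 2)
print(pct)
```

The two-sided region uses symmetry, so doubling the lower tail avoids computing both tails separately.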

606 Lab 2

13.09.2021

data(nycflights)
names(nycflights)
## [1] "year" "month" "day" "dep_time" "dep_delay" "arr_time"
## [7] "arr_delay" "carrier" "tailnum" "flight" "origin" "dest"
## [13] "air_time" "distance" "hour" "minute"
?nycflights
glimpse(nycflights)
## Rows: 32,735
## Columns: 16
## $ year <int> 2013, 2013, 2013...

5698 sym R (3410 sym/22 pcs) 9 img

607 Homework 3

12.09.2021

Question 1
link <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv"
majors <- read.csv(url(link), na.strings = "")
stat <- majors %>% filter(grepl("STATISTICS", Major))
data <- majors %>% filter(grepl("DATA", Major))
dstat <- rbind(data, stat)
dstat
## FOD1P ...

2086 sym R (1952 sym/17 pcs)
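The 607 Homework 3 excerpt filters the majors list twice and binds the results. As a possible alternative, one regular expression with alternation can match both terms in a single pass. This is a sketch in base R on a small hypothetical data frame; the three rows are illustrative stand-ins rather than the downloaded data, and only the `Major` column name comes from the excerpt.

```r
# Hypothetical toy rows standing in for the majors list (not the real download).
majors <- data.frame(
  Major = c(
    "STATISTICS AND DECISION SCIENCE",
    "COMPUTER PROGRAMMING AND DATA PROCESSING",
    "GENERAL AGRICULTURE"
  ),
  stringsAsFactors = FALSE
)

# "DATA|STATISTICS" matches rows containing either term.
dstat <- majors[grepl("DATA|STATISTICS", majors$Major), , drop = FALSE]
print(dstat)
```

Unlike `rbind(data, stat)`, this keeps the matching rows in their original order and cannot duplicate a row that contains both terms.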

DATA 606 HW 1

06.09.2021

Question 1
Each row of the matrix represents a single person. The total number of participants is 1693, which was found using the line in the code chunk below.
Sex: Categorical
Age: Numerical, discrete
Marital Status: Categorical
Highest Qualification: Categorical, Ordinal
Nationality: Categorical
Ethnicity: Categorical
Gross Income: Categorica...

6095 sym

R Bridge Homework 3

07.08.2021

Mission My goal is to see how strongly home runs correlate with strikeouts. The question: do more home runs indicate more strikeouts? How does this compare to hits versus strikeouts? My hypothesis is that most of the big-name players with a lot of home runs also accumulate copious amounts of strikeouts. I anticipate that hits are generally les...

1444 sym R (5480 sym/23 pcs) 5 img
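The comparison described in the R Bridge Homework 3 excerpt comes down to two correlation coefficients. The sketch below shows one way to compute them with base R's `cor()`. Every number in it is made up purely for illustration and does not come from the author's data set; the column names are also hypothetical.

```r
# Made-up season totals for six hypothetical players (illustration only).
batting <- data.frame(
  home_runs  = c(5, 12, 20, 28, 35, 41),
  hits       = c(150, 120, 160, 140, 155, 130),
  strikeouts = c(60, 85, 110, 150, 140, 180)
)

# Pearson correlation of each statistic with strikeouts.
r_hr   <- cor(batting$home_runs, batting$strikeouts)
r_hits <- cor(batting$hits, batting$strikeouts)

print(c(home_runs = r_hr, hits = r_hits))
```

With real data, comparing the two coefficients side by side is what would support or contradict the stated hypothesis.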

DATA 607 HW 1

30.08.2021

Predicting MLB Games 2021 This FiveThirtyEight article sought to predict the result of individual games in the 2021 MLB season. The site also maintains a list of all MLB teams with predictions for their chances to make the playoffs and advance to the World Series. The list is updated after every game, so the data is always changing. https://projects.fi...

1722 sym R (3659 sym/5 pcs) 4 img

DATA 606 Lab 1

06.09.2021

library(tidyverse)
library(openintro)
Exercise 1
To extract all counts of the births of girls, I would simply use arbuthnot$girls.
arbuthnot$girls
## [1] 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 4784 5332 5200 4910 4617
## [16] 3997 3919 3395 3536 3181 2746 2722 2840 2908 2959 3179 3349 3382 3289 3013
## [31] 2781 3247 4107 4803 4881 5...

10746 sym R (6158 sym/51 pcs) 14 img

606 HW 2

13.09.2021

Stats scores. (2.33, p. 78) Below are the final exam scores of twenty introductory statistics students.
57, 66, 69, 71, 72, 73, 74, 77, 78, 78, 79, 79, 81, 81, 82, 83, 83, 88, 89, 94
Create a box plot of the distribution of these scores. The five-number summary provided below may be useful.
Mix-and-match. (2.10, p. 57) Describe the distributio...

6972 sym R (571 sym/7 pcs) 5 img
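For the twenty exam scores listed in the 606 HW 2 excerpt, a five-number summary and the box plot statistics can be computed directly. This is a minimal sketch with base R; it is not necessarily how the original homework produced its plot, and `fivenum()` uses Tukey's hinges, which can differ slightly from the quartiles reported by `summary()`.

```r
# The twenty final exam scores from the exercise.
scores <- c(57, 66, 69, 71, 72, 73, 74, 77, 78, 78,
            79, 79, 81, 81, 82, 83, 83, 88, 89, 94)

# Minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(scores)
print(five)

# plot = FALSE returns the box plot statistics without drawing;
# remove it to draw the plot itself.
bp <- boxplot(scores, plot = FALSE)
print(bp$out)   # points beyond 1.5 * IQR from the hinges
```

Under the default 1.5 * IQR rule, any score listed in `bp$out` would be drawn as an individual point rather than as part of a whisker.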

DATA 607 Project 1

19.09.2021

Reading the Chess Data First, I isolated the rows featuring a name by following a pattern of skipping the rows that are not needed for R to evaluate, such as the first five rows and all rows between names. I used the same approach to collect all rows containing a player rating to better isolate patterns. Because the indexes remain the same for bo...

1300 sym R (3155 sym/11 pcs) 3 img
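The row-skipping pattern described in the DATA 607 Project 1 excerpt can be sketched with `seq()`. The layout below is an assumption made for illustration: a hypothetical vector of text lines with four header rows and three rows per player. The real file's offsets and block size may differ, and none of the names here come from the original project.

```r
# Hypothetical stand-in for the raw text lines (layout is assumed, not the real file).
raw_lines <- c(
  "header 1", "header 2", "header 3", "header 4",
  "name A", "rating A", "-----",
  "name B", "rating B", "-----",
  "name C", "rating C", "-----"
)

first_name_row <- 5   # assumed position of the first name row
block_size <- 3       # assumed number of rows per player

# Every block_size-th row starting at the first name row holds a name;
# the rating row is assumed to follow immediately after.
name_idx <- seq(first_name_row, length(raw_lines), by = block_size)
rating_idx <- name_idx + 1

# Both vectors share the same ordering, so they line up row by row.
players <- data.frame(
  name = raw_lines[name_idx],
  rating = raw_lines[rating_idx],
  stringsAsFactors = FALSE
)
print(players)
```

Because both index vectors are derived from the same sequence, the name and rating for each player stay aligned without any join step.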

Project 3 Document 1

10.10.2021

Communication Methods Our team will be using Slack for day-to-day communication and information sharing. We will be using GitHub and RPubs for additional document sharing. We have already created a Slack channel for the project. We are using Google Meet to collaborate further on the project. Data Sources We used Kaggle to find a number of good d...

1767 sym