Publications by Nicholas Jacob

Hypothesis Testing

19.01.2021

Olympic Medals Data Exploration
I discovered this data set at https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results. To create my data file, I clicked the icon to upload it to Google Sheets, published it to the web as a CSV, and copied the link. Some data can be uploaded to GitHub, but if the file is large...

3480 sym R (17337 sym/56 pcs) 7 img
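
The loading step described in this entry can be sketched as follows. This is only an illustration: the helper name `load_csv`, the column names, and the rows are made up, and a temporary local file stands in for the published sheet so the snippet runs without network access.

```r
# Hypothetical helper: read.csv() accepts either a local path or a URL.
load_csv <- function(source) {
  read.csv(source, stringsAsFactors = FALSE)
}

# Stand-in for the published sheet: a tiny local file with made-up rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name,Sport,Medal",
             "Athlete A,Swimming,Gold",
             "Athlete B,Rowing,NA",
             "Athlete C,Tennis,Silver"), tmp)

medals <- load_csv(tmp)
str(medals)  # the literal text "NA" is read as a missing value by default
```

With a real published sheet, the "publish to the web" CSV link would be passed in place of `tmp`.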

Gunners Hypothesis

27.01.2021

Visualizations
dates <- as.Date(Google_Trends_Combined_Feb_Mar_2018$Day, "%m/%d/%Y")
Google_Trends_Combined_Feb_Mar_2018["Day"] = dates
ggplot(data = Google_Trends_Combined_Feb_Mar_2018, aes(x = Day, y = PCOS_WW, group = 1)) + geom_path() + xlab("")
ggplot(data = Google_Trends_Combined_Feb_Mar_2018, aes(y = PCOS_WW)) + geom_boxplot(fill = "#...

159 sym R (2389 sym/17 pcs) 4 img

Olympic Regression

02.02.2021

Scatter Plot
Let’s compare height and weight, as we will surely find some correlation there!
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 64263 rows containing non-finite values (stat_smooth).
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 64263 rows containing non-finite values (stat_smooth).
## Warning: Removed 64...

1227 sym R (7198 sym/21 pcs) 5 img
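
The "non-finite values" warnings in this entry come from rows where a measurement is missing. A minimal sketch of handling them before measuring the height/weight relationship is below. The data frame `athletes` and its values are hypothetical, not taken from the Olympic data.

```r
# Hypothetical measurements; NA marks a missing height or weight.
athletes <- data.frame(
  Height = c(170, 180, NA, 160, 175, 190),
  Weight = c(65, 80, 70, NA, 72, 88)
)

# Keep only rows where both values are present, and count what was dropped.
complete <- athletes[complete.cases(athletes), ]
n_dropped <- nrow(athletes) - nrow(complete)

# Correlation and a simple linear fit on the complete rows.
r <- cor(complete$Height, complete$Weight)
fit <- lm(Weight ~ Height, data = complete)
coef(fit)
```

Dropping incomplete rows explicitly makes the number of discarded observations visible instead of leaving it to a plotting warning.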

Document

01.02.2021

Say something continuing link to previous part https://rpubs.com/nurfnick/715604 Now I can knit
model = lm(W ~ PTS, data = data)
summary(model)
##
## Call:
## lm(formula = W ~ PTS, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -0.8066 -0.2762  0.2136  0.2136  0.2339
##
## Coefficients:
## Estimate St...

143 sym R (4781 sym/36 pcs) 8 img

Olympic Contingency Table

25.02.2021

Goodness of Fit
Using the Olympic data I will apply the methods of the \(\chi^2\) distribution. Let’s find a nice contingency table to look at.
table(data$Medal)
##
## Bronze   Gold Silver
##  13295  13372  13116
Okay, so let’s ask if there is any difference in the number of medals of each type being awarded. Of course we assume that the num...

2186 sym R (8981 sym/17 pcs) 3 img
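
A goodness-of-fit test on the medal counts printed in this entry might look like the sketch below. It assumes the null hypothesis is that the three medal types are equally likely, which is the default for `chisq.test()` on a vector of counts. The excerpt is cut off before stating its own assumption, so treat that choice as a guess.

```r
# Counts copied from the table(data$Medal) output in the excerpt.
medals <- c(Bronze = 13295, Gold = 13372, Silver = 13116)

# With a plain vector of counts, chisq.test() tests against equal proportions.
fit <- chisq.test(medals)

fit$expected   # each expected count is the total divided by 3
fit$statistic  # sum of (observed - expected)^2 / expected
fit$parameter  # degrees of freedom: 3 categories - 1 = 2
fit$p.value
```

The statistic works out to about 2.6 on 2 degrees of freedom, which is not significant at the 5% level.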

Hockey Chi-Square

25.02.2021

Contingency Tables and Tests For Independence
data
##   Rk                    X AvAge GP W L OL PTS  PTS. GF GA SOW SOL  SRS
## 1  1  Philadelphia Flyers  27.2  4 3 1  0   6 0.750 15 11   0   0 0.38
## 2  2  Toronto Maple Leafs  29.0  4 3 1  0   6 0.750 14 12   0   0 1.85
## 3  3 Vegas Golden Knights  28.8  3 3 0  0   6 1.000 1...

468 sym R (7274 sym/19 pcs) 2 img
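
A test for independence on a contingency table can be sketched as below. The 2x2 table, its labels, and its counts are entirely hypothetical and do not come from the hockey data. The continuity correction is turned off here only so the statistic matches the textbook formula.

```r
# Hypothetical 2x2 table of counts (filled column by column).
tab <- matrix(c(30, 20, 15, 35), nrow = 2,
              dimnames = list(Conference = c("East", "West"),
                              Result = c("Win", "Loss")))

# Pearson chi-square test of independence, without Yates' correction.
res <- chisq.test(tab, correct = FALSE)

res$expected   # row total * column total / grand total
res$statistic
res$parameter  # (rows - 1) * (columns - 1) = 1
res$p.value
```

When any expected count is small (a common rule of thumb is below 5), `fisher.test(tab)` is a frequently used alternative.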

Olympic Cross Validation

08.04.2021

Cross Validation
I’d like to do cross validation in two ways. First, I am going to do the traditional 66% split: I’ll fit the model to the training data and then check the results on the remaining 34% testing data.
sports = c('Swimming', 'Tennis', 'Rowing', 'Gymnastics', 'Golf', 'Athletics', 'Bobsleigh')
dataLessSports <- data[which(data...

3812 sym R (11905 sym/27 pcs) 3 img
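
The 66%/34% split described in this entry can be sketched as follows. The excerpt does not show which model is fitted, so the simulated data frame `olympians` and the linear model of weight on height are assumptions made purely for illustration.

```r
set.seed(1)  # make the random split reproducible

# Simulated stand-in data (hypothetical, not the Olympic data set).
n <- 300
olympians <- data.frame(Height = rnorm(n, mean = 175, sd = 10))
olympians$Weight <- 0.9 * olympians$Height - 85 + rnorm(n, sd = 6)

# Draw 66% of the row indices for training; the rest form the test set.
train_idx <- sample(seq_len(n), size = floor(0.66 * n))
train <- olympians[train_idx, ]
test <- olympians[-train_idx, ]

# Fit on the training rows only, then score on the held-out rows.
model <- lm(Weight ~ Height, data = train)
pred <- predict(model, newdata = test)
rmse <- sqrt(mean((test$Weight - pred)^2))
rmse
```

Comparing this held-out error with the training error gives a rough sense of how well the fit generalizes.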

Olympic Proposal

08.04.2021

data = read.csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vSDpJqmVSks0f4vLzzcmcTfPJ8TSu4ziCNpTFy_fIY6LibZksRXzCfJYXj9qZd4NiofejxoYSkmLMwu/pub?output=csv')
Olympic Medals Data Exploration
For this first assignment, I want you to get into R and load your data! I discovered this data set at https://www.kaggle.com/heesoo37/120-years-of-olympi...

1412 sym R (3624 sym/5 pcs)

Hockey NonParametric

23.03.2021

This is a continuation of this project <https://rpubs.com/nurfnick/720194>
Ranks
Most of these methods use ranks! Let’s go ahead and show the ranks!
rank(data$PTS, ties.method = "average")
##  [1] 28.5 28.5 28.5 28.5 25.0 25.0 25.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0
## [16] 18.0 18.0 18.0 10.0 10.0 10.0 10.0 10.0  4.5  4.5  4.5  4.5  4.5 ...

2231 sym R (3330 sym/32 pcs)
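
How `ties.method = "average"` produces fractional ranks such as 28.5 and 4.5 is easier to see on a small vector. The values in `pts` below are made up for illustration.

```r
# Hypothetical points totals with ties.
pts <- c(6, 6, 4, 2, 2, 2)

# Ranks run from smallest to largest; tied values share the mean of
# the positions they occupy.
ranks <- rank(pts, ties.method = "average")
ranks
# 5.5 5.5 4.0 2.0 2.0 2.0
```

The three 2s occupy positions 1 to 3 and each receive rank 2, the single 4 gets rank 4, and the two 6s share positions 5 and 6, giving 5.5 each.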

Olympics NonParametric

23.03.2021

Olympics
I want to finish the Olympic project with some analysis using non-parametric methods. I am going to try to repeat many of the analyses I have already made (except for Matched Pairs).
Mann-Whitney U test or Wilcoxon Rank Sum Test
I will compare the median (NOT mean!) ages between the genders. My hypotheses are
\[ H_0: m_M = m_W\\ H_a...

2853 sym R (2764 sym/18 pcs) 3 img
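
A rank-sum comparison of two groups can be sketched with `wilcox.test()`. The ages below are invented, and the two-sided alternative is an assumption, since the excerpt is cut off before the alternative hypothesis is stated.

```r
# Hypothetical ages for two small groups (no ties, so the exact test applies).
men <- c(24, 27, 30, 22, 26)
women <- c(21, 23, 25, 20, 28)

res <- wilcox.test(men, women, alternative = "two.sided")

res$statistic  # W: number of (man, woman) pairs in which the man is older
res$p.value
```

Here W is 18 out of 25 possible pairs. A value near 12.5 would indicate no tendency for either group to be older.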