Publications by Nicholas Jacob
Hypothesis Testing
Olympic Medals Data Exploration: I discovered this data set at https://www.kaggle.com/heesoo37/120-years-of-olympic-history-athletes-and-results. To create my data file, I clicked the icon to upload it to Google Sheets, published it to the web as a CSV, and copied the link. Some data can be uploaded to GitHub, but if the file is large...
Gunners Hypothesis
Visualizations
dates <- as.Date(Google_Trends_Combined_Feb_Mar_2018$Day, "%m/%d/%Y")
Google_Trends_Combined_Feb_Mar_2018["Day"] = dates
ggplot(data = Google_Trends_Combined_Feb_Mar_2018, aes(x = Day, y = PCOS_WW, group = 1)) + geom_path() + xlab("")
ggplot(data = Google_Trends_Combined_Feb_Mar_2018, aes(y = PCOS_WW)) + geom_boxplot(fill = "#...
Olympic Regression
Scatter Plot Let’s compare height and weight as we will surely find some correlation there! ## `geom_smooth()` using formula 'y ~ x' ## Warning: Removed 64263 rows containing non-finite values (stat_smooth). ## `geom_smooth()` using formula 'y ~ x' ## Warning: Removed 64263 rows containing non-finite values (stat_smooth). ## Warning: Removed 64...
Document
Say something continuing link to previous part https://rpubs.com/nurfnick/715604 Now I can knit model = lm(W ~ PTS, data = data) summary(model) ## ## Call: ## lm(formula = W ~ PTS, data = data) ## ## Residuals: ## Min 1Q Median 3Q Max ## -0.8066 -0.2762 0.2136 0.2136 0.2339 ## ## Coefficients: ## Estimate St...
Olympic Contingency Table
Goodness of Fit Using the Olympic data I will apply the methods of the \(\chi^2\) distribution. Let’s find a nice contingency table to look at. table(data$Medal) ## ## Bronze Gold Silver ## 13295 13372 13116 Okay so let’s ask if there is any difference in the number of medals of each type being awarded. Of course we assume that the num...
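The excerpt is cut off before the test itself, so the following is only a minimal sketch of how the equal-frequency question could be checked with base R's `chisq.test()`. The counts are copied from the `table(data$Medal)` output above; the variable names `medals` and `gof` are illustrative and are not taken from the original post.

```r
# Medal counts copied from the table(data$Medal) output above.
medals <- c(Bronze = 13295, Gold = 13372, Silver = 13116)

# With no `p` argument, chisq.test() tests the counts against
# equal probabilities for every category (here 1/3 each).
gof <- chisq.test(medals)

gof$expected   # expected count per medal type under H0
gof$statistic  # chi-squared statistic
gof$parameter  # degrees of freedom (number of categories - 1)
gof$p.value    # p-value
```

For these counts the statistic comes out at about 2.60 on 2 degrees of freedom, with a p-value of about 0.27, so at the usual 5% level this sketch would not reject equal medal frequencies.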
Hockey Chi-Square
Contingency Tables and Tests For Independence
data
## Rk X AvAge GP W L OL PTS PTS. GF GA SOW SOL SRS
## 1 1 Philadelphia Flyers 27.2 4 3 1 0 6 0.750 15 11 0 0 0.38
## 2 2 Toronto Maple Leafs 29.0 4 3 1 0 6 0.750 14 12 0 0 1.85
## 3 3 Vegas Golden Knights 28.8 3 3 0 0 6 1.000 1...
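The contingency table the author built from the hockey data is not visible in this excerpt. As a rough sketch of what a test for independence looks like in base R, the example below uses a made-up 2 x 2 table: the row and column labels and all four counts are hypothetical and do not come from the hockey data.

```r
# Hypothetical 2 x 2 table of counts (NOT taken from the hockey data).
tab <- matrix(
  c(9, 6,
    7, 9),
  nrow = 2, byrow = TRUE,
  dimnames = list(Group = c("A", "B"), Outcome = c("Yes", "No"))
)

# Pearson chi-squared test of independence between rows and columns.
# correct = FALSE turns off the Yates continuity correction for 2 x 2 tables.
ind <- chisq.test(tab, correct = FALSE)

ind$expected   # expected counts if Group and Outcome were independent
ind$statistic  # chi-squared statistic
ind$p.value    # p-value on (rows - 1) * (cols - 1) = 1 degree of freedom
```

Checking `ind$expected` before reading the p-value is worthwhile, since the chi-squared approximation is usually considered unreliable when expected counts fall below about 5.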
Olympic Cross Validation
Cross Validation I’d like to do cross validation in two ways. I am first going to do the traditional split, 66%. I’ll fit the model to the training data and then check out the results on the remaining 34% testing data. sports = c('Swimming', 'Tennis', 'Rowing', 'Gymnastics', 'Golf', 'Athletics', 'Bobsleigh') dataLessSports <- data[which(data...
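The model being validated is cut off in this excerpt, so here is only a minimal sketch of the 66% / 34% split described above. It runs on simulated data: the data frame `athletes`, its `Height` and `Weight` columns, the seed, and the choice of a simple linear model are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed so the split is reproducible

# Simulated stand-in for the athlete data (hypothetical values).
n <- 300
athletes <- data.frame(Height = rnorm(n, mean = 175, sd = 10))
athletes$Weight <- -100 + athletes$Height + rnorm(n, sd = 8)

# Put 66% of the rows in the training set and the rest in the testing set.
train_idx <- sample(seq_len(n), size = floor(0.66 * n))
train_set <- athletes[train_idx, ]
test_set  <- athletes[-train_idx, ]

# Fit on the training rows only, then score the held-out rows.
fit  <- lm(Weight ~ Height, data = train_set)
pred <- predict(fit, newdata = test_set)

# Root mean squared error on the testing data.
rmse <- sqrt(mean((test_set$Weight - pred)^2))
rmse
```

Sampling row indices, rather than taking the first 66% of rows, avoids a biased split when the file is sorted (for example by year or by sport).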
Olympic Proposal
data = read.csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vSDpJqmVSks0f4vLzzcmcTfPJ8TSu4ziCNpTFy_fIY6LibZksRXzCfJYXj9qZd4NiofejxoYSkmLMwu/pub?output=csv') Olympic Medals Data Exploration For this first assignment, I want you to get into R and load your data! I discovered this data set at https://www.kaggle.com/heesoo37/120-years-of-olympi...
Hockey NonParametric
This is a continuation of this project <https://rpubs.com/nurfnick/720194>. Ranks Most of these methods use ranks! Let’s go ahead and show the ranks! rank(data$PTS, ties.method = "average") ## [1] 28.5 28.5 28.5 28.5 25.0 25.0 25.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0 ## [16] 18.0 18.0 18.0 10.0 10.0 10.0 10.0 10.0 4.5 4.5 4.5 4.5 4.5 ...
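The repeated values in the output above (28.5, 18.0, 4.5) come from how `ties.method = "average"` handles ties. A small sketch on a made-up vector shows the rule; `pts` is a hypothetical example and not the hockey data.

```r
# Hypothetical point totals with ties (not the hockey data).
pts <- c(6, 6, 4, 2, 2, 2)

# Tied values share the mean of the positions they would occupy:
# the three 2s take positions 1, 2, 3 and each gets rank 2;
# the two 6s take positions 5, 6 and each gets rank 5.5.
r <- rank(pts, ties.method = "average")
r
```

This returns 5.5 5.5 4.0 2.0 2.0 2.0. Note that `rank()` assigns rank 1 to the smallest value, so the highest point totals receive the largest ranks.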
Olympics NonParametric
Olympics I want to finish the Olympic project with some analysis using the non-parametric methods. I am going to try to repeat many of the analyses I have already made (except for Matched Pairs). Mann-Whitney U test or Wilcoxon Rank Sum Test I will compare the median (NOT mean!) ages between the genders. My hypotheses are \[ H_0: m_M = m_W\\ H_a...
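The author's own test call is not visible in this excerpt. As a hedged sketch, a two-sample comparison of this kind can be run with base R's `wilcox.test()` and its formula interface. The data below are simulated: the data frame `ages`, its `Sex` and `Age` columns, the group sizes, and the seed are assumptions for illustration only.

```r
set.seed(2)  # arbitrary seed for reproducibility

# Simulated ages for two groups (hypothetical, not the Olympic data).
ages <- data.frame(
  Sex = rep(c("M", "F"), each = 40),
  Age = c(round(rnorm(40, mean = 26, sd = 5)),
          round(rnorm(40, mean = 24, sd = 5)))
)

# Sample medians per group, for context.
tapply(ages$Age, ages$Sex, median)

# Two-sided Wilcoxon rank sum (Mann-Whitney U) test.
# exact = FALSE requests the normal approximation, since whole-number
# ages produce ties and an exact p-value cannot be computed with ties.
mw <- wilcox.test(Age ~ Sex, data = ages, exact = FALSE)

mw$statistic  # W statistic
mw$p.value    # two-sided p-value
```

A one-sided alternative could be requested with the `alternative` argument ("less" or "greater") if the truncated alternative hypothesis turns out to be directional.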