Publications by Cameron Smith

Data 606 - Lab 1

03.09.2020

library(tidyverse) library(openintro) Exercise 1 arbuthnot$girls ## [1] 4683 4457 4102 4590 4839 4820 4928 4605 4457 4952 4784 5332 5200 4910 4617 ## [16] 3997 3919 3395 3536 3181 2746 2722 2840 2908 2959 3179 3349 3382 3289 3013 ## [31] 2781 3247 4107 4803 4881 5681 4858 4319 5322 5560 5829 5719 6061 6120 5822 ## [46] 5738 5717 5847 6203 6...

3581 sym R (2180 sym/14 pcs) 3 img

Data 607 - Homework 3

10.09.2020

library(stringr) Question 1 Using the 173 majors listed in fivethirtyeight.com’s College Majors dataset [https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/], provide code that identifies the majors that contain either “DATA” or “STATISTICS”. The code below loads data directly from fivethirtyeight’s Git...

3489 sym R (2567 sym/25 pcs)
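
The matching step in the Homework 3 question above can be sketched in a few lines. This is only an illustration, not the author's code: it uses base R's `grepl` rather than `stringr`, and the `majors` vector is a small hand-typed sample standing in for the real 173-row dataset.

```r
# Illustrative sample only; the real assignment loads the full list of majors.
majors <- c(
  "COMPUTER PROGRAMMING AND DATA PROCESSING",
  "STATISTICS AND DECISION SCIENCE",
  "GENERAL AGRICULTURE",
  "MECHANICAL ENGINEERING"
)

# Keep the majors whose name contains either "DATA" or "STATISTICS".
matches <- majors[grepl("DATA|STATISTICS", majors)]
print(matches)
```

The `|` in the pattern is regular-expression alternation, so a single pass over the vector covers both search terms.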

Data 607 - Homework 1

19.08.2020

Overview For this assignment I used a data set from fivethirtyeight focused on Americans' concern about COVID-19 infections as well as its impact on the economy. Although only one dataset was used for this assignment, several were included on the website. The article and data can be found at this link. Code and Comments Load Libraries If...

2279 sym R (3298 sym/19 pcs) 2 img

Data 606 - Project 2

03.10.2020

Introduction and Approach For this project we were asked to choose three “wide” datasets, create .CSV files (or, alternatively, databases) for them, then tidy and analyze the data. I partnered with two others from Data 607 for this project - Karim Hammoud and Jack Write. Our methodology was to each take one dataset to work on and then share the code ...

9228 sym R (32096 sym/61 pcs) 7 img

Data 607 - Homework 5

25.09.2020

library(tidyverse) Introduction This assignment is focused on tidying data via the various tools within the Tidyverse package. The key verbs of the tidying process are: mutate, select, filter, summarise and arrange. For practice I have used each of them below to tidy a set of data focused on flight statistics. Load data from Github df_untidy <-...

2262 sym R (2949 sym/10 pcs) 3 img

Data 607 - Project 1

19.09.2020

Introduction This project was focused on the analysis of chess tournament data. Using cross-tables from a text file we needed to extract the key data using regex, transform it into a usable form, and then analyze specific elements including in particular the pre-ratings of each player and their opponents. Load library and read File The Tidyverse...

2140 sym R (4711 sym/14 pcs)
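
As a rough sketch of the regex extraction described in the Project 1 excerpt above, the snippet below pulls a pre-rating out of one line of text. The sample `line` is hypothetical, written in the general style of a chess cross-table rather than copied from the project's file, and the pattern is an assumption about that layout.

```r
# Hypothetical cross-table line; the real file's layout may differ.
line <- "   ON | 15445895 / R: 1794   ->1817     |N:2  |W    |B    |"

# Capture the digits that follow "R:"; the first element of the result is the
# full match, the second is the captured group.
m <- regmatches(line, regexec("R:[[:space:]]*([0-9]+)", line))[[1]]
pre_rating <- as.integer(m[2])
print(pre_rating)
```

Anchoring on the literal `R:` keeps the pattern from picking up the post-rating that follows the `->` arrow.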

Data 607 - Week 7 Assignment

10.10.2020

library(tidyverse) library(RCurl) library(XML) library(rjson) Introduction and Approach For this assignment I prepared 3 separate files with the same data in order to practice loading the formats into R, converting them to data frames, and comparing the differences. The 3 formats are: HTML, JSON and XML. Each file contained data on books, including...

1895 sym R (1658 sym/15 pcs)

Data 606 - Lab 7

18.10.2020

library(tidyverse) library(openintro) library(infer) Exercise 1 What are the cases in this data set? How many cases are there in our sample? This sample includes 13,583 observations, with 13 different variables. data(yrbss) glimpse(yrbss) ## Rows: 13,583 ## Columns: 13 ## $ age <int> 14, 14, 15, 15, 15, 15, 15, 14, 15, ...

4812 sym R (9505 sym/50 pcs) 4 img

Data 606 Presentation - Exercise 7.29

27.10.2020

Question Text 7.29: Chicken diet and weight, Part II Page 276 Casein is a common weight gain supplement for humans. Does it have an effect on chickens? Using data provided in Exercise 7.27, test the hypothesis that the average weight of chickens that were fed casein is different than the average weight of chickens that were fed soybean. If your h...

1410 sym R (487 sym/2 pcs)

Data 607 - Week 9 Homework

21.10.2020

library(tidyverse) library(knitr) library(httr) library(jsonlite) Introduction / description For this assignment our task was to choose one of the New York Times APIs, construct an interface in R to read the JSON data, and transform it into an R DataFrame. Overview of approach I chose the “Most Popular” API which makes available information...

2862 sym R (912 sym/9 pcs)