Publications by Wilson Chau

Data 607 Assignment 4

05.10.2022

Assignment 4 Task: Create a .CSV file (or optionally, a MySQL database!) that includes all of the information above. You’re encouraged to use a “wide” structure similar to how the information appears above, so that you can practice tidying and transformations as described below. Read the information from your .CSV file into R, and use tidy...

2481 sym R (5435 sym/25 pcs)

Data_607_Project_2_Part_1

09.10.2022

Introduction This project is meant to give us some practice using different datasets for analysis work. I chose three different datasets; this is the first one, which I made myself. # Coffee Price by Wilson Chau I took this dataset from my own findings. It focuses on coffee prices, which covers my favorite topics: coffee and money. ...

1851 sym R (3785 sym/14 pcs)

Data_3_Data_607_Project_2

09.10.2022

#Introduction For my last dataset, I wanted to find one that has a lot of columns and rows. The one I chose is Melvin Matanos's dataset on wine. It has a lot of columns, and my focus is on relating wine type to quality using the columns quality, quality_label, residual.sugar, alcohol, and wine_type. There can be some analysis done with the...

1325 sym R (2696 sym/11 pcs) 3 img

Data_2_607_Project_2

09.10.2022

#Introduction The second dataset I chose for Project 2 is a very simple one found by Wilson Ng. I wanted to see how much data manipulation I could do in order to get some analytical work out of this dataset. I also want to focus more on transforming the data and maybe adding more value. Maybe changing the type of data collected as we...

1234 sym R (1195 sym/9 pcs) 1 img

Data 606 Assignment 5 Part 1 (Foundations for statistical inference - Sampling distributions)

12.10.2022

library(tidyverse)
library(openintro)
library(infer)

global_monitor <- tibble(
  scientist_work = c(rep("Benefits", 80000), rep("Doesn't benefit", 20000))
)

ggplot(global_monitor, aes(x = scientist_work)) +
  geom_bar() +
  labs(
    x = "", y = "",
    title = "Do you believe that the work scientists do benefit people like you?"
  ) +
  coord_fl...

6025 sym R (8488 sym/33 pcs) 3 img

Data606_Assignment5b_Foundations for statistical inference - Confidence intervals

12.10.2022

library(tidyverse) library(openintro) library(infer) The data A 2019 Pew Research report states the following: To keep our computation simple, we will assume a total population size of 100,000 (even though that’s smaller than the population size of all US adults). Roughly six-in-ten U.S. adults (62%) say climate change is currently affecting ...

7390 sym R (2338 sym/16 pcs) 1 img 1 tbl

Data 607 Assignment 5

17.10.2022

Assignment Instruction: Pick three of your favorite books on one of your favorite subjects. At least one of the books should have more than one author. For each book, include the title, authors, and two or three other attributes that you find interesting. Take the information that you’ve selected about these three books, and separately create t...

1493 sym R (168 sym/2 pcs)

Data 606 lab 6

17.10.2022

title: "Inference for categorical data"
author: ""
output:
  pdf_document: default
  html_document:
    includes:
      in_header: header.html
    css: ./lab.css
    highlight: pygments
    theme: cerulean
    toc: true
    toc_float: true
editor_options:
  chunk_output_type: console
---

library(tidyverse)
library(openintro)
library(infer)
set.seed(74226)

Getting Started ...

7009 sym R (2763 sym/15 pcs) 1 img

Data 606 Lab 7

24.10.2022

library(tidyverse) library(openintro) library(infer) The data Every two years, the Centers for Disease Control and Prevention conduct the Youth Risk Behavior Surveillance System (YRBSS) survey, where it takes data from high schoolers (9th through 12th grade), to analyze health patterns. You will work with a selected group of variables from a ran...

3682 sym R (2608 sym/19 pcs) 1 img

Data 607 Web API

28.10.2022

Assignment: The New York Times web site provides a rich set of APIs, as described here: https://developer.nytimes.com/apis You’ll need to start by signing up for an API key. Your task is to choose one of the New York Times APIs, construct an interface in R to read in the JSON data, and transform it into an R DataFrame. Step 1(Library): Trying t...

1115 sym R (3430 sym/11 pcs)