Publications by Wilson Chau

Data 607 Tidyverse Create Example

30.10.2022

I took this dataset from the Tidyverse packages, which have many different uses. These are the few I looked up: ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, forcats. I wanted to focus on dplyr because of the package’s ability to transform and summarize tabular data with rows. I also wanted to focus more on having descriptive data with some numerica...

1278 sym Python (139578 sym/10 pcs)
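
As a rough illustration of the transform-and-summarize workflow this entry describes, here is a minimal dplyr sketch. It assumes the built-in `mtcars` data frame as a stand-in (the entry's own dataset is not shown in the excerpt), and the column name `kpl` is hypothetical.

```r
library(dplyr)

# mtcars ships with base R; it is only a stand-in for the entry's dataset.
summary_by_cyl <- mtcars %>%
  # Transform: convert miles per (US) gallon to kilometres per litre.
  mutate(kpl = mpg * 0.425144) %>%
  # Summarize: one row per cylinder count.
  group_by(cyl) %>%
  summarise(n = n(), mean_kpl = mean(kpl), .groups = "drop")

print(summary_by_cyl)
```

`mutate()` adds a derived column row by row, while `group_by()` plus `summarise()` collapses each group to a single descriptive row.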

Research Proposal Data 606

31.10.2022

Data Preparation I downloaded the latest dataset from 2021. I uploaded it to GitHub and read the raw data from there. # load data # examine all available CES microdata files. library(tidyverse) happiness <- read.csv("https://raw.githubusercontent.com/Wilchau/Data606_Happiness_Project/main/world-happiness-report-2021.csv.xls") Research que...

2281 sym R (8281 sym/13 pcs) 3 img
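
The loading step in this entry can be sketched offline as follows. This is only an assumption-laden illustration: the temp file and its two rows are hypothetical stand-ins for the remote file, and the column names `country` and `score` are invented for the example.

```r
# Hypothetical stand-in for the remote file: a tiny CSV written to a temp path
# that mimics the ".csv.xls" double extension seen in the entry's URL.
path <- tempfile(fileext = ".csv.xls")
writeLines(c("country,score", "A,7.1", "B,5.4"), path)

# read.csv() parses by content, not by file extension, so the odd suffix is
# harmless. The same call also accepts a URL string in place of `path`.
happiness <- read.csv(path, stringsAsFactors = FALSE)

str(happiness)
```

Checking the result with `str()` right after loading is a quick way to confirm that numeric columns were not read in as text.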

Data 607 Discussion 11

03.11.2022

#Assignment It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, then predict the class of new docum...

3359 sym R (3499 sym/8 pcs)
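
To make the train-then-classify idea in this assignment concrete, here is a minimal base R sketch of a multinomial naive Bayes classifier with add-one smoothing. It is one possible approach, not necessarily the one used in the publication; the four training messages and the helper names `tokenize`, `count_words`, and `classify` are hypothetical.

```r
# Toy labeled corpus (hypothetical; not the assignment's spam/ham dataset).
train <- data.frame(
  text  = c("win cash prize now", "free prize claim now",
            "meeting agenda for monday", "lunch on monday with team"),
  label = c("spam", "spam", "ham", "ham"),
  stringsAsFactors = FALSE
)
classes <- c("spam", "ham")

# Lowercase, split on non-letters, and drop empty tokens.
tokenize <- function(x) {
  tok <- strsplit(tolower(x), "[^a-z]+")[[1]]
  tok[nzchar(tok)]
}

vocab <- unique(unlist(lapply(train$text, tokenize)))

# Word counts per class, aligned to the shared vocabulary.
count_words <- function(cls) {
  words <- unlist(lapply(train$text[train$label == cls], tokenize))
  as.vector(table(factor(words, levels = vocab)))
}
counts <- vapply(classes, count_words, integer(length(vocab)))
rownames(counts) <- vocab

# Add-one (Laplace) smoothed log-likelihoods and log-priors.
log_lik <- log(sweep(counts + 1, 2, colSums(counts) + length(vocab), "/"))
log_prior <- log(vapply(classes, function(cls) mean(train$label == cls),
                        numeric(1)))

# Score a new document under each class and return the best label.
classify <- function(doc) {
  words <- tokenize(doc)
  words <- words[words %in% vocab]  # ignore words never seen in training
  scores <- log_prior + colSums(log_lik[words, , drop = FALSE])
  names(which.max(scores))
}

classify("claim your free cash prize")
classify("team meeting on monday")
```

On a real corpus the same structure applies, but the documents would first be split into separate training and test sets so that accuracy is measured on messages the model has not seen.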

Data 607 Project 4 Training Documents

20.11.2022

#Assignment It can be useful to be able to classify new “test” documents using already classified “training” documents. A common example is using a corpus of labeled spam and ham (non-spam) e-mails to predict whether or not a new document is spam. For this project, you can start with a spam/ham dataset, then predict the class of new docum...

1127 sym R (1548 sym/10 pcs)

Data 606 Assignment 8

22.11.2022

Getting Started Load packages In this lab, you will explore and visualize the data using the tidyverse suite of packages. The data can be found in the companion package for OpenIntro resources, openintro. Let’s load the packages. library(tidyverse) library(openintro) data('hfi', package='openintro') The data The data we’re working with is i...

6449 sym R (3827 sym/22 pcs) 6 img

Data 606 Lab 9

30.11.2022

Getting Started Completing Lab assignment 9 ### Load packages In this lab, you will explore and visualize the data using the tidyverse suite of packages. The data can be found in the companion package for OpenIntro resources, openintro. Let’s load the packages. library(tidyverse) library(openintro) This is the first time we’re using the GGall...

10889 sym R (8760 sym/29 pcs) 13 img

Data 607 Final Project

05.12.2022

Introduction: I am interested in dealing with data that is relevant to my life and how I am feeling. Lately I noticed that a lot of my colleagues are more concerned about their own happiness and well-being. This led me to want to focus on contributions to happiness. I found a dataset on Kaggle about countries and their happiness being measured. Com...

4722 sym R (13374 sym/35 pcs) 6 img

Data 607 Assignment 2 SQL/R

05.12.2022

#Introduction I created a Google survey form and collected a few responses. I transferred the CSV file to GitHub, cleaned up the data, and started on this assignment. library(RMySQL) ## Loading required package: DBI library("dplyr") ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## fil...

661 sym R (2041 sym/17 pcs) 2 img

Data 606 Final Project: Happiness in 2021

06.12.2022

Part 1 - Introduction Happiness is an emotional state characterized by feelings of joy, fulfillment, satisfaction and bliss. Over the past few years, from pre-pandemic to post-pandemic times, many individuals have faced some sort of adversity. The Covid lockdown has given many individuals opportunities to self-explore and do a deep-dive analysis on ...

6630 sym R (14079 sym/26 pcs) 5 img

Data 606 Lab 7

06.12.2022

library(tidyverse) library(openintro) library(infer) (require(stats)) ## [1] TRUE library(stats) The data Every two years, the Centers for Disease Control and Prevention conduct the Youth Risk Behavior Surveillance System (YRBSS) survey, where it takes data from high schoolers (9th through 12th grade), to analyze health patterns. You will work w...

4829 sym R (8361 sym/54 pcs) 1 img