Publications by Jie Zou

622: hw1

10.10.2022

Contents: Data Summary, Visualization, Pre-processing, Data Splitting, Modeling, Performance. 622: hw1, Jie Zou, 2022-10-10. Data Summary, summary in general: The dimension of the small data set is (1000, 14). The dimension of the large data set is (100000, 14). Among these features, 7 are categorical and the rest are numerical (integer + ...

4427 sym 29 img

622: hw2

24.10.2022

Contents: About Data, Visualization, Modeling, Model Comparison, Conclusion. 2022-10-24. About Data, Data Description: The data was obtained from Kaggle. Each feature is described as follows. 1. customer_id: account number. 2. credit_score: credit score. 3. country: country of residence. 4. gender: gender. 5. age: age. 6. tenure: from h...

3670 sym Python (3713 sym/7 pcs) 8 img

622: hw3

14.11.2022

Contents: Objectives, HW Modification, Data Exploration, Feature Selection, Preprocessing, Model Training, Model Performance, Discussion. HW3: SVM, 2022-11-13. Objectives: Perform an analysis of the dataset used in Homework #2 using the SVM algorithm. Compare the results with those from the previous homework. Answer questions, such as: ...

5070 sym 6 img

622: final

15.12.2022

Contents: Objectives, Read Data, Basic Cleaning, Visualization, Curse of Dimension, Data Preparation, Modeling, Performance, Conclusion, Concern. 622: hw4_final, Jie Zou, 2022-12-14. Objectives: In the final project, I am going to attempt email classification. The dataset comes from Kaggle. The technologies for classification that I w...

6084 sym 4 img 2 tbl

Assignment_1

05.02.2021

Intro: This is a study of alcohol consumption around the world, in which alcohol is divided into three main groups: wine, beer, and spirits. The unit of measurement is the number of servings consumed per person, using a standard serving size: glasses for wine, cans for beer, and shots for spirits. Article from https://fivethirtyeight.com/features/dear-m...

1318 sym R (1071 sym/5 pcs)

Assignment3

18.02.2021

library(tidyverse) Identify the majors that contain either “DATA” or “STATISTICS”, from https://fivethirtyeight.com/features/the-economic-guide-to-picking-a-college-major/ major<-read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/college-majors/majors-list.csv") major<-data.frame(major) data_or_stat<- str_subset(major...

1564 sym R (817 sym/5 pcs)

607_p1: Chess tournament

28.02.2021

Main: Data Processing. Read the txt file; warn = FALSE: don’t show the warnings while reading the file. file<-readLines("tournamentinfo.txt", warn = FALSE) head(file) ## [1] "-----------------------------------------------------------------------------------------" ## [2] " Pair | Player Name |Total|Round|Round|Round|Round|Rou...

2309 sym R (4909 sym/31 pcs)

a_9: Web APIs

05.04.2021

a_9: Web APIs, Jie Zou, 2021-04-05. Load packages. httr: interact with web APIs; jsonlite: work with .json files. library(httr) library(tidyverse) library(jsonlite) library(DT) HTTP request: check whether the web API responds successfully. url <- "https://api.nytimes.com/svc/movies/v2/critics/all.json?api-key=somr1GPKGZfJ1wfSsgkLuqNj8YX8GYuR" http_request <- GET(url) h...

541 sym R (1097 sym/6 pcs)

p2: tidy & transform data

14.03.2021

p2: Tidy and Transform Three Data Sets, Jie Zou, 2021-03-14. Case I: covid_data.csv. Intro: The status of different cases of COVID-19. Read the file: covid <- read.csv("https://raw.githubusercontent.com/Sugarcane-svg/R/main/R607/Projects/p2/p2_covid_data.csv") head(covid, 8) ## X sl_no country total_cases new_cases total_deaths new_deaths ## 1...

6244 sym R (25453 sym/60 pcs) 6 img

a4: tidying and transforming data

13.03.2021

Create a .CSV file: The data in the provided image is small, so I manually entered it into Excel in exactly the same format shown in the image, saved it as a .CSV file, and uploaded it to git first. Read the .CSV file: df<-read.csv("https://raw.githubusercontent.com/Sugarcane-svg/R/main/R607/Assignments/a4/a_4.csv") kable(df) X X...

2213 sym R (4549 sym/24 pcs) 3 img 1 tbl