Publications by Dennis Pong
Data 608 Module 1
Principles of Data Visualization and Introduction to ggplot2. I have provided you with data about the 5,000 fastest growing companies in the US, as compiled by Inc. magazine. Let's read this in: # install.packages("tidyr") # loading libraries library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats':...
2134 sym R (6289 sym/21 pcs) 3 img
Data 608 Final Project Proposal
Data 608 - Final Project Proposal (2021-05-15). Sections: Background, Objective, Data Source, Data Analysis, Visualization, References. Background: The pandemic has driven corporate America, especially in the SF Bay Area and NY, to allow for remote work, which has, in turn, driven the housing demand for s...
3586 sym 3 img
Data 622 Homework 4: Mental Health Data Modeling
Data 622 Homework 4: Mental Health Data Modeling. Sections: Loading data, Data Processing Steps, Loading from adhd_data_3.rds, Clustering Method, Principal Component Analysis, Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Modeling Comparison, Conclusion, Gr...
8817 sym R (65902 sym/139 pcs) 12 img 9 tbl
Data 622 HW 3
Homework3. Sections: 1. Exploratory Data Analysis; Data Prep for Model-fitting; Additional Data Processing / Manipulation Steps; Imbalanced Dataset; 2. Linear Discriminant Analysis; 3. K-nearest Neighbor; 4. Decision Trees; 5. Random Forests; 6. Model Performance. Devin Teran, Dennis Pong, Richard Zheng, Katie Evers ...
9653 sym R (32866 sym/85 pcs) 18 img 21 tbl
Data 622 HW 3 - KNN
DATA 622 HW3. Sections: K-Nearest Neighbors, Processing, Additional Data (Processing) Manipulation Steps, Modeling with KNN. Dennis Pong, 2021-10-19. library(knitr) library(rmdformats) library(tidyverse) ## ── Attaching packages ──...
2960 sym R (22512 sym/66 pcs) 2 img
Data 622 HW 4
Data 622 Homework 4: Mental Health Data Modeling. Sections: Loading data, Data Processing Steps, Clustering Method, Principal Component Analysis, Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM). Group 4: Dennis Pong, Katie Evers, Richard Zheng, Devin Teran ...
3703 sym R (52320 sym/122 pcs) 9 img 8 tbl
Discussion Week 2
A story by James R. Hagerty entitled "With Buyers Sidelined, Home Prices Slide," published in the Thursday, October 25, 2007 edition of the Wall Street Journal, contained data on so-called fundamental housing indicators in major real estate markets across the US. The author argues that…prices are generally falling and overdue loan payments are pilli...
1570 sym R (1101 sym/7 pcs)
Ex 4.5
For the fat data used in this chapter, a smaller model using only age, weight, height and abdom was proposed on the grounds that these predictors are either known by the individual or easily measured. Compare this model to the full thirteen-predictor model used earlier in the chapter. Is it justifiable to use the smaller model? data(fat, packag...
4508 sym R (4125 sym/15 pcs)
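The Ex 4.5 entry above asks whether a smaller model is justifiable relative to a full one. One common way to compare nested linear models is a partial F-test via `anova()`. The sketch below is not the author's solution: the helper name `compare_nested` is hypothetical, and the built-in `mtcars` data stands in for the fat data so the example runs without extra packages.

```r
# Hypothetical helper: fit two nested linear models and compare them
# with a partial F-test. The small model's terms must be a subset of
# the full model's terms for the test to be meaningful.
compare_nested <- function(small_formula, full_formula, data) {
  small <- lm(small_formula, data = data)
  full  <- lm(full_formula, data = data)
  anova(small, full)
}

# Stand-in data (assumption): mtcars replaces the fat data here.
res <- compare_nested(
  mpg ~ wt + hp,
  mpg ~ wt + hp + disp + drat + qsec,
  mtcars
)
print(res)
```

A large p-value in the second row would suggest the extra predictors add little beyond the smaller model; for the exercise itself, the two formulas would be replaced by the four-predictor and thirteen-predictor models on the fat data.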
Data 621 Discussion #9
Question (from Extending the Linear Model with R [ELMR], p.73) The dvisits data comes from the Australian Health Survey of 1977–78 and consist of 5190 single adults where young and old have been oversampled. (a) Build a Poisson regression model with doctorco as the response and sex, age, agesq, income, levyplus, freepoor, freerepa, illness, ac...
3209 sym R (8245 sym/20 pcs) 2 img
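For the Discussion #9 entry above, a Poisson regression in R is typically fit with `glm(..., family = poisson)`. The following is a minimal sketch under stated assumptions: the column names `doctorco`, `sex`, `age`, and `illness` come from the excerpt, but the values are simulated toy data with arbitrary coefficients, not the dvisits survey, and only a subset of the listed predictors is used.

```r
# Simulated toy data (assumption): NOT the real dvisits data.
set.seed(1)
n <- 200
toy <- data.frame(
  sex     = rbinom(n, 1, 0.5),
  age     = runif(n, 0.19, 0.72),
  illness = rpois(n, 1.4)
)
# Response generated from an arbitrary log-linear mean.
toy$doctorco <- rpois(n, exp(-1.5 + 0.2 * toy$sex + 0.3 * toy$illness))

# Poisson regression with the default log link.
fit <- glm(doctorco ~ sex + age + illness, family = poisson, data = toy)
summary(fit)
```

With the real data, the formula would list the full set of predictors named in the question.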
Data 621 HW 2
library("dplyr") library("ggplot2") library("knitr") Steps 1. Download the Data df <- read.csv("https://raw.githubusercontent.com/ezaccountz/DATA_621/main/HW2/classification-output-data.csv") 2. Confusion matrix The data set has three key columns we will use: class: the actual class for the observation scored.class: the predicted class for th...
4038 sym R (5675 sym/35 pcs) 2 img 2 tbl
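The HW 2 entry above describes building a confusion matrix from the `class` and `scored.class` columns. A minimal base-R sketch of that step follows; the toy data frame is hypothetical and only reuses the column names from the excerpt, and the cell names `tp`, `tn`, `fp`, `fn` are illustrative.

```r
# Toy data (hypothetical values) using the column names from the excerpt.
df <- data.frame(
  class        = c(1, 0, 1, 1, 0, 0, 1, 0),
  scored.class = c(1, 0, 0, 1, 0, 1, 1, 0)
)

# Rows are predicted classes, columns are actual classes.
# factor() with fixed levels keeps the table 2x2 even if a class is absent.
conf_mat <- table(
  predicted = factor(df$scored.class, levels = c(0, 1)),
  actual    = factor(df$class, levels = c(0, 1))
)

tp <- conf_mat["1", "1"]  # predicted 1, actual 1
tn <- conf_mat["0", "0"]  # predicted 0, actual 0
fp <- conf_mat["1", "0"]  # predicted 1, actual 0
fn <- conf_mat["0", "1"]  # predicted 0, actual 1

accuracy <- (tp + tn) / sum(conf_mat)
print(conf_mat)
```

Metrics such as precision and sensitivity can be derived from the same four cells.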