Publications by Alexander Ng
Data 621 Blog 2 Jittering in Plots
Introduction In this blog, we illustrate the role of jittering in graphical analysis of datasets through an example in financial markets. We argue that jittering can provide useful information about the distribution of the data and the relation between variables. The goal is not to focus on the command syntax in the ggplot2 library in which this ...
4330 sym R (968 sym/4 pcs) 2 img 2 tbl
DATA 621 Blog 5 Classification Tree Vs. Human Classification
Introduction In this blog, we examine a classification tree and visualize its decision-making hierarchy versus a manually created hierarchy produced for the crime data set used in HW Assignment 3. We then compare the classification tree generated using the algorithm under the standard approach with my manually created hierarchy. Background The ...
3468 sym R (4307 sym/21 pcs) 1 img 5 tbl
Decline in Manufacturing Employment and Marriage Rates
1 Introduction In this paper, we weave together two strands of economic research, international trade and household economics, to study the impact of import competition from China on US labor markets and the concurrent impact on US marriage rates. China has emerged as the second largest economy in the world in the past three decades, driven by its ...
35699 sym R (27768 sym/1 pcs) 4 img 3 tbl
Data 608 Assignment 4 Scratch Work in NYC Trees
library(ggplot2) library(ggrepel) library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union library(plotly) ## ## Attaching package: 'plotly' ## The following object is ma...
48647 sym R (19226 sym/56 pcs) 3 img
StreamGraph Example To Complement d3.js version
StreamGraph Alexander Ng 11/14/2020 An R Version of the D3.js Streamgraph Example Using the R package streamgraph, which requires htmlwidgets, you can build the same streamgraph in about 3 lines of R code. The package has some limitations, however. For example, no chart title seems to be possible on the interactive version - at least as ...
1545 sym R (668 sym/4 pcs)
Data 622 HW1 Spring 2021 - Palmer Penguins
Introduction This assignment analyzes the Palmer Penguins dataset using binary logistic and multinomial regression. The dataset contains the physical measurements and gender of individuals over a 3-year period (2007-2009) from three closely related species: Gentoo, Chinstrap and Adelie in the Palmer Archipelago, Antarctica. In the first section, Pr...
12369 sym R (11762 sym/16 pcs) 12 img 6 tbl
Data 622 HW3 Group 6: KNN, Decision Tree, Random Forest, Gradient Boost
Introduction This document discusses analyses of two datasets, the Palmer Penguin dataset and a Loan Approvals dataset prepared by Group 6. We divide the document into five parts and adopt two key principles in undertaking this analysis. First, the group has developed a system of checks and balances in preparing each model’s output. A primary and...
26014 sym R (23624 sym/9 pcs) 14 img 13 tbl
Data 622 HW 3 Decision Tree Component
1 Decision Tree Model This section uses the tidymodels framework to implement the CART decision tree model. It is one part of the submission for HW3 Group 6 but is rendered separately to avoid variable collision in code. Now we create our model recipe. This recipe object describes the dependent and independent variables we wish to use and the dat...
2067 sym R (4969 sym/7 pcs) 1 img
Data 622 HW3 Draft: RF Portion final only
Introduction This document discusses analyses of two datasets, the Palmer Penguin dataset and a Loan Approvals dataset prepared by Group 6. We divide the document into five parts and adopt two key principles in undertaking this analysis. First, the group has developed a system of checks and balances in preparing each model’s output. A primary and...
17249 sym R (17159 sym/4 pcs) 8 img 13 tbl
Data 622 Homework 3: Template for Group
Introduction This document discusses analyses of two datasets, the Palmer Penguin dataset and a Loan Approvals dataset prepared by Group 6. We divide the document into five parts and adopt two key principles in undertaking this analysis. First, the group has developed a system of checks and balances in preparing each model’s output. A primary and...
13584 sym R (11221 sym/3 pcs) 3 img 10 tbl