Publications by Phuong Linh

Ranking Method

07.03.2022

1 Ranking based on row value mydf <- read.table(text="date - site1 - site2 - site3 - site4 1/1/00 - 24 - 33 - 10 - 13 2/1/00 - 13 - 25 - 6 - 2", sep="-", header=TRUE) # find ranks t(apply(-mydf[-1], 1, rank)) ## site1 site2 site3 site4 ## [1,] 2 1 4 3 ## [2,] 2 1 3 4 # add to your dates mydf.rank <- cbind(myd...

347 sym R (4400 sym/18 pcs)
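
The row-wise ranking shown in the preview above can be reproduced with a small sketch. It rebuilds the two sample rows by hand instead of parsing text, and the `_rank` column suffix and the name `mydf_rank` are assumptions made here, since the preview is cut off before the final `cbind` call.

```r
# Toy data copied from the preview: one row per date, one column per site.
mydf <- data.frame(
  date  = c("1/1/00", "2/1/00"),
  site1 = c(24, 13),
  site2 = c(33, 25),
  site3 = c(10, 6),
  site4 = c(13, 2)
)

# Negate the values so the largest value in each row receives rank 1.
# apply() over rows returns one column per input row, so transpose it back.
ranks <- t(apply(-mydf[-1], 1, rank))

# Attach the ranks to the dates (the "_rank" suffix is an assumption of this sketch).
colnames(ranks) <- paste0(colnames(ranks), "_rank")
mydf_rank <- cbind(mydf["date"], ranks)
print(mydf_rank)
```

Negating before ranking is what makes higher counts rank first; dropping the minus sign would rank the smallest site as 1.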

data wrangling_app usage behavior (2)

16.03.2022

1 Library Loading library(tidyverse) library("readxl") library("writexl") library(tidyr) library(dplyr) library(lubridate) 2 Data Wrangling 2.1 App usage analysis 2.1.1 Total usage time data38id <- data38id %>% arrange(desc(data38id$TIME)) %>% mutate(id = rownames(data38id)) usagetime <- Aug30_usagetime %>% group_by(UserID,ActiveDate) %>% sum...

282 sym R (3121 sym/11 pcs) 4 tbl

Customer segmentation

17.03.2022

1 Library Loading library(ggplot2) library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union library(pastecs) ## ## Attaching package: 'pastecs' ## The following objects a...

580 sym R (8685 sym/52 pcs) 11 img

data wrangling_churn user

17.03.2022

1 Library Loading library(tidyverse) ## -- Attaching packages ------------------------------------------------------------------- tidyverse 1.3.0 -- ## v ggplot2 3.2.1 v purrr 0.3.3 ## v tibble 2.1.3 v dplyr 0.8.4 ## v tidyr 1.0.2 v stringr 1.4.0 ## v readr 1.3.1 v forcats 0.4.0 ## -- Conflicts ---------------------------...

186 sym R (321814 sym/21 pcs) 4 img

Transport Demand Modeling - EDA

19.03.2022

1 Library Loading library(dplyr) library(tidyverse) library(readxl) library(skimr) # Library used for providing a summary of the data library(DataExplorer) # Library used in data science to perform exploratory data analysis library(corrplot) 2 Data loading dataset <- read_excel("TDM_Class3_MLR_Chicago_Example.xls") 3 Explo...

898 sym R (4470 sym/43 pcs) 6 img 2 tbl

Transport Demand Modeling - Multiple Linear Regression Model

20.03.2022

1 Import library library(readxl) #Library used to import excel files library(tidyverse) # Pack of most used libraries library(skimr) # Library used for providing a summary of the data library(DataExplorer) library(corrplot) # Library used for correlation plots library(car) # Library used for testing autocorrelation (Durbin Watson) library(olsrr)...

579 sym R (5843 sym/24 pcs) 3 img 2 tbl

Transport Demand Modelling - Cluster Analysis

02.04.2022

library(readxl) # Reading excel files library(skimr) # Summary statistics library(tidyverse) # Pack of useful tools library(mclust) # Model based clustering library(cluster) # Cluster analysis library(factoextra) # Visualizing distances 1 Import Dataset as a dataframe dataset <- read_excel("Data_Aeroports_Clustersv1.xlsx") df <- data.frame(datas...

1226 sym R (11891 sym/37 pcs) 16 img 3 tbl
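
The cluster-analysis preview above only shows the package list, so the following is a hedged sketch of one common workflow and not necessarily the one the publication follows. It assumes k-means with three clusters and a silhouette check from the `cluster` package, and it uses the built-in `USArrests` data as a stand-in because the airport spreadsheet is not available here.

```r
library(cluster)  # silhouette()

# Stand-in data (assumption): USArrests ships with R; the publication uses its own airport file.
x <- scale(USArrests)  # standardise so no single variable dominates the distances

# k-means with 3 clusters; the choice of 3 is illustrative only.
set.seed(42)
km <- kmeans(x, centers = 3, nstart = 25)

# Average silhouette width: values near 1 indicate well-separated clusters.
sil <- silhouette(km$cluster, dist(x))
avg_width <- mean(sil[, "sil_width"])

print(table(km$cluster))
print(avg_width)
```

Repeating the fit for several values of `centers` and comparing the average silhouette width is one simple way to choose the number of clusters.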

Transport Demand Modelling - Generalized Linear Models

02.04.2022

library(readxl) # reading excel files library(skimr) # summary statistics library(tidyverse) # Pack of useful tools library(DataExplorer) # Exploratory data analysis library(MASS) # Negative binomial regression library(vcd) # Goodness of fit parameters library(car) # Goodness of fit library(rcompanion) # Goodness of fit library(popbio) # calculate...

658 sym R (12070 sym/42 pcs) 2 img 3 tbl
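
The preview above loads `MASS` for negative binomial regression. As a minimal sketch of why that model is often chosen over a Poisson GLM for count data, the example below fits both and compares them. The `quine` data set (shipped with `MASS`) and the formula are stand-ins chosen here, not the data or model of the publication.

```r
library(MASS)  # glm.nb() and the quine example data

# Stand-in data (assumption): days absent from school, a classic overdispersed count.
pois_fit <- glm(Days ~ Sex + Age, family = poisson, data = quine)

# Pearson dispersion ratio: values well above 1 suggest overdispersion,
# which violates the Poisson assumption that the variance equals the mean.
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)

# Negative binomial model with the same predictors; it estimates an extra
# parameter (theta) to absorb the additional variance.
nb_fit <- glm.nb(Days ~ Sex + Age, data = quine)

aic_table <- c(poisson = AIC(pois_fit), negbin = AIC(nb_fit))
print(dispersion)
print(aic_table)
```

A lower AIC for the negative binomial fit, together with a dispersion ratio far above 1, is the usual signal that the Poisson model is too restrictive.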

Transport Demand Modeling - Hazard-Based Duration Models

03.04.2022

1 Objectives Plot the Kaplan-Meier estimate of the duration of time that commuters delay their work-to-home trips; Determine the significant factors that affect the duration of commuters’ delay using a Cox model; Examine the work-to-home departure delay using exponential, Weibull, and log-logistic proportional-hazards models. Two approaches...

2732 sym R (11201 sym/31 pcs) 7 img 3 tbl
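
The three objectives listed above can be sketched with the `survival` package. This is an illustration under stated assumptions: the built-in `lung` data stands in for the commuter delay data, the covariates `age` and `sex` are placeholders, and `survreg()` fits the parametric models in accelerated failure time form, which may differ from the parameterisation used in the publication.

```r
library(survival)

# Stand-in data (assumption): lung ships with the survival package.
dat <- lung[, c("time", "status", "age", "sex")]
dat$event <- as.integer(dat$status == 2)  # 1 = event observed, 0 = censored

# 1) Kaplan-Meier estimate of the duration distribution.
km_fit <- survfit(Surv(time, event) ~ 1, data = dat)
print(km_fit)  # in an interactive session, plot(km_fit) draws the curve

# 2) Cox model: which covariates shift the hazard?
cox_fit <- coxph(Surv(time, event) ~ age + sex, data = dat)
print(summary(cox_fit)$coefficients)

# 3) Fully parametric duration models, compared by AIC.
dists <- c("exponential", "weibull", "loglogistic")
par_fits <- lapply(dists, function(d) {
  survreg(Surv(time, event) ~ age + sex, data = dat, dist = d)
})
names(par_fits) <- dists
aic_values <- sapply(par_fits, AIC)
print(aic_values)
```

The Cox model leaves the baseline hazard unspecified, while the parametric fits commit to a distribution; comparing their AIC values is one simple way to pick among the parametric forms.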

Survival Analysis: Kaplan-Meier Model vs Cox Regression

03.04.2022

1 Import Data library(magrittr) library(tidyverse) rm(list = ls()) ChurnStudy <- read.csv("churn_data.csv") ChurnStudy %>% str() ## 'data.frame': 2000 obs. of 10 variables: ## $ Churn : int 1 0 0 1 1 1 1 0 0 0 ... ## $ Xeducation : chr "Master's Degree" "Bachelor's Degree" "Bachelor's Degree" "Bachelor's Degree" ... ## ...

3369 sym R (25579 sym/78 pcs) 23 img 3 tbl