Publications by Dat Ngo
Document
1. Project introduction In this project, I applied dimension reduction and clustering to data from a survey of Covid-19’s impact on student life, available on Kaggle (https://www.kaggle.com/kunal28chaturvedi/covid19-and-its-impact-on-students). The project includes 8 parts: - Data processing - Analysis of correlation - MDS - PCA - Clustering - Data ana...
1537 sym R (16011 sym/57 pcs) 14 img
Document
1. Project introduction In this project, I used clustering methods to group kiva.org loans for 3 ASEAN countries (Vietnam, Cambodia and the Philippines). The project includes 6 parts: - Data processing - Measure the clustering tendency - Optimal number of clusters - Data clustering - Assessing clustering quality - Analysis of clusters. 2. Data ...
2462 sym R (7362 sym/38 pcs) 11 img
Document
You can turn parallel sections to tabs in html_document output. Results Plots We show a scatter plot in this section. par(mar = c(4, 4, .5, .1)) plot(mpg ~ hp, data = mtcars, pch = 19) Tables We show the data in this tab. head(mtcars) ## mpg cyl disp hp drat wt qsec vs am gear carb ## Mazda RX4 21.0 6 160 110...
171 sym R (581 sym/3 pcs) 1 img
Document
R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com. When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within t...
594 sym R (262 sym/2 pcs) 1 img
Document
Database investigation kiva <- read.csv('kivaAsean.csv', stringsAsFactors = F) head(kiva,10) ## loanId country loanTerm loanAmount sectorName ## 1 2090948 Philippines 8 225 Agriculture ## 2 2078038 Vietnam 20 1300 Agriculture ## 3 2076105 Philippines 14 425 Agriculture ## 4 2081979 Philippin...
516 sym R (5533 sym/39 pcs) 10 img
Document
1. Project introduction In this project, I used Times Higher Education’s World University Rankings 2020 from Kaggle (https://www.kaggle.com/joeshamen/world-university-rankings-2020) to implement dimension reduction. The project includes 5 parts: - Data processing - Analysis of correlation - Optimal number of clusters - Data clustering - Assessi...
1739 sym R (11660 sym/31 pcs) 6 img
Document
setwd('//media//datngo//Driver1//0.Downloads') library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union library(tidyr) library(ggplot2) library(data.table) ## ## Attachin...
32 sym R (9287 sym/19 pcs) 7 img
Document
setwd('//media//datngo//Driver1//0.Downloads') library(dplyr) ## ## Attaching package: 'dplyr' ## The following objects are masked from 'package:stats': ## ## filter, lag ## The following objects are masked from 'package:base': ## ## intersect, setdiff, setequal, union library(tidyr) library(ggplot2) library(data.table) ## ## Attachin...
21 sym R (6696 sym/14 pcs) 1 img
Sentiment Analysis on Coursera courses’ reviews
Acknowledgements We would like to thank Dr. Karolina Kuligowska, University of Warsaw, for instructing us in the course “Text Mining & Social Media Mining” and Mr. Roshan Sharma for scraping and sharing the Coursera courses’ reviews data set on the Kaggle platform. This project could not have been completed without their valuable contribution. 1. Introd...
9249 sym R (17438 sym/30 pcs) 4 img 5 tbl