Publications by Nguyen Chi Dung

patchwork: A Powerful Library for Plot Composition

27.11.2024

Patchwork: A Powerful Library for Plot Composition. The patchwork library in R is a versatile tool designed to simplify the arrangement and combination of multiple ggplot2 plots into complex layouts. Introduced by Pedersen (2020), patchwork leverages an intuitive “+” operator, allowing users to effortlessly stack, align, or grid plots withou...

1525 sym Python (6569 sym/1 pcs) 1 img

Fuzzy Matching: A Real-world Application

27.11.2024

Fuzzy Matching. Fuzzy Matching is a technique for identifying approximate matches between strings, useful when exact matching fails due to typographical errors or variations in data. It employs algorithms like Levenshtein Distance or Jaccard Similarity to measure string similarity. Applications include data deduplication, record linkage, and te...

2047 sym Python (3552 sym/1 pcs) 1 img
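
The two similarity measures named in the Fuzzy Matching summary above can be sketched in a few lines. This is a minimal illustration, not the code from the publication itself: the function names `levenshtein` and `jaccard`, the choice of character bigrams for the Jaccard sets, and the sample strings are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def jaccard(a: str, b: str, n: int = 2) -> float:
    """Jaccard similarity of the sets of character n-grams of a and b.
    Using bigrams (n=2) and lowercasing is an assumption of this sketch."""
    def ngrams(s: str) -> set:
        s = s.lower()
        return {s[i:i + n] for i in range(len(s) - n + 1)}

    set_a, set_b = ngrams(a), ngrams(b)
    if not set_a and not set_b:
        return 1.0  # two strings too short to have n-grams count as identical
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    # Hypothetical sample strings, chosen only for illustration.
    print(levenshtein("kitten", "sitting"))
    print(round(jaccard("night", "nacht"), 3))
```

A smaller Levenshtein distance or a Jaccard score closer to 1 suggests a closer match; where to set the cut-off for calling two records "the same" depends on the data.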

Create Elegant Choropleth Map using ggplot2

26.11.2024

Choropleth Maps. A choropleth map is a thematic map that uses different shades or patterns to represent a variable across geographic regions. It is commonly used in research, economics, social studies, and political science to visualize spatial patterns and trends. For example, choropleth maps are often employed to display data such as populatio...

2469 sym Python (3734 sym/1 pcs) 1 img

Fuzzy Matching: A Real-world Application

26.11.2024

Fuzzy Matching. Fuzzy Matching is a technique for identifying approximate matches between strings, useful when exact matching fails due to typographical errors or variations in data. It employs algorithms like Levenshtein Distance or Jaccard Similarity to measure string similarity. Applications include data deduplication, record linkage, and te...

2027 sym Python (3552 sym/1 pcs) 1 img

Centralize Label for Bar Plot: The Case of Student Feedback on Teaching Survey

06.11.2024

Introduction. When visualizing data with a bar plot, displaying percentage labels for every category is sometimes infeasible because the text overlaps. This often occurs when one or more categories account for a very small proportion compared with the others. To handle this issue, there are two possible s...

725 sym Python (6966 sym/1 pcs) 2 img

AUC, Accuracy or Profit: Which Metric Is More Important?

21.10.2024

Limitations of ROC-AUC and Accuracy. In binary classification problems, both Accuracy and ROC-AUC have limitations that may impact the evaluation of model performance, particularly when working with imbalanced datasets. Accuracy is often misleading in cases where the classes are imbalanced. For instance, in datasets where the majority class domi...

7089 sym 1 img 1 tbl
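
The point about Accuracy on imbalanced data in the summary above can be checked with a small calculation. The sketch below is an illustration only: the 950/50 class split, the always-negative "model", and the helper names `accuracy` and `recall` are hypothetical and do not come from the publication.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that equal the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of truly positive cases that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0  # no positive cases to recover
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


# Hypothetical imbalanced dataset: 950 negatives (0) and 50 positives (1).
y_true = [0] * 950 + [1] * 50
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 1000

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_pred))
    print("recall:  ", recall(y_true, y_pred))
```

Under these assumed numbers the trivial model scores high on Accuracy while finding none of the positive cases, which is the kind of gap that motivates looking at other metrics.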

ROC-AUC, Accuracy or Profit: Which Metric Is More Important?

18.10.2024

Limitations of ROC-AUC and Accuracy. In binary classification problems, both Accuracy and ROC-AUC have limitations that may impact the evaluation of model performance, particularly when working with imbalanced datasets. Accuracy is often misleading in cases where the classes are imbalanced. For instance, in datasets where the majority class domi...

7216 sym 1 img 1 tbl

Data Wrangling and Visualization with R (Course 16, Hanoi - November 2024)

26.09.2024

R Data Science Series: Data Wrangling and Visualization with R (Course 16, Hanoi - November 2024). Course Introduction; Objectives; Some Key Definitions; Why Data Wrangling Is Necessary?; Why Data Visualization Is Necessary?; Final Products; Data Used; Softwa...

7960 sym 6 img

Beautiful Black Theme

25.09.2024

CSS for Our Black Theme. A CSS (Cascading Style Sheets) document is a file that contains CSS code, which is a language used to describe the style and formatting of a document written in HTML (Hypertext Markup Language) or XML (eXtensible Markup Language). CSS documents provide a powerful and flexible way to control the visual presentation of web...

5881 sym 1 img

Super creative title

25.09.2024

R Markdown. default(), tango, pygments, kate, monochrome, espresso, zenburn, haddock, breezedark, textmate, arrow, or rstudio or a file with extension .theme. This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.c...

749 sym 1 img