Publications by HUNG.NGUYEN
TUNE AND INTERPRET A DECISION TREE FOR WIND TURBINES
Explore turbine data In this exercise, we will explore wind turbine data from Canada, examine the factors affecting a turbine's capacity, and apply a decision tree to predict a turbine's capacity from its characteristics. Detailed description of the data frame: https://github.com/rfordatascience/tidytuesday/blob/master/data/2020/2020-10-27/readme.md Le...
1911 sym R (7216 sym/17 pcs) 2 img
Forecast house prices
Source of data and information: https://www.kaggle.com/c/house-prices-advanced-regression-techniques Load necessary libraries Data loading and exploration Load data from Kaggle URL Detailed information about the train dataset and test dataset skim(train) Data summary Name train Number of rows 1460 Number of columns 81 _______________________ ...
1513 sym R (7721 sym/30 pcs) 1 img 6 tbl
TITANIC SURVIVAL PREDICTION USING A RANDOM FOREST MODEL
library(readr) library(janitor) library(tidyverse) library(tidymodels) library(skimr) STEP 1: LOAD AND PREPROCESS DATA 1.1 loading data test <- read_csv("test.csv") ## ## -- Column specification -------------------------------------------------------- ## cols( ## PassengerId = col_double(), ## Pclass = col_double(), ## Name = co...
825 sym R (8255 sym/42 pcs) 3 img 3 tbl
CREDIT FRAUD DETECTION
Description Today, we are going to build an XGBoost model to detect credit fraud. Our data contains transactions made by credit cards in September 2013 by European cardholders. The data has been dimension-reduced with PCA; only the transaction time and amount are retained in their original form. The class column indicates the state of fraud detect...
2548 sym R (3029 sym/19 pcs) 5 tbl
PREDICT CHURNING CUSTOMERS
1. Description Today, we are going to explore a credit card service. The following is the description of the task that we need to solve. It comes from a competition on the Kaggle website. *A manager at the bank is disturbed with more and more customers leaving their credit card services. They would really appreciate if one could predict for them who i...
4282 sym R (5802 sym/21 pcs) 5 tbl
PREDICT GOOD OR BAD CREDIT CUSTOMER
1. Introduction In this section, I will use a random forest model to build a classifier that labels customers as good or bad. For more details about the data set, visit: https://www.kaggle.com/rikdifos/credit-card-approval-prediction The following context is quoted from the link above: Context Credit score cards are a common risk control...
4088 sym R (5944 sym/26 pcs) 1 img 8 tbl