Publications by Nguyen Ngoc Thieu

Product Sales Analyzed Using Python

12.04.2024

Product Sales Analyzed Using Python. The product sales data is analyzed using Python. Loading and checking data. In [1]: import pandas as pd df = pd.read_csv("/Users/nnthieu/sales.csv", skiprows=0) In [24]: df.shape Out[24]: (15000, 8) In [25]: df.head() Out[25]: week sales_method customer_id nb_sold revenue years_as_customer nb_site...

1377 sym 5 img 5 tbl

Product Sales Analyzed Using Python

11.04.2024

Product Sales Analyzed Using Python. The product sales data is analyzed using Python. Loading and checking data. In [23]: import pandas as pd df = pd.read_csv("/Users/nnthieu/sales.csv", skiprows=0) In [24]: df.shape Out[24]: (15000, 8) In [25]: df.head() Out[25]: week sales_method customer_id nb_sold revenue years_as_customer nb_sit...

1181 sym 4 img 4 tbl

Product Sales Analyzed Using Python

11.04.2024

Product Sales Analyzed Using Python. The product sales data is analyzed using Python. Loading and checking data. In [3]: import pandas as pd df = pd.read_csv("/Users/nnthieu/sales.csv", skiprows=1) In [4]: df.shape Out[4]: (15000, 9) In [70]: df.columns Out[70]: Index(['Unnamed: 0', 'week', 'sales_method', 'customer_id', 'nb_sold', ...

1188 sym 4 img 4 tbl

Pens & Printers Project Using SAS

01.04.2024

Loading data filename tempurl TEMP; proc http url= "https://s3.amazonaws.com/talent-assets.datacamp.com/product_sales.csv" method="get" out=tempurl; run; /* Import the data from the URL into a SAS dataset */ data work.sales; infile tempurl dlm =',' firstobs=2 MISSOVER; /* Set delimiter and skip header row */ input week $ sal...

1127 sym Python (1746 sym/11 pcs) 2 img

SAS Notes

28.03.2024

Select columns with the string ‘M’ in their names. We already have the data set cars. filename tmp temp; data _null_; file tmp; if 0 then set cars; length varname $32; do until (varname='varname'); call vnext(varname); if index(varname,'M') then put varname; end; run; data want; set cars; keep %include tmp; ; ...

151 sym

Pens and Printers's Sale Campaign

01.03.2024

Data source: Data for the Pens and Printers company is imported from “https://s3.amazonaws.com/talent-assets.datacamp.com/product_sales.csv”. Summary of data: The data set consists of 15,000 rows and 8 columns: ‘week’, ‘sales_method’, ‘customer_id’, ‘nb_sold’, ‘revenue’, ‘years_as_customer’, ‘nb_site_visits’ and ‘state...

7907 sym R (16946 sym/57 pcs) 9 img

Ranking Within Groups in Python

01.01.2023

Ranking by group in Python import numpy as np import pandas as pd import seaborn as sns In data analysis, there are cases where we need to select the rows with the largest, smallest, or nth value of a variable Y within certain subgroups. For example, in the tips dataset below. ...

1424 sym Python (949 sym/18 pcs) 13 tbl

Discretizing Continuous Data in Python

02.01.2023

Discretization and Binning Continuous Data in Python import numpy as np import pandas as pd import seaborn as sns import matplotlib.pyplot as plt In data analysis, it is sometimes necessary to split continuous variables into intervals, creating discrete variables that represent groups with successively increasing values. ...

1004 sym Python (754 sym/12 pcs) 1 img 3 tbl

Forecasting Deaths By Covid19 In US (Apr. 9-Apr. 15)

09.04.2020

packages load library(tidyverse) ## -- Attaching packages ----------------------------------------------- tidyverse 1.2.1 -- ## v ggplot2 3.2.0 v purrr 0.3.2 ## v tibble 2.1.3 v dplyr 0.8.1 ## v tidyr 0.8.3 v stringr 1.4.0 ## v readr 1.3.1 v forcats 0.4.0 ## -- Conflicts -----------------------------------------------...

457 sym R (6166 sym/31 pcs) 3 img

Working With Tables

08.11.2021

Two-way tables. Tables are often required when analyzing the relationship between two discrete (categorical) variables. Data on type 2 diabetes is used in this illustration. library(stats) library(tidyverse) library("mlbench") data("PimaIndiansDiabetes2") First, we ...

5144 sym R (2902 sym/24 pcs)