Publications by DOEUN
Sales - Customer Purchase Pattern Modeling (Example)
Customer Purchase Prediction DOEUN 2020-04-07 setwd("C:/Users/Administrator/Desktop/BIG DATA") read.csv("commerce.csv") -> df # 1. Check for NA data View(df) colSums(is.na(df)) # The ID column has many NA values ## InvoiceNo StockCode Description Quantity InvoiceDate UnitPrice ## 0 0 0 0 0 ...
163 sym R (10868 sym/21 pcs) 1 img
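The NA check in the excerpt above can be tried without the original file. The following is a minimal sketch on a hypothetical toy data frame: the rows and the `CustomerID` column name are assumptions made for illustration, not values taken from `commerce.csv`.

```r
# Hypothetical toy data standing in for commerce.csv (assumed, not the original file)
df <- data.frame(
  InvoiceNo  = c("536365", "536366", "536367"),
  Quantity   = c(6, NA, 2),
  CustomerID = c(17850, NA, NA),
  stringsAsFactors = FALSE
)

# Count missing values per column, as in the excerpt
na_counts <- colSums(is.na(df))
print(na_counts)

# Share of missing values per column, to see which columns are most affected
na_share <- colMeans(is.na(df))
print(round(na_share, 2))
```

`colSums(is.na(df))` works because `is.na()` returns a logical matrix and `TRUE` counts as 1 when summed.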
Population visualization
Korea_Population Practice DOEUN 2020-04-07 Data Cleaning Data: downloaded from the public data portal setwd("C:/Users/Administrator/Desktop/BIG DATA") data<-read.csv("popul_korea.csv") data %>% select("시점", "한국인인구", "한국인남자", "한국인여자") %>% filter(시점 > 1963) -> df_one names(df_one) <- c("Year", "...
97 sym R (3155 sym/2 pcs) 1 img
Babynames
#install.packages("remotes") #remotes::install_github("beanumber/mdsr") babynames ## # A tibble: 1,924,665 x 5 ## year sex name n prop ## <dbl> <chr> <chr> <int> <dbl> ## 1 1880 F Mary 7065 0.0724 ## 2 1880 F Anna 2604 0.0267 ## 3 1880 F Emma 2003 0.0205 ## 4 1880 F Elizabe...
40 sym R (8048 sym/12 pcs) 3 img
Statistic - T.Test
Loading Data Loading Data data("ToothGrowth") str(ToothGrowth) ## 'data.frame': 60 obs. of 3 variables: ## $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ... ## $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ... ## $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ... ToothGrowth$dose <- as.factor(ToothGrowth$dose...
901 sym R (4861 sym/25 pcs) 3 img
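The excerpt above is cut off before any test is run, so the following is only a sketch of one plausible next step: a Welch two-sample t-test of tooth length by supplement type on the built-in `ToothGrowth` data. Choosing `supp` as the grouping variable is an assumption, not something the excerpt confirms.

```r
# Built-in dataset, loaded as in the excerpt
data("ToothGrowth")

# Welch two-sample t-test (the default in t.test): does mean tooth length
# differ between the two supplement types, OJ and VC?
# Grouping by supp is an assumption made for illustration.
result <- t.test(len ~ supp, data = ToothGrowth)
print(result)

# Individual pieces of the result, useful for reporting
result$p.value   # p-value of the test
result$conf.int  # confidence interval for the difference in means
result$estimate  # group means
```

`t.test()` does not assume equal variances unless `var.equal = TRUE` is passed, which is why the reported degrees of freedom are generally not a whole number.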
Interactive plot
Data Cleaning Visualization Data Cleaning setwd("C:/Users/Administrator/Desktop/BIG DATA/suicide") data<-read.csv("master.csv", header = TRUE) #rename(table_name, "new name" = "original name") data<-rename(data, "country"= "癤풻ountry") colSums(is.na(data)) # Missing data ## country year ...
318 sym R (9234 sym/21 pcs)
R_Statistics Exploring data
Exploring Assumptions Exploring data with graphs setwd("C:/Users/Administrator/Desktop/BIG DATA/R_Book") facebook <- read.delim("FacebookNarcissism.dat", header= TRUE) ## ggplot ggplot(data= facebook, aes(x=NPQC_R_Total, y= Rating, color=Rating_Type)) + geom_smooth()+ geom_point()+ theme_classic()+ theme(plot.background = eleme...
855 sym R (11164 sym/47 pcs) 17 img
Corona Analysis
Corona Interactive Map Corona Interactive Graph Corona Interactive Map link1 <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Recovered.csv" link2 <- "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series...
156 sym R (7720 sym/19 pcs)
Corona Virus in South Korea
Data Collection and Cleaning Corona Map of South Korea Corona EDA of South Korea Data Collection and Cleaning link <- "https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_South_Korea" link %>% read_html() %>% html_nodes(xpath = '//*[@id="mw-content-text"]/div/table[4]') %>% html_table() %>% .[[1]] -> df_corona names(df_c...
293 sym R (10196 sym/21 pcs) 5 img 4 tbl
Coupon CRM
DATA Preparation Statistic Analysis and Data Visualization Machine Learning DATA Preparation Missing Data getwd() ## [1] "C:/Users/Administrator/Desktop/BIG DATA" setwd("C:/Users/Administrator/Desktop/UBS에 넣는것/R/04_Data2") coupon<-read.csv("coupon data.csv",header = TRUE) my_font <- "Ubuntu Condensed" ## Handle missing values as 999, 9...
2196 sym R (27433 sym/89 pcs) 24 img
Lending Club Visualization (Updating)
0.1 About the dataset These files contain complete loan data for all loans issued from 2007 through 2015, including the current loan status (Current, Late, Fully Paid, etc.) and latest payment information. The file containing loan data through the “present” contains complete loan data for all loans issued through the previous completed calenda...
2858 sym R (33138 sym/85 pcs) 35 img 16 tbl