Publications by Ching lee yan
Document
In-class exercise 1: Find out the number of days you have spent at NCKU as a registered student or staff person. library(lubridate) ## ## Attaching package: 'lubridate' ## The following object is masked from 'package:base': ## ## date startDate <- dmy("1-September-2018") endDate <- dmy("18-May-2020") endDate - startDate ## Time differen...
444 sym R (1560 sym/17 pcs) 2 img
Document
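Social network analysis is an unsupervised machine learning approach, somewhat similar to KNN, that identifies distinct communities within a network. It can be applied to Facebook friend suggestions, product and video recommendations, and even disease-spread modeling and fraud detection. Social networks are usually represented as graphs, which describe the relationships between things very intuitively. In a graph...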
1725 sym R (36329 sym/113 pcs) 33 img
Document
Exercise 2: Use the read and math variables from the high schools data example for this problem. dta <- read.table("hs0.txt", header = TRUE) dta.asian <- subset(dta, race == "asian") r0 <- cor(dta.asian$math, dta.asian$read) nIter <- 1001 cnt <- function(nIter = 1001){ new <- replicate(nIter, sample(dta.asian$read)) r <- cor(new, dta.asian...
452 sym R (2238 sym/4 pcs) 1 img
Document
Exercise 1: Split the ChickWeight{datasets} data by individual chicks to extract separate slope estimates of regressing weight onto Time for each chick. dta1 <- ChickWeight # read the data sapply(split(dta1, dta1$Chick), function(x) lm(weight ~ Time, data = x)$coef) # split the data by chick and fit a regression to each ## 18 16 15 13 9 ...
653 sym R (14786 sym/87 pcs) 1 img
Document
Exercise 1: The distribution of personal disposable income in Taiwan in 2015 has a story to tell. Revise the following plot to enhance that message. library(ggplot2) library(dplyr) dta1 <- read.csv("income_tw.csv", header = TRUE) %>% mutate(Percent = (Count / sum(Count)) * 100, Income = ordered(Income, levels = Income)) %>% ggplot(aes(x = Percent...
572 sym R (14658 sym/27 pcs) 7 img
Document
Exercise 1: Explore the answers to both questions with plots involving confidence intervals or error bars for the means. # read the data dta1 <- read.table("stateAnxiety.txt", header = TRUE) %>% gather(key = "key", value = "Anxiety") %>% mutate(Gender = c(rep("Female", 250), rep("Male", 250)), ID = rep(1:50, 10)) %>% mutate(Week = rep(c(re...
9962 sym R (12899 sym/29 pcs) 15 img 3 tbl
Document
Question 1: Render the R script for replicating figures in Chapter 4 of Lattice: Multivariate Data Visualization with R (Sarkar, D. 2008) to an HTML document with comments at each code chunk indicated by ‘##’. ## read the data VADeaths ## Rural Male Rural Female Urban Male Urban Female ## 50-54 11.7 8.7 15.4 8....
1211 sym R (8356 sym/40 pcs) 15 img
Document
Question 1: This R script illustrates how to split the plot region to include histograms on the margins of a scatter diagram using the Galton{HistData} data set. Compile it as an HTML document with comments on each code chunk. Galton’s data on the heights of parents and their children install.packages("HistData") # install HistData dta <- HistD...
908 sym R (47014 sym/27 pcs) 24 img
Document
Question 1: Change the ‘df’ parameter to a slightly larger integer and do it again. What statistical concept does this script illustrate? x <- seq(-pi*2, 2*pi, .05) z <- dnorm(x) # dnorm() returns the standard normal density at each value of x y <- dt(x, df=3) plot(x, z, type="l", bty="L", xlab="Standard unit", ylab="Density") ...
885 sym R (2627 sym/10 pcs) 65 img
Document
Question 1: Select at random one school per county in the data set Caschool{Ecdat} and draw a scatter diagram of average math score mathscr against average reading score readscr for the sampled data set. Make sure your results are reproducible (e.g., the same random sample will be drawn each time). # load the dataset pacman::p_load(Ecdat, tidyr, data...
5512 sym R (10193 sym/56 pcs) 3 img 1 tbl