Publications by Daniel
Imputing Missing Data with MICE
MCAR and MAR observations are called ignorable, while MNAR values are called non-ignorable or informative. Work with the unbalanced data and use the ML or REML method for parameter estimation. This approach works best if all missing values are ignorable. However, for MNAR values, this strategy results in biased estimates. Delete all incomplete...
2100 sym R (13637 sym/34 pcs) 11 img
Epidemiology
Epidemiology Brief summary from J. Craig Longenecker… What is epidemiology? “Epidemiology is the study of the distribution and determinants of health related states or events in specified populations, and the application of this study to control of health problems.” Framingham Heart Study Highlights: Some of the Most Significant Milestone...
14649 sym
Causal inference and associated regression
Causal inference and association regression The chronological order still needs to be considered in the design; a well-designed observational study is suitable for causal inference; in general, the estimate from an association analysis is larger or smaller than the causal estimate; association analysis is used for strata, and it is hard to deal with interaction (with exposu...
1629 sym R (7460 sym/30 pcs) 2 img
Survival analysis
Brief summary from the book Basic Definitions Denote by T the random variable representing the survival time of a subject. Let f(t), t >= 0, denote the probability density function (pdf) of T, and let F(t) be the cumulative distribution function (cdf) of T. The objective of survival analysis is to estimate and model the following functions:...
24855 sym R (4140 sym/23 pcs) 8 img 2 tbl
Capstone Project
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ── ## ✔ dplyr 1.1.1 ✔ readr 2.1.4 ## ✔ forcats 1.0.0 ✔ stringr 1.5.0 ## ✔ ggplot2 3.4.4 ✔ tibble 3.2.1 ## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0 ## ✔ purrr 1.0.1...
14 sym Python (1163 sym/3 pcs) 7 img
Imputing Missing Data with MICE
Imputing Missing Data with MICE Substitute the means for the other variables with missingness and run a regression to get the imputed values for the first variable; next, run a regression for the second variable, imputing it with the first variable’s updated values; repeat the above steps until all missing data have been imputed by regression; that is the...
1328 sym R (13637 sym/34 pcs) 11 img
Simulations and Bootstrapping
Simulations and Bootstrapping The bootstrapping approach is broadly applicable because it doesn’t assume any underlying distribution of the data. Testing the robustness of the t-test to violations of the normality assumption: How sensitive is the two-sample t-test to violations of the normality assumption? Test this by using the rpois() function, and try ...
1688 sym 28 img
Sequence Analysis week 5
Sequence Analysis Based on the week 5 homework of “statistics for biomedical researchers” course. Sequence alignment # if (!requireNamespace("BiocManager", quietly = TRUE)) # install.packages("BiocManager") # # BiocManager::install("msa") library(msa) ## Loading required package: Biostrings ## Loading required package: BiocGenerics #...
1038 sym R (47014 sym/119 pcs) 12 img
Sequence Analysis week 5
Sequence Analysis Based on the week 5 homework of “statistics for biomedical researchers” course. Sequence alignment # if (!requireNamespace("BiocManager", quietly = TRUE)) # install.packages("BiocManager") # # BiocManager::install("msa") library(msa) ## Loading required package: Biostrings ## Loading required package: BiocGenerics #...
831 sym R (46055 sym/112 pcs) 12 img
Principal components analysis
Principal components analysis Load the data library(tidyverse) ## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ── ## ✔ dplyr 1.1.1 ✔ readr 2.1.4 ## ✔ forcats 1.0.0 ✔ stringr 1.5.0 ## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1 #...
354 sym R (2676 sym/15 pcs) 5 img