Publications by Daniel

Nested mixed model

09.04.2024

Nested mixed model in R reference Load package and data # load package library(nlme) # read in data setwd("C:\\Users\\hed2\\OneDrive - National Institutes of Health\\Mixed model by SAS and R") DF <- read.csv("Oxide.csv") # specify the reference group DF$Source=as.factor(DF$Source) DF <- within(DF, Source<- relevel(Source, ref = 2)) Rand...

940 sym R (5720 sym/17 pcs)

Data Mining

01.04.2024

library(ISLR2) View(Hitters) names(Hitters) ## [1] "AtBat" "Hits" "HmRun" "Runs" "RBI" "Walks" ## [7] "Years" "CAtBat" "CHits" "CHmRun" "CRuns" "CRBI" ## [13] "CWalks" "League" "Division" "PutOuts" "Assists" "Errors" ## [19] "Salary" "NewLeague" dim(Hitters) ## [1] 322 20 s...

125 sym R (27547 sym/113 pcs) 9 img

Common issues in Statistics

05.02.2024

reference No plotting before analysis A plot is sometimes better than a hypothesis test for checking assumptions. Instead, use a probability plot (also known as a quantile plot or Q-Q plot). It is very hard to tell whether or not a small data set comes from a particular distribution. A histogram's shape varies with the number of bins. Plot original lowess plo...

12047 sym

Common issues in Statistics

05.02.2024

reference No plotting before analysis A plot is sometimes better than a hypothesis test for checking assumptions. Instead, use a probability plot (also known as a quantile plot or Q-Q plot). It is very hard to tell whether or not a small data set comes from a particular distribution. A histogram's shape varies with the number of bins. Plot original lowess plo...

11958 sym

Central Limit Theorem

02.02.2024

Central Limit Theorem The Central Limit Theorem says that for most distributions, linear combinations (e.g., the sum or the mean) of a large enough number of independent random variables are approximately normal. For example, adult human heights (at least if we restrict to one sex) are the sum of many heights: the heights of the ankles, lower ...

787 sym Python (523 sym/5 pcs) 5 img
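
The statement in the Central Limit Theorem preview can be checked with a small simulation. This is a minimal sketch, not the code from the post itself: the Exponential(1) distribution, the sample sizes, and the helper names `sample_means` and `skewness` are all assumptions chosen for illustration. It averages an increasing number of independent skewed draws and prints the skewness of the resulting means, which should shrink toward 0 (the value for a normal distribution) as more terms are averaged.

```python
import random
import statistics


def sample_means(n_terms, n_samples, seed=0):
    """Return n_samples means, each over n_terms independent Exponential(1) draws.

    Exponential(1) is an assumed example of a clearly non-normal distribution.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n_terms))
        for _ in range(n_samples)
    ]


def skewness(xs):
    """Sample skewness; 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)


# Averaging more independent terms should pull the skewness toward 0.
for n in (1, 5, 50):
    means = sample_means(n_terms=n, n_samples=20000)
    print(n, round(statistics.fmean(means), 3), round(skewness(means), 3))
```

Skewness is only one rough check of normality; a Q-Q plot of `means` would give a fuller picture.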

Common issues in Statistics

02.02.2024

reference plot original lowess plots or other types of plots fitted line with x Interpreting causality/ association “The only legitimate way to try to establish a causal connection statistically is through the use of randomized experiments.” “On average, people who take this medication have a decrease in blood pressure”. “The rate o...

11585 sym

Fixed or Random Factors

02.02.2024

Fixed or Random Factors E.g. Two way ANOVA Fixed effect factor: Data has been gathered from all the levels of the factor that are of interest. Random effect factor: The factor has many possible levels, interest is in all possible levels, but only a random sample of levels is included in the data. The standard methods for analyzing random effe...

749 sym

Quantile Regression

02.02.2024

Quantile Regression Standard regression estimates the mean of the conditional distribution (conditioned on the values of the predictors) of the response variable. Quantile regression is a method for estimating conditional quantiles, including the median....

263 sym
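
The difference between estimating a mean and estimating a quantile can be sketched without any predictors. The preview does not say how quantile regression is fit; a common formulation, assumed here, minimizes the pinball (check) loss. The example below covers only the intercept-only case, with assumed Exponential(1) data and hypothetical helper names `pinball_loss` and `best_constant`: the constant that minimizes the pinball loss at level `tau` is a sample quantile, while the mean sits elsewhere for skewed data.

```python
import random
import statistics


def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, 1 - tau on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(ys, tau):
    """Constant prediction minimizing total pinball loss.

    A minimizer always exists among the observed values, so it is enough
    to search over ys (an O(n^2) search, fine for a small illustration).
    """
    return min(ys, key=lambda c: sum(pinball_loss(y - c, tau) for y in ys))


rng = random.Random(0)
ys = [rng.expovariate(1.0) for _ in range(501)]  # skewed example data

print("mean:        ", statistics.fmean(ys))
print("median:      ", statistics.median(ys))
print("tau=0.5 fit: ", best_constant(ys, 0.5))
print("tau=0.9 fit: ", best_constant(ys, 0.9))
```

With predictors, the same loss is minimized over regression coefficients instead of a single constant.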

Issues about Dividing a Continuous Variable into Categories

01.02.2024

Issues about Dividing a Continuous Variable into Categories Modern regression models do not require categorization. In general, continuous variables should remain continuous in regression models designed to study the effects of the variable on the outcome of interest. –by O. Naggara When doing hypothesis tests, the loss of information when d...

1444 sym

Logistic distribution and logistic regression

31.01.2024

Logistic distribution and logistic regression Generalized linear model, GLM Generalized linear models cover all these situations by allowing for response variables that have arbitrary distributions (rather than simply normal distributions), and for an arbitrary function of the response variable (the link function) to vary linearly with the pred...

2522 sym Python (768 sym/4 pcs) 3 img