Publications by Mircea Dumitru

Regression Models: Multivariable Regression - Week 3

28.02.2023

Module 7: Multivariable regression The first natural extension for linear regression is to assume additive effects, i.e. extending the regression line to a plane or a generalized version of a plane. If you were presented with evidence of a relationship between breath mint usage (mints per day, \(X\)) and pulmonary function (measured in FEV), you would...

25038 sym R (17684 sym/91 pcs) 17 img

Regression Models: Multivariable Regression - Quiz 3

28.02.2023

Question 1 Consider the mtcars data set. Fit a model with mpg as the outcome that includes number of cylinders as a factor variable and weight as a confounder. Give the adjusted estimate for the expected change in mpg comparing 8 cylinders to 4. Answer 1 data(mtcars) fit <- lm(mpg ~ as.factor(cyl) + wt, data = mtcars) fit$coef ## (Intercept) as....

3048 sym 2 img

Regression Models: Least Squares & Linear Regression - Week 1

22.02.2023

Module 1: Introduction to Regression & Least Squares Regression models are the workhorse of data science. They are the most well described, practical and theoretically understood models in statistics. One of the key insights for regression models is that they produce highly interpretable model fits, unlike machine learning algorithms, which often sa...

7631 sym R (6916 sym/31 pcs) 6 img

Regression Models: Least Squares & Linear Regression - Quiz 1

22.02.2023

Question 1 Consider the data set given below \((0.18, -1.54, 0.42, 0.95)\) and weights given by \((2, 1, 3, 1)\). Give the value of \(\mu\) that minimizes the least squares equation \(\sum_{i=1}^{n}\omega_{i}(x_i - \mu)^2\) Answer 1 \[ \mu = \frac{\sum_{i=1}^{n} \omega_i x_i}{\sum_{i=1}^{n} \omega_i} \] x <- c(0.18, -1.54, 0.42, 0.95) w <- c(2, 1...

2955 sym
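
The weighted least squares question in the Quiz 1 entry above can be checked numerically. Setting the derivative of \(\sum_{i} \omega_i (x_i - \mu)^2\) with respect to \(\mu\) to zero gives the weighted average \(\sum_i \omega_i x_i / \sum_i \omega_i\). The following is a minimal sketch using the data and weights stated in the question; the variable names `mu_hat`, `mu_check`, `obj` and `mu_opt` are illustrative and not taken from the original answer.

```r
# Data and weights as stated in the question
x <- c(0.18, -1.54, 0.42, 0.95)
w <- c(2, 1, 3, 1)

# Closed-form minimizer: the weighted average of x
mu_hat <- sum(w * x) / sum(w)

# Same quantity via the base R helper
mu_check <- weighted.mean(x, w)

# Numerical check: minimize the weighted sum of squares directly
obj <- function(mu) sum(w * (x - mu)^2)
mu_opt <- optimize(obj, interval = range(x))$minimum

c(closed_form = mu_hat, weighted_mean = mu_check, numerical = mu_opt)
```

All three values should agree up to the tolerance of `optimize`.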

Regression Models: Linear Regression & Multivariable Regression - Week 2

22.02.2023

Module 4: Statistical Linear Regression Models Least squares is an estimation tool. How do we do inference? Consider developing a probabilistic model for linear regression \[ Y_i = \beta_0 + \beta_1 X_i + \epsilon_i \] Here the errors \(\epsilon_i\) are assumed \(\text{Normal} \left( x \mid 0, 1 \right)\). Note that \[ \mathbb{E}\left[ Y_i \mid X_i ...

11090 sym R (12967 sym/73 pcs) 12 img

Regression Models: Linear Regression & Multivariable Regression - Quiz 2

22.02.2023

Question 1 Consider the following data with \(x\) as the predictor and \(y\) as the outcome. x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62) y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36) Give a \(p\)-value for the two sided hypothesis test of whether \(\beta_1\) from a linear regression model is 0 or not. Answer 1 Dire...

3725 sym
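
One way to approach the slope test in the Quiz 2 entry above is sketched here. This is not necessarily the route taken in the original answer, which is truncated in this listing. It reads the \(p\)-value from `summary(lm(...))` and then rebuilds it by hand from the \(t\) statistic with \(n - 2\) degrees of freedom. Names such as `p_lm` and `p_manual` are illustrative.

```r
# Data as given in the question
x <- c(0.61, 0.93, 0.83, 0.35, 0.54, 0.16, 0.91, 0.62, 0.62)
y <- c(0.67, 0.84, 0.6, 0.18, 0.85, 0.47, 1.1, 0.65, 0.36)

# Direct route: fit the model and read the slope's p-value
fit <- lm(y ~ x)
p_lm <- summary(fit)$coefficients["x", "Pr(>|t|)"]

# Manual route: least squares estimates, residual SD, standard error, t test
n <- length(y)
beta1 <- cor(y, x) * sd(y) / sd(x)
beta0 <- mean(y) - beta1 * mean(x)
e <- y - beta0 - beta1 * x
sigma <- sqrt(sum(e^2) / (n - 2))
se_beta1 <- sigma / sqrt(sum((x - mean(x))^2))
t_stat <- beta1 / se_beta1
p_manual <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

c(lm = p_lm, manual = p_manual)
```

The two routes should return the same two sided \(p\)-value.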

Statistical Inference: Probability & Expected Values - Week 1

20.02.2023

Module 1: Introduction Statistical inference - the process of generating conclusions about a population from a noisy sample. Statistical inference is the only formal system of inference that we have. Without statistical inference: we are simply living within our data. With statistical inference: we are trying to generate new knowledge, extendin...

22340 sym R (3269 sym/11 pcs) 5 img

Statistical Inference: Intervals, testing & p-values - Week 3

20.02.2023

Module 8: Confidence Intervals In the previous module, we discussed creating confidence intervals using the CLT. They took the form \[ \text{Estimate} \pm z_{1-\frac{\alpha}{2}} \times \text{Estimated Standard Error of the Estimate} \] In this module, methods for small samples are discussed, namely the t-distribution and the t confidence intervals...

18000 sym R (11026 sym/80 pcs) 4 img 1 tbl

Statistical Inference: Intervals, testing & p-values - Quiz 3

20.02.2023

Question 1 In a population of interest, a sample of 9 men yielded a sample average brain volume of 1,100cc and a standard deviation of 30cc. What is a 95% Student’s T confidence interval for the mean brain volume in this new population? Answer 1 n = 9 mu = 1100 sd = 30 SE = sd/sqrt(n) mu + c(-1,1) * SE* qt(0.975,n-1) ## [1] 1076.94 1123.06 Ques...

3637 sym

Statistical Inference: Power, Multiple Comparisons & Resampling - Week 4

20.02.2023

Module 11: Power In the last module, the Type I error rate \(\alpha\) was discussed, i.e. the probability of rejecting the null hypothesis when it’s true. The hypothesis test was structured so that the probability of Type I error is small. The other possible error is to fail to reject when the alternative is true, i.e. the Type II error. The po...

11240 sym R (5262 sym/58 pcs) 6 img 2 tbl
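
The notion of power in the Week 4 entry above, the probability of rejecting the null hypothesis when the alternative is true, can be illustrated with a short calculation. The sketch below assumes a one sided, one sample test of \(H_0: \mu = \mu_0\) against \(H_a: \mu > \mu_0\). All numeric values (`mu0`, `mua`, `sigma`, `n`, `alpha`) are hypothetical and chosen only for illustration; they do not come from the listing.

```r
# Hypothetical setup (illustrative values, not from the source)
mu0   <- 30    # mean under the null
mua   <- 32    # mean under the alternative
sigma <- 4     # standard deviation, treated as known for the z calculation
n     <- 16    # sample size
alpha <- 0.05  # Type I error rate

# Normal approximation: reject when the sample mean exceeds
# mu0 + z_{1 - alpha} * sigma / sqrt(n); power is the probability
# of that event when the true mean is mua
z <- qnorm(1 - alpha)
se <- sigma / sqrt(n)
power_z <- pnorm(mu0 + z * se, mean = mua, sd = se, lower.tail = FALSE)

# Same setup with the t test, via base R's power.t.test
power_t <- power.t.test(n = n, delta = mua - mu0, sd = sigma,
                        sig.level = alpha, type = "one.sample",
                        alternative = "one.sided")$power

c(z_test = power_z, t_test = power_t)
```

Power is one minus the Type II error rate, so either value can be converted with `1 - power`.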