Publications by Keith Goldfeld
Complier average causal effect? Exploring what we learn from an RCT with participants who don’t do what they are told.
Inspired by a free online course titled Complier Average Causal Effects (CACE) Analysis and taught by Booil Jo and Elizabeth Stuart (through Johns Hopkins University), I’ve decided to explore the topic a little bit. My goal here isn’t to explain CACE analysis in extensive detail (you should definitely go take the course for that), but to desc...
A simstudy update provides an excuse to talk a little bit about latent class regression and the EM algorithm
I was just going to make a quick announcement to let folks know that I’ve updated the simstudy package to version 0.1.4 (now available on CRAN) to include functions that allow conversion of columns to factors, creation of dummy variables, and most importantly, specification of outcomes that are more flexibly conditional on previously defined va...
CACE closed: EM opens up exclusion restriction (among other things)
This is the third, and probably last, of a series of posts touching on the estimation of complier average causal effects (CACE) and latent variable modeling techniques using an expectation-maximization (EM) algorithm. What follows is a simplistic way to implement an EM algorithm in R to do principal strata estimation of CACE. The EM algorithm I...
A minor update to simstudy provides an excuse to talk a bit about the negative binomial and Poisson distributions
I just updated simstudy to version 0.1.5 (available on CRAN) so that it now includes several new distributions – exponential, discrete uniform, and negative binomial. As part of the release, I thought I’d explore the negative binomial just a bit, particularly as it relates to the Poisson distribution. The Poisson distribution is a discrete (i...
Can we use B-splines to generate non-linear data?
I’m exploring the idea of adding a function or set of functions to the simstudy package that would make it possible to easily generate non-linear data. One way to do this would be using B-splines. Typically, one uses splines to fit a curve to data, but I thought it might be useful to switch things around a bit to use the underlying splines to g...
Who knew likelihood functions could be so pretty?
I just released a new iteration of simstudy (version 0.1.6, available on CRAN), which fixes a bug or two and adds several spline-related routines. The previous post focused on using spline curves to generate data, so I won’t repeat myself here. And, apropos of nothing really – I thought I’d take the opportunity to do a simple simulation to...
Thinking about different ways to analyze sub-groups in an RCT
Here’s the scenario: we have an intervention that we think will improve outcomes for a particular population. Furthermore, there are two sub-groups (let’s say defined by which of two medical conditions each person in the population has) and we are interested in knowing if the intervention effect is different for each sub-group. And here’s t...
A simstudy update provides an excuse to generate and display Likert-type data
I just updated simstudy to version 0.1.7. It is available on CRAN. To mark the occasion, I wanted to highlight a new function, genOrdCat, which puts into practice some code that I presented a little while back as part of a discussion of ordinal logistic regression. The new function was motivated by a reader/researcher who came across my blog in w...
Visualizing how confounding biases estimates of population-wide (or marginal) average causal effects
When we are trying to assess the effect of an exposure or intervention on an outcome, confounding is an ever-present threat to our ability to draw the proper conclusions. My goal (starting here and continuing in upcoming posts) is to think a bit about how to characterize confounding in a way that makes it possible to literally see why improperly ...
Characterizing the variance for clustered data that are Gamma distributed
Way back when I was studying algebra and wrestling with one word problem after another (I think now they call them story problems), I complained to my father. He laughed and told me to get used to it. “Life is one big word problem,” is how he put it. Well, maybe one could say any statistical analysis is really just some form of multilevel dat...