Publications by Keith Goldfeld
A GAM for time trends in a stepped-wedge trial with a binary outcome
In a previous post, I described some ways one might go about analyzing data from a stepped-wedge, cluster-randomized trial using a generalized additive model (a GAM), focusing on continuous outcomes. I have spent the past few weeks developing a similar model for a binary outcome, and have started to explore model comparison and methods to evaluat...
Modeling the secular trend in a stepped-wedge design
Recently I started a discussion about modeling secular trends using flexible models in the context of cluster randomized trials. I’ve been motivated by a trial I am involved with that is using a stepped-wedge study design. The initial post focused on more standard parallel designs; here, I want to extend the discussion explicitly to the stepped...
Presenting results for multinomial logistic regression: a marginal approach using propensity scores
Multinomial logistic regression modeling can provide an understanding of the factors influencing an unordered, categorical outcome. For example, if we are interested in identifying individual-level characteristics associated with political parties in the United States (Democratic, Republican, Libertarian, Green), a multinomial model would be a re...
Generating clustered data with marginal correlations
A student is working on a project to derive an analytic solution to the problem of sample size determination in the context of cluster randomized trials and repeated individual-level measurement (something I’ve thought a little bit about before). Though the goal is an analytic solution, we do want confirmation with simulation. So, I was a littl...
Simulating data from a non-linear function by specifying a handful of points
Trying to simulate data with non-linear relationships can be frustrating, since there is not always an obvious mathematical expression that will give you the shape you are looking for. I’ve come up with a relatively simple solution for somewhat complex scenarios that only requires the specification of a few points that lie on or near the desire...
Flexible simulation in simstudy with customized distribution functions
Really, the only problem with the simstudy package (😄) is that there is a hard limit to the possible probability distributions that are available (the current count is 15). However, it turns out that there is more flexibility than first meets the eye, and we can easily accommodate a limitless number as l...
To impute or not: the case of an RCT with baseline and follow-up measurements
Under normal conditions, conducting a randomized clinical trial is challenging. Throw in a pandemic and things like site selection, patient recruitment and patient follow-up can be particularly vexing. In any study, subjects need to be retained long enough so that outcomes can be measured; during a period when there are so many potential disrupti...
simstudy updated to version 0.5.0
A new version of simstudy is available on CRAN. There are two major enhancements and several new features. In the “major” category, I would include (1) changes to survival data generation that accommodate hazard ratios that can change over time, as well as competing risks, and (2) the addition of functions to allow users to sample from existi...
Adding competing risks in survival data generation
I am working on an update of simstudy that will make generating survival/time-to-event data a bit more flexible. There are two biggish enhancements. The first facilitates generation of competing events, and the second allows for the possibility of generating survival data that has time-dependent hazard ratios. This post focuses on the first enhan...
Everyone knows that loops in R are to be avoided but vectorization is not always possible
It goes without saying that there are always many ways to solve a problem in R, but clearly some ways are better (for example, faster) than others. Recently, I found myself in a situation where I could not find a way to avoid using a loop, and I was immediately concerned, knowing that I would want this code to be flexible enough to ru...