Publications by Keith Goldfeld

Can ChatGPT help construct non-trivial statistical models? An example with Bayesian “random” splines

07.10.2024

I’ve been curious to see how helpful ChatGPT can be for implementing relatively complicated models in R. About two years ago, I described a model for estimating a treatment effect in a cluster-randomized stepped wedge trial. We used a generalized additive model (GAM) with site-specific splines to account for general time trends, implemented using...

An IV study design to estimate an effect size when randomization is not ethical

02.09.2024

An investigator I frequently consult with seeks to estimate the effect of a palliative care treatment protocol for patients nearing end-stage disease, compared to a more standard, though potentially overly burdensome, therapeutic approach. Ideally, we would conduct a two-arm randomized clinical trial (RCT) to create comparable groups and obtain an ...

Generating binary data by specifying the relative risk, with simulations

01.07.2024

The most traditional approach for analyzing binary outcome data is logistic regression, where the estimated parameters are interpreted as log odds ratios or, if exponentiated, as odds ratios (ORs). No one other than statisticians (and maybe not even statisticians) finds the odds ratio to be a very intuitive statistic, and many feel that a risk diff...

simstudy: another way to generate data from a non-standard density

03.06.2024

One of my goals for the simstudy package is to make it as easy as possible to generate data from a wide range of data distributions. The recent update created the possibility of generating data from a customized distribution specified in a user-defined function. Last week, I added two functions, genDataDist and addDataDist, that allow data generati...

simstudy 0.8.0: customized distributions

20.05.2024

Over the past few years, a number of folks have asked if simstudy accommodates customized distributions. There’s been interest in truncated, zero-inflated, or even more standard distributions that haven’t been implemented in simstudy. While I’ve come up with approaches for some of the specific cases, I was never able to develop a general solu...

simstudy enhancement: specifying idiosyncratic follow-up times for longitudinal data

15.04.2024

A researcher reached out to me a few weeks ago. They were trying to generate longitudinal data that included irregularly spaced follow-up periods. The default periods generated by the function addPeriods in the simstudy package are \(\{0, 1, 2, \ldots, n - 1\}\), where there are \(n\) total periods. However, when follow-up periods required more spec...

Perfectly balanced treatment arm distribution in a multifactorial CRT using stratified randomization

19.02.2024

Over two years ago, I wrote a series of posts (starting here) that described possible analytic approaches for a proposed cluster-randomized trial with a factorial design. That proposal was recently funded by NIA/NIH, and now the Emergency Departments Leading the Transformation of Alzheimer’s and Dementia Care (ED-LEAD) trial is just getting under...

A three-arm trial using two-step randomization

18.12.2023

Clinical Decision Support (CDS) tools are systems created to support clinical decision-making. Health care professionals using these tools can get guidance about diagnostic and treatment options when providing care to a patient. I’m currently involved with designing a trial focused on comparing a standard CDS tool with an enhanced version (CDS+)....

Creating a nice looking Table 1 with standardized mean differences

25.09.2023

I’m in the middle of a perfect storm, winding down three randomized clinical trials (RCTs), with patient recruitment long finished and data collection all wrapped up. This means a lot of data analysis, presentation prep, and paper writing (and not so much blogging). One common (and not so glamorous) thread cutting across all of these RCTs is the ...

Finding logistic models to generate data with desired risk ratio, risk difference and AUC profiles

19.06.2023

About two years ago, someone inquired whether simstudy had the functionality to generate data from a logistic model with a specific AUC. It did not, but now it does, thanks to a paper by Peter Austin that describes a nice algorithm to accomplish this. The paper actually describes a series of related algorithms for generating coefficients that targe...
