Publications by Keith Goldfeld

Anyone interested in a function to quickly generate data with many predictors?

28.10.2019

A couple of months ago, I was contacted about the possibility of creating a simple function in simstudy to generate a large dataset that could include possibly 10’s or 100’s of potential predictors and an outcome. In this function, only a subset of the variables would actually be predictors. The idea is to be able to easily generate data for ...

3945 sym R (4100 sym/6 pcs) 2 img
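
The idea in the excerpt above (many candidate variables, only a few of which actually drive the outcome) can be sketched in a few lines. This is a rough illustration only, not the simstudy function the post describes: the name `gen_many_predictors`, its parameters, the standard normal predictors, and the common effect size are all assumptions made for the example.

```python
import random


def gen_many_predictors(n, n_vars, n_true, effect=0.5, seed=None):
    """Hypothetical helper: n rows, n_vars candidate predictors,
    of which only n_true (chosen at random) influence the outcome."""
    rng = random.Random(seed)
    # Indices of the variables that truly affect the outcome
    true_idx = sorted(rng.sample(range(n_vars), n_true))
    rows = []
    for _ in range(n):
        # Assumption: independent standard normal predictors
        x = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        # Outcome = linear signal from the true predictors + unit-variance noise
        y = sum(effect * x[j] for j in true_idx) + rng.gauss(0.0, 1.0)
        rows.append((x, y))
    return rows, true_idx


rows, true_idx = gen_many_predictors(n=500, n_vars=50, n_true=5, seed=1)
```

Returning `true_idx` alongside the data makes it possible to check afterwards whether a variable selection method recovers the real predictors.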

What can we really expect to learn from a pilot study?

11.11.2019

I am involved with a very interesting project – the NIA IMPACT Collaboratory – where a primary goal is to fund a large group of pragmatic pilot studies to investigate promising interventions to improve health care and quality of life for people living with Alzheimer’s disease and related dementias. One of my roles on the project team is to ...

11927 sym R (5011 sym/11 pcs) 4 img

Adding a “mixture” distribution to the simstudy package

25.11.2019

I am contemplating adding a new distribution option to the package simstudy that would allow users to define a new variable as a mixture of previously defined (or already generated) variables. I think the easiest way to explain how to apply the new mixture option is to step through a few examples and see it in action. Specifying the “mixture”...

5062 sym R (3442 sym/10 pcs) 8 img

Repeated measures can improve estimation when we only care about a single endpoint

09.12.2019

I’ve been participating in the design of a new study that will evaluate interventions aimed at reducing both pain and opioid use for patients on dialysis. This study is likely to be somewhat complicated, involving multiple clusters, three interventions, a sequential and adaptive randomization scheme, and a composite binary outcome. I’m not go...

5349 sym R (4391 sym/12 pcs) 6 img

A brief account (via simulation) of the ROC (and its AUC)

20.01.2020

The ROC (receiver operating characteristic) curve visually depicts the ability of a measure or classification model to distinguish two groups. The area under the ROC (AUC) quantifies the extent of that ability. My goal here is to describe as simply as possible a process that serves as a foundation for the ROC, and to provide an interpretation of...

9614 sym R (1612 sym/10 pcs) 18 img
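
As a small illustration of the AUC described in the excerpt above, the sketch below simulates a measure in two groups and computes the AUC two ways: as the share of (case, control) pairs in which the case has the higher value, and as the trapezoidal area under the empirical ROC curve. It is not taken from the post itself. The group means, sample sizes, and function names are assumptions chosen for the example.

```python
import random


def pairwise_auc(cases, controls):
    """Share of (case, control) pairs where the case scores higher; ties count half."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def roc_points(cases, controls):
    """(false positive rate, true positive rate) as the threshold is lowered."""
    thresholds = sorted(set(cases) | set(controls), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(x >= t for x in cases) / len(cases)
        fpr = sum(y >= t for y in controls) / len(controls)
        points.append((fpr, tpr))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Assumed setup: the measure is N(0, 1) in controls and N(1, 1) in cases
rng = random.Random(2020)
controls = [rng.gauss(0.0, 1.0) for _ in range(200)]
cases = [rng.gauss(1.0, 1.0) for _ in range(200)]

auc_pairs = pairwise_auc(cases, controls)
auc_curve = trapezoid_area(roc_points(cases, controls))
```

The two numbers agree up to floating-point error, which is one way to read the AUC: the probability that a randomly chosen member of one group scores higher than a randomly chosen member of the other.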

Analysing an open cohort stepped-wedge clustered trial with repeated individual binary outcomes

03.02.2020

I am currently wrestling with how to analyze data from a stepped-wedge designed cluster randomized trial. A few factors make this analysis particularly interesting. First, we want to allow for the possibility that between-period site-level correlation will decrease (or decay) over time. Second, there is possibly additional clustering at the patie...

8860 sym R (4977 sym/8 pcs) 6 img

Clustered randomized trials and the design effect

17.02.2020

I am always saying that simulation can help illuminate interesting statistical concepts or ideas. The design effect that underlies much of clustered analysis could benefit from a little exploration through simulation. I’ve written about clustered-related methods so much on this blog that I won’t provide links – just peruse the list of en...

11152 sym R (3366 sym/19 pcs) 6 img

Alternatives to reporting a p-value: the case of a contingency table

02.03.2020

I frequently find myself in discussions with collaborators about the merits of reporting p-values, particularly in the context of pilot studies or exploratory analysis. Over the past several years, the American Statistical Association has made several strong statements about the need to consider approaches that measure the strength of evidence or...

7828 sym R (7192 sym/8 pcs) 10 img

When you want more than a chi-squared test, consider a measure of association for contingency tables

16.03.2020

In my last post, I made the point that p-values should not necessarily be considered sufficient evidence (or evidence at all) in drawing conclusions about associations we are interested in exploring. When it comes to contingency tables that represent the outcomes for two categorical variables, it isn’t so obvious what measure of association sho...

8770 sym R (1620 sym/7 pcs) 16 img

Can unbalanced randomization improve power?

30.03.2020

Of course, we’re all thinking about one thing these days, so it seems particularly inconsequential to be writing about anything that doesn’t contribute to solving or addressing in some meaningful way this pandemic crisis. But, I find that working provides a balm from reading and hearing all day about the events swirling around us, both here a...

5319 sym R (2424 sym/7 pcs) 2 img