Publications by Bob Carpenter
StanCon Helsinki streaming live now (and tomorrow)
We’re streaming live right now! Thursday 08:45-17:30: YouTube Link Friday 09:00-17:00: YouTube Link Timezone is Eastern European Summer Time (EEST) +0300 UTC Here’s a link to the full program [link fixed]. There have already been some great talks and they’ll all be posted with slides and runnable source code after the conference on the ...
939 sym
NYC Meetup Thursday: Under the hood: Stan’s library, language, and algorithms
I (Bob, not Andrew!) will be doing a meetup talk this coming Thursday in New York City. Here’s the link with registration and location and time details (summary: pizza unboxing at 6:30 pm in SoHo): Bayesian Data Analysis Meetup: Under the hood: Stan’s library, language, and algorithms After summarizing what Stan does, this talk will focus ...
2105 sym
Markov chain Monte Carlo doesn’t “explore the posterior”
First some background, then the bad news, and finally the good news. Spoiler alert: The bad news is that exploring the posterior is intractable; the good news is that we don’t need to explore all of it. Sampling to characterize the posterior There’s a misconception among Markov chain Monte Carlo (MCMC) practitioners that the purpose of sa...
3140 sym 14 img
Seeking postdoc (or contractor) for next generation Stan language research and development
The Stan group at Columbia is looking to hire a postdoc* to work on the next generation compiler for the Stan open-source probabilistic programming language. Ideally, a candidate will bring language development experience and also have research interests in a related field such as programming languages, applied statistics, numerical analysis, or...
2673 sym
Non-randomly missing data is hard, or why weights won’t solve your survey problems and you need to think generatively
Throw this onto the big pile of stats problems that are a lot more subtle than they seem at first glance. This all started when Lauren pointed me at the post Another way to see why mixed models in survey data are hard on Thomas Lumley’s blog. Part of the problem is all the jargon in survey sampling—I couldn’t understand Lumley’s languag...
5484 sym R (1692 sym/5 pcs) 68 img
Econometrics postdoc and computational statistics postdoc openings here in the Stan group at Columbia
Andrew and I are looking to hire two postdocs to join the Stan group at Columbia starting January 2020. I want to emphasize that these are postdoc positions, not programmer positions. So while each position has a practical focus, our broader goal is to carry out high-impact, practical research that pushes the frontier of what’s possible in B...
5690 sym
Beautiful paper on HMMs and derivatives
I’ve been talking to Michael Betancourt and Charles Margossian about implementing analytic derivatives for HMMs in Stan to reduce memory overhead and increase speed. For now, one has to implement the forward algorithm in the Stan program and let Stan autodiff through it. I worked out the adjoint method (aka reverse-mode autodiff) derivatives ...
3973 sym 18 img
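For readers who haven't seen the forward algorithm mentioned in the excerpt above, here is a minimal sketch in Python (not Stan, and not the code from the post). It computes the log marginal likelihood of an HMM by summing out the hidden states on the log scale. The function names `log_sum_exp` and `hmm_log_marginal` and the argument layout are assumptions made for illustration; an autodiff system differentiating through these loops has to store every intermediate value, which is the memory overhead that analytic derivatives would avoid.

```python
import math


def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def hmm_log_marginal(log_init, log_trans, log_emit):
    """Forward algorithm on the log scale (illustrative layout, assumed here).

    log_init[k]     = log p(z_1 = k)
    log_trans[j][k] = log p(z_t = k | z_{t-1} = j)
    log_emit[t][k]  = log p(y_t | z_t = k)
    Returns log p(y_1, ..., y_T) with the hidden states z summed out.
    """
    K = len(log_init)
    # alpha[k] = log p(y_1, ..., y_t, z_t = k), initialized at t = 1
    alpha = [log_init[k] + log_emit[0][k] for k in range(K)]
    for t in range(1, len(log_emit)):
        alpha = [
            log_sum_exp([alpha[j] + log_trans[j][k] for j in range(K)])
            + log_emit[t][k]
            for k in range(K)
        ]
    return log_sum_exp(alpha)


# Usage with made-up numbers: two hidden states, three observations.
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)],
             [math.log(0.2), math.log(0.8)]]
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.5), math.log(0.5)],
            [math.log(0.2), math.log(0.6)]]
print(hmm_log_marginal(log_init, log_trans, log_emit))
```

The cost is O(T K^2) for T observations and K states, compared with K^T terms if you enumerated every hidden state sequence directly.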
Naming conventions for variables, functions, etc.
The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future. Three principles of naming follow: 1. Names should mean something. 2. Names should be as short as possible. 3. Use your judgement to balance (1) and (2). The third one’s where all the fun arises. ...
1422 sym
Make Andrew happy with one simple ggplot trick
By default, ggplot expands the space above and below the x-axis (and to the left and right of the y-axis). Andrew has made it pretty clear that he thinks the x axis should be drawn at y = 0. To remove the extra space around the axes when you have continuous (not discrete or log scale) axes, add the following to a ggplot plot, plot <- plot + ...
918 sym R (103 sym/1 pcs)