Publications by R on kieranhealy.org
Dataviz Interview
I had a very nice chat recently about data visualization with Brian Fannin, a research actuary with the CAS. We covered a variety of topics, from R and ggplot in particular to how to think about data visualization in general and what the dataviz community is learning from COVID. You can watch it here: ...
707 sym
US Excess Mortality
The CDC recently released some new data on mortality counts by state and cause of death in the U.S., allowing us to get a look at excess mortality patterns due to the COVID-19 pandemic. I’ve folded the data into the covdata package. As an illustration of the sort of thing you can do with it—and of the sort of thing you can do with ggplot and ...
4917 sym 6 img
National Weekly Death Rates
Following up on yesterday’s post on within-state variation in deaths in the United States, here’s a quick look at all-cause mortality rates across twenty countries, courtesy of the excellent work of the demographers who maintain the Human Mortality Database. The panels show death rates across twenty countries. Within each panel you can compar...
1376 sym R (2368 sym/2 pcs) 2 img 1 tbl
Walk the Walk
The other day I was looking to make a bunch of graphs showing some recent data from the CDC about excess mortality due to COVID-19. The idea was to take weekly counts of deaths over the past few years, both overall and from various important causes, and then show how the weekly counts from this year compare so far. The United States has a very la...
10516 sym R (4678 sym/12 pcs) 4 img 6 tbl
Excess Deaths by Cause
As I was saying the other day, calculating excess deaths can be a tricky business, especially if your focus is on understanding counterfactuals like how many people died of some cause who would not have died due to some other competing risk over the period of interest. Moreover, even setting the counterfactuals aside, the whole business of accura...
5784 sym R (5907 sym/12 pcs) 2 img 6 tbl
Excess Deaths by Jurisdiction
Although yesterday’s graph of excess deaths by cause was for the whole of the United States only, the table we made did the same calculations on the whole CDC dataset, so the resulting df_excess table has numbers for all U.S. states and several other jurisdictions, such as New York City. ...
2688 sym R (1261 sym/2 pcs) 4 img 1 tbl
Cross National Death Rates
Data from the Short-term Mortality Fluctuations dataset compiled by the Human Mortality Database continues to be very interesting. When thinking about how to interpret the 2020 data in a cross-national and longitudinal context, it’s clear that there are several things to bear in mind. For example, and at a bare minimum: Countries vary widely in...
5130 sym 2 img
Income and Happiness
People have been talking about this PNAS paper by Matthew Killingsworth: “Experienced well-being rises with income, even above $75,000 per year”. Here’s the abstract: Past research has found that experienced well-being does not increase above incomes of $75,000/y. This finding has been the focus of substantial attention from researchers an...
6114 sym R (2302 sym/6 pcs) 6 img 3 tbl
Excess Deaths February Update
The CDC continues to update its counts of deaths by cause for 2020 as data comes in from the jurisdictions that report to it. The data are by now fairly complete, though there are still significant gaps in several states due to delayed reporting. North Carolina, in particular, has yet to report almost any deaths for the entire final quarter of 20...
4508 sym 110 img
Map, Walk, Pivot
Recently I came across a question where someone was looking to take a bunch of CSV files, each of which contained numerical columns, and (a) get them into R, (b) calculate the mean and standard deviation of every column in every CSV file, and (c) calculate some overall summary like the mean of all the means and the mean of all the standard deviat...
8259 sym R (9277 sym/28 pcs) 2 img 14 tbl
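
The task described in the "Map, Walk, Pivot" excerpt (read several CSV files, compute the mean and standard deviation of every column in each, then summarize across files) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the approach of the post itself, which works in R. The function names `column_stats` and `summarize_folder`, the file names, and the toy data are all hypothetical, and the sketch assumes every column in every file is numeric.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def column_stats(csv_path):
    """Return {column: (mean, sd)} for every column of one CSV file.

    Assumes a header row and that every cell parses as a number.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    stats = {}
    for col in rows[0]:
        values = [float(row[col]) for row in rows]
        # statistics.stdev is the sample standard deviation (n - 1 denominator)
        stats[col] = (statistics.mean(values), statistics.stdev(values))
    return stats


def summarize_folder(folder):
    """Compute per-file column statistics, then the mean of all the means
    and the mean of all the standard deviations across files and columns."""
    per_file = {
        path.name: column_stats(path)
        for path in sorted(Path(folder).glob("*.csv"))
    }
    all_means = [m for stats in per_file.values() for m, _ in stats.values()]
    all_sds = [sd for stats in per_file.values() for _, sd in stats.values()]
    overall = {
        "mean_of_means": statistics.mean(all_means),
        "mean_of_sds": statistics.mean(all_sds),
    }
    return per_file, overall


# Toy data (hypothetical): two small CSV files written to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.csv").write_text("x,y\n1,10\n2,20\n3,30\n")
    Path(folder, "b.csv").write_text("x,y\n2,0\n4,0\n6,0\n")
    per_file, overall = summarize_folder(folder)

for name, stats in per_file.items():
    print(name, stats)
print(overall)
```

Keeping the per-file results in a dictionary keyed by file name means the overall summary is a second pass over the first-pass results, so the CSV files are only read once.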