Publications by R on kieranhealy.org
Burn Notice
Your Phone and Watch have a lot of data about you. I mean, like, a lot. Someone should really write a book all about the general issues for society that this raises. Yesterday I decided I wanted to take a look specifically at the health data on my iPhone. I’m not a huge user of the iPhone’s or the Apple Watch’s health features. I don’t use ...
11846 sym R (12997 sym/24 pcs) 4 img 12 tbl
Kerning and Kerning in a Widening Gyre
This post summarizes an extended period of deep annoyance. I have tried to solve the problem it describes more than once before and not quite done it. This has, in fact, happened again. I have still not satisfactorily solved the problem. But this time I know why I can’t solve it in a civilized manner. My goal is simple, and reasonable. I want to...
11884 sym R (6762 sym/16 pcs) 28 img 8 tbl
Halloween Data Cleaning
This week in Modern Plain Text Computing we put together some of the things we’ve been learning about cleaning and tidying data. Here’s a somewhat sobering example using data from the Fatality Analysis Reporting System, which is how the NHTSA tracks information about road accidents in the United States. Our data file shows counts of pedestrians ...
3044 sym R (10485 sym/10 pcs) 4 img 5 tbl
Dr Drang and the Electoral College
The other week, the Internet’s most beloved creepy snowman wrote a blog post where he showed how to use a little Python to group states by their number of electoral college votes to make a table like this:
Electors  States                       PopPct  ECPct
3         AK, DE, DC, ND, SD, VT, WY   1.61%   3.90%
4         HI, ID, ME, MT, NH, RI, WV   3.04%   5.20%
5         NE, NM                       1.22%   1.86%
6         AR,...
8187 sym R (4633 sym/8 pcs) 5 tbl
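The grouping described in the Dr Drang teaser above can be sketched in a few lines of Python. This is a minimal illustration, not the code from either post: the `ELECTORS` dictionary is hand-typed and covers only the states visible in the excerpt, the function names are hypothetical, and the population column is left out because the excerpt gives no population data. The electoral college share is simply each group's votes divided by the 538 total.

```python
from collections import defaultdict

# Hypothetical input: elector counts for only the states visible in the excerpt.
ELECTORS = {
    "AK": 3, "DE": 3, "DC": 3, "ND": 3, "SD": 3, "VT": 3, "WY": 3,
    "HI": 4, "ID": 4, "ME": 4, "MT": 4, "NH": 4, "RI": 4, "WV": 4,
    "NE": 5, "NM": 5,
}
TOTAL_EC_VOTES = 538  # total electoral college votes


def group_by_electors(electors):
    """Map each elector count to the list of states holding that count."""
    groups = defaultdict(list)
    for state, n in electors.items():
        groups[n].append(state)
    # Sort by elector count so the table prints in ascending order.
    return dict(sorted(groups.items()))


def ec_share(n_electors, states):
    """Percentage of all electoral votes held by one group of states."""
    return 100 * n_electors * len(states) / TOTAL_EC_VOTES


groups = group_by_electors(ELECTORS)
for n, states in groups.items():
    print(f"{n:<9} {', '.join(states):<28} {ec_share(n, states):.2f}%")
```

For the three complete rows in the excerpt, the computed shares agree with the ECPct column shown there.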
Race and Ethnicity in New York City
I’m about to start work on a second edition of my Data Visualization book. As a result I continue to mess around with stuff I’m considering including in a new edition. The other day I pulled some block-level Census data and drew a map of the distribution of people of color in New York City, which is to say the share of the population that repor...
2572 sym 8 img
New York City’s POC Population
I was messing around with some Census data this morning. I had two main thoughts. One was to show the utility of old-fashioned grayscale when it comes to mapping data (or displaying it in general). The goal of most carefully thought-through dataviz color palettes is to make them legible to viewers, which mostly means making them as linear as possib...
3252 sym 2 img
gssr is now two packages: gssr and gssrdoc
Summary: My gssr package is now two packages: gssr and gssrdoc. They’re also available as binary packages via R-Universe, which means they will install much faster. The GSS is a big survey with a big codebook. Distributing it as an R package poses a few challenges. It’s too big for CRAN, of course, but that’s fine because CRAN is not a reposit...
2694 sym Python (311 sym/2 pcs) 6 img 1 tbl
Daily Average Sea Surface Temperature Animation
Yesterday evening I gave a talk about data visualization to Periodic Tables, a Science Cafe run by Misha Angrist. It was a lot of fun! Amongst other things, I made an animation of the NOAA Daily Sea Surface Temperature Graph from the other week. Here it is: [animation] Here’s the static graph. [Figure: Global mean sea surface temperature 1981-2024] And because the ...
1334 sym 4 img
Make Your Own NOAA Sea Temperature Graph
Sea-surface temperatures in the North Atlantic have been in the news recently as they continue to break records. While there are already a number of excellent summaries and graphs of the data, I thought I’d have a go at making some myself. The starting point is the detailed data made available by the National Centers for Environmental Information...
7326 sym Python (22013 sym/30 pcs) 6 img 15 tbl
gssr Update
NORC released version 2a of the 1972-2022 General Social Survey cumulative file. I’ve updated {gssr}, an R package that makes it more convenient for R users to work with GSS Data. One handy feature of {gssr} is that it lets you see documentation for individual GSS variables as R help pages. Details on every GSS variable are available in the R he...
2276 sym R (10867 sym/16 pcs) 2 img 8 tbl