Publications by nsaunders

Searching for duplicate resource names in PMC article titles

16.09.2015

I enjoyed this article by Keith Bradnam, and the associated tweets, on the problem of duplicated names for bioinformatics software. I figured that, to some degree at least, we should be able to search for such instances, since the titles of published articles that describe software often follow a particular pattern. There may even be a grammatical...

4078 sym R (1514 sym/7 pcs) 4 img

R and the Nobel Prize API

20.10.2015

The Nobel Prizes. Love them? Hate them? Are they still relevant, meaningful? Go on, admit it, you always imagined you would win one day. Whatever you think of them, the 2015 results are in. What’s more, the good people of the Nobel Foundation offer us free access to data via an API. I’ve published a document over at RPubs, showing some of the ...

1452 sym R (93 sym/1 pcs) 16 img

Novelty: an update

21.10.2015

A recent tweet: @neilfws I enjoyed this: https://t.co/ynyHRbgpLN Have you published (or are you thinking about publishing) this analysis anywhere? — Marcus Munafo (@MarcusMunafo) October 7, 2015 [Figure: PubMed articles containing “novel” in title or abstract, 1845 – 2014] made me think (1) has it really been 5 years, (2) gee, my ggplot skills were...

1463 sym 6 img

Let’s (briefly) revisit the Nobel API

09.10.2016

It’s always nice when 12-month-old code runs without a hitch. Not sure why this did not become a GitHub repo the first time around, but now it is: my RMarkdown code to generate a report using data from the Nobel Prize API. Now you too can generate a “gee, it’s all old white men” chart as seen in The Economist – Greying of the Nobel laureate...

1111 sym 4 img

The y-axis: to zero or not to zero

20.11.2016

I don’t “do politics” at this blog, but I’m always happy to do charts. Here’s one that’s been doing the rounds on Twitter recently: A quick look at turnout data: It seems 2016 was nothing special for the Rep-candidate. It's the Dem-candidate that didn't get the vote out. pic.twitter.com/wby3gta26m — D Yanagizawa-Drott (@yanagiz) No...

4075 sym R (2601 sym/7 pcs) 14 img

Putting data on maps using R: easier than ever

23.11.2016

[Figure: New Zealand earthquake density 2010 – November 2016] Using R to add data to maps has been pretty straightforward for a few years now. That said, it seems easier than ever to do things like use map APIs (e.g. Google, OpenStreetMap), overlay quite complex data visualisations (e.g. “heatmap-style” densities) and even generate animations. A cou...

1346 sym 6 img

An Analysis of Contributions to PubMed Commons

01.12.2016

I recently saw a tweet floating by which included a link to some recent statistics from PubMed Commons, the NCBI service for commenting on scientific articles in PubMed. Perhaps it was this post at their blog. So I thought now would be a good time to write some code to analyse PubMed Commons data. The tl;dr version: here’s the GitHub repository...

4920 sym 14 img

Evidence for a limit to effective peer review

18.12.2016

I missed it first time around but apparently, back in October, Nature published a somewhat-controversial article: Evidence for a limit to human lifespan. It came to my attention in a recent tweet: Just wow https://t.co/fupXIOAC43 pic.twitter.com/vsxT3VyTg6 — Nick Loman (@pathogenomenick) December 11, 2016 The source: a fact-check article from...

3156 sym 6 img

Taking steps (in XML)

01.02.2017

So the votes are in: Your established blog is mostly about your work. Your work changes. Do you continue at the current blog or start a new one? — Neil Saunders (@neilfws) January 23, 2017 I thank you, kind readers. So here’s the plan: (1) keep blogging here as frequently as possible (perhaps monthly), (2) on more general “how to do cool ...

3108 sym R (2395 sym/4 pcs) 6 img

The real meaning of spurious correlations

02.02.2017

Like many data nerds, I’m a big fan of Tyler Vigen’s Spurious Correlations, a humorous illustration of the old adage “correlation does not equal causation”. Technically, I suppose it should be called “spurious interpretations” since the correlations themselves are quite real, but then good marketing is everything. There is, however, ...

2330 sym R (1166 sym/4 pcs) 12 img