Publications by HighlandR
Moving back home
Apologies for this very poor blog post, but this is the easiest way to inform most readers at the same time. In case you did not notice, my website has changed back from my name dot net, to my name dot com. Just change the domain over and links to previous blog posts etc. should still work. I am no longer the owner of the old dot net domain and n...
784 sym
Population pyramid plots with base R
You will find the app demonstrating these pyramid plots here. A brief timeline of this plot: It started off as a revised ggplot function for an internal charting package, with source data saved as parquet files. Then, because I wanted to try running shiny in the browser, I switched to plotly. It took a while to get up and running with shinylive. Som...
3223 sym
new programming with data.table
The newest version of data.table has hit CRAN, and there are lots of great new features. Among them, a %notin% function, a new let function that can be used instead of := (I wasn’t too fussed about this originally but have tried it a few times today and I may well adopt it – although I do like that := really stands out in my code when assignin...
3266 sym R (3345 sym/10 pcs)
more .I in data.table
Following on from my last post, here is a bit more about the use of .I in data.table. Scenario : you want to obtain either the first, or last row, from a set of rows that belong to a particular group. For example, for a patient admitted to hospital, you may want to capture their first admission, or the entire time they were in a specific hospital (...
1990 sym R (714 sym/1 pcs)
.I in data.table
In this post I’m using a small extract from the SIMD2020 dataset to figure out what the special operator .I does. Files and code are on GitHub if you’re interested. # files and code : https://github.com/johnmackintosh/DT_dot_I library(data.table) DT <- fread("highdata.csv") lookup <- fread("https://raw.githubusercontent.com/johnmackintosh/ph_loo...
3237 sym R (4872 sym/18 pcs)
non-equi joins in data.table
I have been toying with some of the Advent of Code challenges (I am way behind though!). For day 5, I had to create a function, and I’m writing this up, because it’s an example of a non-equi join between two tables. In this particular situation, there are no common columns between the two tables, so my usual data.table hack of copying the c...
2121 sym R (808 sym/4 pcs)
Achieve your target
Last week I had to talk my colleagues through the architecture of an R project that we’ve been working on for a while. This is a large project, as we make our first moves into Reproducible Analytical Pipelines, and makes heavy use of the {targets} package. As I was going through it, I realised that it was way too complex, and it wasn’t reasonabl...
3426 sym R (131 sym/1 pcs)
Pivoting in tidyr and data.table
We all need to pivot data at some point, so these are just some notes for my own benefit really, because gather and spread are no longer in favour within tidyr. NB – this post has been updated with collapsible sections to show/hide the data and outputs. I tended to only ever need gather, and nearly always relied on the same key and value names, s...
1273 sym R (2297 sym/3 pcs)
Pivoting in tidyr and data.table
We all need to pivot data at some point, so these are just some notes for my own benefit really, because gather and spread are no longer in favour within tidyr. I tended to only ever need gather, and nearly always relied on the same key and value names, so it was an easy function for me to use. I have discovered that pivot_longer and pivot_wider ar...
4673 sym R (7668 sym/24 pcs)
Making headlines
In my current mammoth work project, I’m generating many plots. The titles are very descriptive (they tell you what the plot is about), but they are not really telling a story. That’s simply because there are so many on the production line. What we’d like is to analyse the data and extract the salient points. Better still, we’d want this...
4483 sym R (3143 sym/6 pcs) 2 img