Publications by John MacKintosh

The life changing magic of tidying text files

21.12.2024

Our team have been doing some work with the Scotland Census 2022 data. There are several ways to download the information – you can click around on maps or use a table builder to focus on specifics, or there is a large zip download that provides all the data in CSV format. You end up with 71 files, with around 46K rows and a variable number of co...

6017 sym R (586 sym/3 pcs)

Resolving errors connecting PowerBI to Linux version of MS SQL Server

14.03.2024

Situation: Cannot connect to SQL Server, running on Linux via WSL, from PowerBI, on MS Windows 10. I’m receiving an error message similar to the one below: DataSource.Error: Microsoft SQL: The target principal name is incorrect. Cannot generate SSPI context. Details: DataSourceKind=SQL Message=The target principal name is incorrect. Cann...

1293 sym R (278 sym/1 pcs)

new programming with data.table

04.02.2024

The newest version of data.table has hit CRAN, and there are lots of great new features. Among them, a %notin% function, a new let function that can be used instead of := (I wasn’t too fussed about this originally but have tried it a few times today and I may well adopt it – although I do like that := really stands out in my code when assignin...

3272 sym R (3345 sym/10 pcs)

more .I in data.table

02.02.2024

Following on from my last post, here is a bit more about the use of .I in data.table. Scenario: you want to obtain either the first or last row from a set of rows that belong to a particular group. For example, for a patient admitted to hospital, you may want to capture their first admission, or the entire time they were in a specific hospital (...

1993 sym R (714 sym/1 pcs)

.I in data.table

02.01.2024

In this post I’m using a small extract from the SIMD2020 dataset to figure out what the special operator .I does. Files and code are on GitHub if you’re interested. # files and code : https://github.com/johnmackintosh/DT_dot_I library(data.table) DT <- fread("highdata.csv") lookup <- fread("https://raw.githubusercontent.com/johnmackintosh/ph_loo...

3240 sym R (4872 sym/18 pcs)

non-equi joins in data.table

21.12.2023

I have been toying with some of the Advent of Code challenges (I am way behind though!). For day 5, I had to create a function, and I’m writing this up, because it’s an example of a non-equi join between two tables. In this particular situation, there are no common columns between the two tables, so my usual data.table hack of copying the c...

2124 sym R (808 sym/4 pcs)

Ph Profiles

05.12.2023

That’s a (W)RAP! An ambition realised as a suite of R powered publications enters the public domain. The third set in our series of public health profiles was published recently. These comprise a suite of 13...

3782 sym

checks and {tiny}testing – a quick primer

06.10.2023

This material was presented to a meeting of KIND (Knowledge and Information Network) in April this year. Checks: what assumptions are you making about your data (structure, names, types etc.), function arguments, and what users will and won’t do? Tests: describe what you expect your functions to do, and how they should behave with regard to user input...

3832 sym Python (5774 sym/19 pcs) 2 img

Pivoting in tidyr and data.table

17.02.2023

We all need to pivot data at some point, so these are just some notes for my own benefit really, because gather and spread are no longer in favour within tidyr. I tended to only ever need gather, and nearly always relied on the same key and value names, so it was an easy function for me to use. I have discovered that pivot_longer and pivot_wider ar...

4676 sym R (7668 sym/24 pcs)

Fixing my broken VSCode setup for R

03.02.2023

I recently updated my R installation, and then realised that I’d broken my VSCode/R setup in the process – I could not launch an R terminal either directly or via radian. I have a repo where I’ve collated various blog posts relating to setting up VSCode for R, but that didn’t solve all my problems. I did get it resolved eventually, and ...

2602 sym R (3305 sym/2 pcs)