Publications by R on Alan Yeung
Trying out timeplyr
The timeplyr R package, created by my colleague Nick, was accepted on CRAN in October 2023. Quoting the CRAN page directly, it provides a set of fast tidy functions for wrangling, completing and summarising date and date-time data. It looks like a really neat package for working with time series data in a way consistent with what people h...
2686 sym R (4125 sym/5 pcs) 6 img
Grouped Sequences in dplyr Part 2
I just wrote a post about grouped sequences in dplyr and following that, I’ve been made aware of another couple of solutions to this problem (credit John Mackintosh). The solution involves using the consecutive_id() function, available in dplyr since v1.1.0. In the help page for this function, it’s mentioned that it was inspired by rleid() func...
1324 sym R (5199 sym/3 pcs)
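As a rough illustration of the consecutive_id() idea mentioned above, here is a minimal sketch. The matches data and the venue, run_id and match_no names are made up for this example, and num_matches_away is borrowed from the original grouped sequences post; this is not necessarily how the post itself solves the problem.

```r
library(dplyr)  # consecutive_id() needs dplyr >= 1.1.0

# Hypothetical dummy data: one team's fixtures in date order
matches <- tibble(
  match_no = 1:8,
  venue    = c("home", "away", "away", "home", "away", "away", "away", "home")
)

result <- matches %>%
  # consecutive_id() starts a new id every time venue changes
  mutate(run_id = consecutive_id(venue)) %>%
  group_by(run_id) %>%
  # count up within each run, but only for runs of away matches
  mutate(num_matches_away = if_else(venue == "away", row_number(), 0L)) %>%
  ungroup()

result$num_matches_away
```

Grouping by the run id rather than by venue is what makes the counter reset after each home match.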
Grouped Sequences in dplyr
For a piece of work I had to calculate the number of matches that a team plays away from home in a row, which we will call days_on_the_road. I was not sure how to do this with dplyr but it’s basically a ‘grouped sequence’. For this post, I’ve created some dummy data to illustrate this idea. The num_matches_away variable is what we want to m...
1429 sym R (2933 sym/3 pcs)
A couple of case_when() tricks
Combining case_when() and across(): If you want to use case_when() with across() on different variables, here is an example that does this with the help of the get() and cur_column() functions. library(tidyverse) iris_df <- as_tibble(iris) %>% mutate(flag_Petal.Length = as.integer(Petal.Length > 1.5), flag_Petal.Width = as.integer(Pe...
999 sym R (1649 sym/2 pcs)
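The excerpt above is cut off, so the following is only a sketch of how get() and cur_column() can be combined inside across(), not the post's actual code. The 0.2 threshold for Petal.Width and the rule of blanking out unflagged values are assumptions made for illustration.

```r
library(dplyr)

iris_df <- as_tibble(iris) %>%
  mutate(
    flag_Petal.Length = as.integer(Petal.Length > 1.5),
    flag_Petal.Width  = as.integer(Petal.Width > 0.2)  # assumed threshold
  )

result <- iris_df %>%
  mutate(across(
    c(Petal.Length, Petal.Width),
    # cur_column() gives the name of the column being processed, so
    # get() can look up the matching flag_* column for each one
    ~ case_when(
      get(paste0("flag_", cur_column())) == 1L ~ .x,
      TRUE ~ NA_real_
    )
  ))

head(result)
```

The same case_when() rule is written once and applied to each column with its own flag variable.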
Summarising Dates with Missing Values
This blog post is just a note that when you try to do a grouped summary of a date variable but some groups have all missing values, it will return Inf. This means that the summary will not show up as an NA and this can cause issues in analysis if you are not careful. library(tidyverse) df <- tibble::tribble( ~id, ~dt, 1L, "01/01/2...
998 sym R (944 sym/3 pcs)
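A small sketch of the behaviour described above, using made-up data and a hypothetical safe_min() helper (neither is taken from the original post). It assumes the summary is min() with na.rm = TRUE, which is one common way to hit this.

```r
library(dplyr)

# Hypothetical data: every date for id 2 is missing
df <- tibble(
  id = c(1L, 1L, 2L, 2L),
  dt = as.Date(c("2020-01-01", "2020-02-01", NA, NA))
)

# min() with na.rm = TRUE has nothing left to summarise for id 2,
# so it warns and returns Inf stored as a Date
naive <- df %>%
  group_by(id) %>%
  summarise(first_dt = suppressWarnings(min(dt, na.rm = TRUE)))

# Hypothetical helper: return a proper NA date when all values are missing
safe_min <- function(x) {
  if (all(is.na(x))) as.Date(NA) else min(x, na.rm = TRUE)
}

safe <- df %>%
  group_by(id) %>%
  summarise(first_dt = safe_min(dt))

is.na(naive$first_dt)  # the all-missing group is not flagged as NA
is.na(safe$first_dt)   # the helper restores the NA
```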
Glasgow R User Group
I am very excited to hear that there are attempts to create a brand new R user group in Glasgow! I had just talked in Post Number One about my guilt at not having been able to attend EdinbR as often as I wished but it should be much easier for me to find time to attend a group based in Glasgow. If you are based in (or near) Glasgow and would like...
1160 sym 2 img
Getting Open Data into R from CKAN
Contents: Preamble; Open Data in Scotland; Querying CKAN; Querying with Custom JSON; Querying with SQL; Conclusions and Further Ideas
Preamble
I’ve got lots of rough pieces of R code written as I’ve been exploring/testing various things in the past. A lot of this is currently stored in a pretty disorganised fashion so I thought it would be a good idea to ...
8046 sym R (3356 sym/4 pcs)
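For orientation, here is a hedged sketch of what the two kinds of CKAN query named in the section list (a plain datastore search and an SQL query) might look like as URLs. The base URL and resource id are assumptions for illustration, not taken from the post; datastore_search and datastore_search_sql are standard CKAN API actions. Nothing is requested from the network here.

```r
# Assumed base URL for a CKAN instance (here, the Scottish NHS open data portal)
base_url <- "https://www.opendata.nhs.scot/api/3/action/"

# Hypothetical resource id: replace with a real one from the portal
resource_id <- "00000000-0000-0000-0000-000000000000"

# Simple query: first 5 records of the resource
search_url <- paste0(
  base_url, "datastore_search?resource_id=", resource_id, "&limit=5"
)

# SQL query: the resource id acts as the table name and must be double-quoted
sql <- sprintf('SELECT * FROM "%s" LIMIT 5', resource_id)
sql_url <- paste0(
  base_url, "datastore_search_sql?sql=",
  utils::URLencode(sql, reserved = TRUE)
)

# With a real resource id, something like this should fetch the records:
# res <- jsonlite::fromJSON(search_url)
# head(res$result$records)
```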
Hacking dbplyr for CKAN
Contents: Aim; Create a dummy database; Test dbplyr’s SQL translation; Modify dbplyr’s SQL translation; Testing the dbplyr hack; Concluding notes
At the end of my first post on CKAN discussing how to use the CKAN API to extract data from the NHS open data platform directly into R, I talked about how it would be neat to write some wrapper functions to make ...
5758 sym R (2217 sym/6 pcs)