Publications by kjytay

A small gotcha when comparing lists using testthat

11.05.2020

I recently encountered a small gotcha when using the testthat package to test the equality of two lists. Imagine I have the following two lists:

    list1 <- list(a = 1, b = 2, c = 3, d = 4)
    list2 <- list(a = 0, b = 2, c = 3, d = 5)

Imagine that: I know that the value associated with a is going to be different, so I don’t want to test equ...

1563 sym R (698 sym/4 pcs)

glmnet v4.0: generalizing the family parameter

14.05.2020

I’ve had the privilege of working with Trevor Hastie on an extension of the glmnet package which has just been released. In essence, the glmnet() function’s family parameter can now be any object of class family. This enables the user to fit any generalized linear model with the elastic net penalty. glmnet v4.0 is now available on CRAN here. ...

4948 sym R (190 sym/2 pcs) 66 img

What is isotonic regression?

24.05.2020

Isotonic regression is a method for obtaining a monotonic fit for 1-dimensional data. Let’s say we have data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^2$ such that $x_1 < x_2 < \dots < x_n$. (We assume no ties among the $x_i$’s for simplicity.) Informally, isotonic regression looks for $\beta_1, \dots, \beta_n \in \mathbb{R}$ such that the $\beta_i$’s approximate the $y_i$’s well while being monotonically non-decreasing. Formally, the $\beta_i$’s are the solution to the ...
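
As a rough illustration of the idea in this excerpt (not the post's own code, which is not shown here), the sketch below fits an isotonic regression to made-up data with base R's `stats::isoreg()`. The toy data, the seed, and the variable names are assumptions chosen for the example.

```r
# Toy data (assumed for illustration): an increasing trend plus noise.
# x is strictly increasing, so there are no ties among the x values.
set.seed(1)
x <- 1:20
y <- x / 4 + rnorm(20)

# isoreg() computes the least-squares fit that is monotonically non-decreasing
fit <- isoreg(x, y)
fitted_vals <- fit$yf  # fitted values, in the order of the sorted x

# Check that the fitted values never decrease from one point to the next
all(diff(fitted_vals) >= 0)
```

The fitted values form a step function: wherever the raw `y` values would violate monotonicity, neighbouring points are pooled and replaced by their average.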

2541 sym 34 img

What is nearly-isotonic regression?

26.05.2020

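Let’s say we have data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^2$ such that $x_1 < x_2 < \dots < x_n$. (We assume no ties among the $x_i$’s for simplicity.) Isotonic regression gives us a monotonic fit $\beta_1, \dots, \beta_n$ for the $y_i$’s by solving the problem

$$\text{minimize}_{\beta_1, \dots, \beta_n} \quad \sum_{i=1}^n (y_i - \beta_i)^2 \quad \text{subject to} \quad \beta_1 \leq \beta_2 \leq \dots \leq \beta_n.$$

(See this previous post for more details.) Nearly-isotonic regression, introduced by Tibshirani et al. (2009) (Reference 1), generalizes isotonic regression by solving the prob...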

3127 sym R (454 sym/1 pcs) 57 img

tidyr::complete to show all possible combinations of variables

22.07.2020

This is an issue I often face, so I thought it best to write it down. When doing data analysis, we often want to know how many observations there are in each subgroup. These subgroups can be defined by multiple variables. In the code example below, I want to know how many vehicles there are for each (cyl, gear) combination:

    library(tidyverse) d...
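
The excerpt is cut off before its code, so here is a minimal sketch of the technique named in the title rather than the post's own example. It assumes the built-in `mtcars` data; the names `counts` and `all_counts` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Count vehicles for each (cyl, gear) pair. Pairs that never occur in the
# data (here: 8 cylinders with 4 gears) simply do not appear in the result.
counts <- mtcars %>% count(cyl, gear)

# complete() adds a row for every combination of the observed cyl and gear
# values; fill = list(n = 0L) gives the added rows a count of 0 instead of NA.
all_counts <- counts %>% complete(cyl, gear, fill = list(n = 0L))

all_counts
```

Without the `fill` argument, the added rows would carry `NA` in the `n` column.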

1525 sym R (1189 sym/3 pcs)

Image contours in R

05.08.2020

I recently came across this short fun post on R-bloggers that demonstrated how to use the image.ContourDetector package (available on CRAN) to extract contours from an image. The image of the contours looked really cool so I thought I would try it out for myself! As an example, I wanted to extract the contours from the image below: (Source: premi...

1823 sym R (525 sym/2 pcs) 10 img

Basic manipulation of GIF frames with magick

06.08.2020

The magick package is a really powerful package for image processing in R. The official vignette is a great place to start learning how to use the package. I’ve been playing around with using magick for manipulating GIFs and found some tips and tricks that don’t seem to be documented anywhere. Since the NBA restart is upon us, I’ll illustra...

1651 sym R (1386 sym/7 pcs) 12 img

NBA salaries

28.08.2020

I came across a dataset of NBA player salaries from the 1984-1985 season to the 2017-2018 season here, and I thought it would be a fun dataset to practice my tidyverse skills on. All the code for this post can be found here. First, let’s import the tidyverse package, set the plotting theme, and read in the data files. library(tidyverse) theme_s...

5417 sym R (7465 sym/15 pcs) 24 img

Using dplyr::filter when the condition is a string

03.09.2020

This took me a while to figure out, so I thought I would post this for future reference. Let’s say I have the mtcars data and I want to filter for just the rows with cyl == 6. I would do something like this:

    library(tidyverse)
    data(mtcars)
    mtcars %>% filter(cyl == 6)
    # mpg cyl disp hp drat wt qsec vs am gear carb
    # Mazd...
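
The excerpt ends before it reaches the string case, so the following is only one possible approach and may differ from the post's solution. It assumes the condition arrives as a character string (the variable `cond` is hypothetical) and uses `rlang::parse_expr()` with the `!!` operator.

```r
library(dplyr)
library(rlang)

# Hypothetical: the filter condition is stored as a string
cond <- "cyl == 6"

# parse_expr() turns the string into an R expression;
# !! injects that expression into filter() so it is evaluated on the data
six_cyl <- mtcars %>% filter(!!parse_expr(cond))

nrow(six_cyl)
```

Because the string is parsed and evaluated as code, this pattern should only be used with strings from a trusted source.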

1312 sym R (1859 sym/4 pcs)

Simulating paths from a random walk

09.09.2020

If you’ve ever visited this blog at wordpress.com, you might have noticed a header image that looks like this: Ever wonder how it was generated? The image depicts 100 simulations of an asymmetric random walk. In this post, I’ll go through the code used to generate this image. All the code can also be found here. For , consider a series of i....
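
The post's actual code is only linked from the excerpt, so the sketch below is a guess at the general shape of such a simulation, not a reproduction. Only the count of 100 paths comes from the excerpt; the step values (+1 or -1), the up-probability `p_up`, the path length `n_steps`, and the seed are all assumptions.

```r
set.seed(42)
n_paths <- 100   # number of simulated walks, as stated in the excerpt
n_steps <- 200   # assumed length of each walk
p_up    <- 0.55  # assumed probability of a +1 step (asymmetric since != 0.5)

# Each column holds the independent +1 / -1 steps of one walk
steps <- matrix(
  sample(c(1, -1), n_paths * n_steps, replace = TRUE,
         prob = c(p_up, 1 - p_up)),
  nrow = n_steps, ncol = n_paths
)

# Cumulative sums down each column give the positions;
# the extra first row makes every walk start at 0
paths <- rbind(0, apply(steps, 2, cumsum))

# Draw all walks on one plot, one line per path
matplot(0:n_steps, paths, type = "l", lty = 1, col = "grey40",
        xlab = "Step", ylab = "Position")
```

With `p_up` above 0.5 the walks drift upward on average, which is what makes the random walk asymmetric.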

3208 sym R (2116 sym/6 pcs) 50 img