Publications by kjytay

The hidden diagnostic plots for the lm object

14.11.2019

When plotting an lm object in R, one typically sees a 2 by 2 panel of diagnostic plots, much like the one below:

    set.seed(1)
    x <- matrix(rnorm(200), nrow = 20)
    y <- rowSums(x[,1:3]) + rnorm(20)
    lmfit <- lm(y ~ x)
    summary(lmfit)
    par(mfrow = c(2, 2))
    plot(lmfit)

This link has an excellent explanation of each of these 4 plots, and I highly r...

2368 sym R (203 sym/3 pcs) 10 img
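
A minimal sketch of how the extra plots can be requested, assuming the "hidden" plots in the title are the ones that `plot.lm` does not draw by default. In base R, `plot.lm` supports six plots, and its `which` argument defaults to `c(1, 2, 3, 5)`. The data setup is copied from the excerpt above; the 2 by 3 layout is only an illustrative choice.

```r
# Same simulated data and fit as in the excerpt.
set.seed(1)
x <- matrix(rnorm(200), nrow = 20)
y <- rowSums(x[, 1:3]) + rnorm(20)
lmfit <- lm(y ~ x)

# Ask for all six diagnostic plots instead of the default four.
# Plots 4 and 6 are the ones skipped by the default `which = c(1, 2, 3, 5)`.
par(mfrow = c(2, 3))
plot(lmfit, which = 1:6)
par(mfrow = c(1, 1))  # restore the single-panel layout
```

Passing a single number, for example `plot(lmfit, which = 4)`, draws just that plot.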

An unofficial vignette for the gamsel package

24.11.2019

I’ve been working on a project/package that closely mirrors that of GAMSEL (generalized additive model selection), a method for fitting sparse generalized additive models (GAMs). In preparing my package, I realized that (i) the gamsel package which implements GAMSEL doesn’t have a vignette, and (ii) I could modify the vignette for my package...

8346 sym R (2975 sym/20 pcs) 72 img

Non-negative least squares

27.11.2019

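Imagine that one has a data matrix $X \in \mathbb{R}^{n \times p}$ consisting of $n$ observations, each with $p$ features, as well as a response vector $y \in \mathbb{R}^n$. We want to build a model for $y$ using the feature columns in $X$. In ordinary least squares (OLS), one seeks a vector of coefficients $\hat{\beta} \in \mathbb{R}^p$ such that $\hat{\beta} = \operatorname{argmin}_{\beta} \|y - X\beta\|_2^2$. In non-negative least squares (NNLS), we seek a vector of coefficients $\hat{\beta}$ such that it minimize...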

1990 sym R (910 sym/4 pcs) 40 img
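
As a rough illustration of the NNLS problem described in the excerpt, here is a base R sketch that minimizes the residual sum of squares under a non-negativity constraint. It uses `optim` with the box-constrained `"L-BFGS-B"` method, which is a stand-in chosen to avoid extra packages and is not necessarily the approach taken in the post. The simulated data and the names `rss`, `rss_grad`, `beta_true`, `beta_nnls` and `beta_ols` are made up for this example.

```r
# Simulated data: some true coefficients are zero, all are non-negative.
set.seed(1)
n <- 50
p <- 4
X <- matrix(rnorm(n * p), nrow = n)
beta_true <- c(1, 0, 2, 0)
y <- as.numeric(X %*% beta_true + rnorm(n, sd = 0.5))

# Residual sum of squares and its gradient with respect to beta.
rss <- function(beta) sum((y - X %*% beta)^2)
rss_grad <- function(beta) as.numeric(-2 * crossprod(X, y - X %*% beta))

# NNLS: minimize the RSS subject to every coefficient being >= 0.
fit <- optim(par = rep(0, p), fn = rss, gr = rss_grad,
             method = "L-BFGS-B", lower = 0)
beta_nnls <- fit$par

# Unconstrained OLS fit (no intercept) for comparison.
beta_ols <- unname(coef(lm(y ~ X - 1)))

print(round(cbind(ols = beta_ols, nnls = beta_nnls), 3))
```

Because OLS minimizes the same objective without the constraint, the NNLS residual sum of squares can never be smaller than the OLS one.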

Generating correlation matrix for AR(1) model

07.02.2020

Assume that we are in the time series data setting, where we have data at equally-spaced times $1, 2, \dots$, which we denote by random variables $X_1, X_2, \dots$. The AR(1) model, commonly used in econometrics, assumes that the correlation between $X_i$ and $X_j$ is $\text{Cor}(X_i, X_j) = \rho^{|i-j|}$, where $\rho$ is some parameter that usually has to be estimated. If we were writing out the full correlation matrix for co...

1716 sym R (301 sym/2 pcs) 22 img
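
One way to build such a correlation matrix in base R is sketched below, assuming the usual AR(1) structure in which entry (i, j) equals rho raised to the power |i - j|. The helper name `ar1_cor` is hypothetical, and the post itself may construct the matrix differently.

```r
# AR(1) correlation matrix for n consecutive time points:
# entry (i, j) is rho^|i - j|.
ar1_cor <- function(n, rho) {
  idx <- seq_len(n)
  rho^abs(outer(idx, idx, "-"))
}

ar1_cor(3, 0.5)
#      [,1] [,2] [,3]
# [1,] 1.00  0.5 0.25
# [2,] 0.50  1.0 0.50
# [3,] 0.25  0.5 1.00

# The matrix is Toeplitz, so the same result comes from its first row.
all.equal(ar1_cor(3, 0.5), toeplitz(0.5^(0:2)))
```

The `outer` call computes every pairwise index difference at once, so no explicit loop over i and j is needed.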

relgam: Fitting reluctant generalized additive models

22.02.2020

I’m proud to announce that my latest research project, reluctant generalized additive modeling (RGAM), is complete (for now)! In this post, I give a brief overview of the method: what it is trying to do and how you can fit such a model in R. (This project is joint work with my advisor, Rob Tibshirani.) For an in-depth description of the method...

6479 sym 117 img

A deep dive into glmnet: type.gaussian

13.03.2020

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, we will look at the type.gaussian option. For reference, here is the full signature of the glmnet function (v3.0-2): glmnet(x, y, family = c("gauss...

4996 sym R (668 sym/1 pcs) 150 img

Extended floating point precision in R with Rmpfr

18.03.2020

I learnt from a recent post on John Cook’s excellent blog that it’s really easy to do extended floating point computations in R using the Rmpfr package. Rmpfr is R’s wrapper around the C library MPFR, which stands for “Multiple Precision Floating-point Reliable”. The main function that users will interact with is the mpfr function: it c...

2044 sym R (740 sym/5 pcs) 12 img

rOpenSci community calls

24.03.2020

This is a short PSA about an R resource that I recently learnt about (and participated in): rOpenSci community calls. According to the website, these community calls happen quarterly and are a place where the public can learn about “best practices, new projects, Q&As with well known developers, and… rOpenSci developments”. I heard about the...

2923 sym

A deep dive into glmnet: predict.glmnet

27.03.2020

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, instead of looking at one of the function options of glmnet, we’ll look at the predict method for a glmnet object. The object returned by...

7426 sym R (1978 sym/12 pcs) 40 img

What is a dgCMatrix object made of? (sparse matrix format in R)

30.03.2020

I’ve been working with sparse matrices in R recently (those created using Matrix::Matrix with the option sparse=TRUE) and found it difficult to track down documentation about what the slots in the matrix object are. This post describes the slots in an object of class dgCMatrix. (Click here for full documentation of the Matrix package (and it is a l...

3750 sym R (3128 sym/8 pcs)
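
To make the slots concrete, here is a small sketch that builds a 2 by 3 sparse matrix with `Matrix::Matrix(..., sparse = TRUE)` and inspects its main slots. The example matrix is made up for illustration. It is deliberately non-square so that `Matrix()` cannot return a symmetric or triangular class instead of dgCMatrix.

```r
library(Matrix)

# Dense layout (filled column by column):
#      [,1] [,2] [,3]
# [1,]    0    0    3
# [2,]    2    0    1
m <- Matrix(c(0, 2, 0, 0, 3, 1), nrow = 2, sparse = TRUE)
class(m)  # "dgCMatrix"

m@Dim  # dimensions: 2 3
m@x    # non-zero values, stored column by column: 2 3 1
m@i    # 0-based row index of each non-zero value: 1 0 1
m@p    # 0-based column pointers into x and i: 0 1 1 3
```

Here `diff(m@p)` gives the number of non-zero entries in each column (1, 0 and 2), which is how the column of each value in `m@x` can be recovered.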