Publications by Nick Horton

Example 8.10: Combination dotplot/boxplot (teaching graphic in honor of World Statistics Day)

19.10.2010

In honor of World Statistics Day and the read paper that my co-authors Chris Wild, Maxine Pfannkuch, Matt Regan, and I are presenting at the Royal Statistical Society today, we present the R code to generate a combination dotplot/boxplot that is useful for students first learning statistics. One of the overriding themes of the paper...

809 sym 12 img

Reader suggestions on alternative ways to create combination dotplot/boxplot

24.10.2010

Kudos to several of our readers, who suggested simpler ways to craft the graphical display (combination dotplot/boxplot) from our most recent example. Yihui Xie combines a boxplot with a coarsened version of the PCS scores (rounded using the round() function) that is passed to the stripchart() function. ds = read.csv("http://www.math.smith.edu/r/data/help.csv") smal...

1382 sym R (791 sym/2 pcs) 18 img
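The approach described in this excerpt (a boxplot overlaid with a stacked strip chart of rounded values) can be sketched roughly as follows. This is a minimal illustration and not the code from the original post: the simulated `pcs` vector stands in for the PCS scores in the HELP dataset, and the plotting positions (`at`, `offset`) are assumptions chosen only so that the dots sit above the box.

```r
# Simulated stand-in for the PCS scores (assumption: not the HELP data)
set.seed(1)
pcs <- rnorm(100, mean = 48, sd = 10)

# Coarsen the scores so that equal values stack on top of each other
pcs_coarse <- round(pcs)

# Horizontal boxplot of the original (uncoarsened) scores
boxplot(pcs, horizontal = TRUE, xlab = "PCS (simulated)")

# Overlay the coarsened scores as a stacked dotplot just above the box
stripchart(pcs_coarse, method = "stack", pch = 19, cex = 0.6,
           offset = 0.3, at = 1.2, add = TRUE)
```

Rounding trades a small amount of precision for a display in which ties are visible as stacks, which is what makes the dotplot readable alongside the boxplot.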

Example 8.11: violin plots

26.10.2010

We’ve continued to get useful feedback and ideas from our posts on the combination dotplot/boxplot and other ways to craft similar displays. Another notion is the violin plot, which combines a boxplot and a (doubled) kernel density plot. While the basic notion of the violin plot does not include the individual points, such a display has virtu...

1786 sym R (380 sym/2 pcs) 18 img

Example 8.14: generating standardized regression coefficients

15.11.2010

Standardized (or beta) coefficients from a linear regression model are the parameter estimates obtained when the predictors and outcomes have been standardized to have variance = 1. Alternatively, the regression model can be fit and then standardized post-hoc based on the appropriate standard deviations. The parameters are thus interpreted as c...

2214 sym R (1425 sym/6 pcs) 14 img
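The two routes mentioned in this excerpt (standardizing the variables before fitting, or rescaling the raw coefficients afterwards) can be compared in a short sketch. This is an illustration under stated assumptions rather than the code from the original post: the data are simulated, and the names `x1`, `x2`, `y`, and the helper `standardize()` are hypothetical.

```r
# Simulated data (assumption: variable names are illustrative only)
set.seed(42)
n  <- 200
x1 <- rnorm(n, mean = 10, sd = 3)
x2 <- rnorm(n, mean = 50, sd = 12)
y  <- 2 + 0.5 * x1 - 0.1 * x2 + rnorm(n)
d  <- data.frame(y, x1, x2)

# Hypothetical helper: center to mean 0 and scale to standard deviation 1
standardize <- function(v) (v - mean(v)) / sd(v)

# Approach 1: standardize outcome and predictors, then fit the model
d_std   <- as.data.frame(lapply(d, standardize))
fit_std <- lm(y ~ x1 + x2, data = d_std)
beta_direct <- unname(coef(fit_std)[-1])

# Approach 2: fit on the raw scale, then rescale each slope by sd(x) / sd(y)
fit_raw <- lm(y ~ x1 + x2, data = d)
b       <- coef(fit_raw)[-1]
beta_posthoc <- unname(b * sapply(d[c("x1", "x2")], sd) / sd(d$y))

# The two sets of standardized coefficients agree up to numerical tolerance
all.equal(beta_direct, beta_posthoc)
```

Each standardized slope is then read as the expected change in the outcome, in standard deviation units, for a one standard deviation change in that predictor.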

A plea for consistent style!

22.12.2010

As we get close to the end of the year, it’s time to look back over the past year and think of resolutions for 2011 and beyond. One that’s often on my mind relates to ways to structure my code to make it clearer to others (as well as to myself when I look back upon it months later). Style guides are common in many programming langu...

2540 sym Python (1607 sym/2 pcs) 14 img

Tools to tidy up R code

28.12.2010

Last week we made an impassioned plea for attention to style in formatting R and SAS code. While it’s always better to adopt a consistent style and use it whenever you write code, the reality is that sometimes formatting slips (or you end up repurposing code that others wrote). In those situations, the formatR package (due to Yihui X...

1211 sym R (1288 sym/3 pcs) 14 img

Example 8.19: Referencing lists of variables

03.01.2011

In section 1.11.4 (p. 50), we discuss referring to lists of variables in a data set. In SAS, this can be done for variables stored in adjacent columns with the “var_x -- var_y” syntax and for variables with sequentially enumerated suffixes with the “var_n1 - var_n2” syntax. We state in the above-referenced section that R h...

1661 sym R (569 sym/4 pcs) 14 img

Example 8.21: latent class analysis

18.01.2011

Latent class analysis is a technique used to classify observations based on patterns of categorical responses. Collins and Lanza’s book, “Latent Class and Latent Transition Analysis,” provides a readable introduction, while the UCLA ATS center has an online statistical computing seminar on the topic. We consider an example analysis from the ...

3958 sym R (4114 sym/9 pcs) 14 img

Example 8.22: latent class modeling using randomLCA

24.01.2011

In Example 8.21 we described how to fit a latent class model to data from the HELP dataset using SAS and R. Subjects were classified based on their observed (manifest) status on the following variables (on street or in shelter in past 180 days [homeless], CESD scores above 20, received substance abuse treatment [satreat], or linked t...

1472 sym R (1726 sym/4 pcs) 14 img

Example 8.23: Expanding latent class model results

31.01.2011

In Example 8.21 we described how to fit a latent class model to data from the HELP dataset using SAS and R (using poLCA()), and then followed up in Example 8.22 using randomLCA(). In both entries, we classified subjects based on their observed (manifest) status on the following variables (on street or in shelter in past 180 days [home...

2173 sym R (2660 sym/4 pcs) 14 img