Publications by Ken Kleinman
Citing R or SAS
One of us recently read a colleague’s first draft of a paper, in which she had written: “All analyses were done in R 2.14.0.” We assume we’re preaching to the converted here when we say that the enormous amount of work that goes into R needs to be recognized as often as possible, and that R’s creators deserve to reap some ...
3150 sym 14 img
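As a minimal sketch of how a formal citation for R can be obtained (this uses base R's `citation()` and is not necessarily what the full post recommends), something like the following may help:

```r
# Retrieve the recommended citation for R itself
r_cite <- citation()
print(r_cite)

# The same object can be rendered as a BibTeX entry
toBibtex(r_cite)

# The exact version string is handy for a methods section
R.version.string

# Add-on packages can be cited the same way
# ('stats' ships with R, so this call needs nothing extra installed)
citation("stats")
```

Pairing the version string with the formal citation gives readers both the reference and the detail needed to reproduce the analysis.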
Third year wrap-up
July marks the end of three years of blogging for us. By our count, we’ve posted 121 examples across the first three years. We aim to be helpful and interesting. As always, it’s hard to get a sense of our readership. At the time we wrote this, Feedburner reports about 1050 regular readers (up from 650 last year), but this (still)...
2852 sym 14 img
Example 10.1: Read a file byte by byte
More and more makers of electronic devices use standard storage media to record data. Sometimes this is central to the device’s function, as in a camera, so that the data must be easy to recover. Other times, it’s effectively incidental, and the device maker may not provide easy access to the stored data. For example, I recently was prescr...
5095 sym R (899 sym/5 pcs) 18 img
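For readers who want a feel for the basic mechanics, here is a hedged sketch of reading a binary file one byte at a time with base R's `readBin()`. The file and its contents are made up for illustration and have nothing to do with the actual device data.

```r
# Write a small binary file so the example is self-contained
# (these five bytes are hypothetical, not real device output)
tmp <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x01, 0x02, 0xff, 0x10, 0x00)), tmp)

# Open a binary-mode connection and pull one byte per call
con <- file(tmp, open = "rb")
bytes <- raw(0)
repeat {
  b <- readBin(con, what = "raw", n = 1)
  if (length(b) == 0) break   # zero-length result signals end of file
  bytes <- c(bytes, b)
}
close(con)

# View the bytes as unsigned integers (0-255)
as.integer(bytes)
```

For large files, reading everything in one call, e.g. `readBin(tmp, what = "raw", n = file.size(tmp))`, is likely to be much faster than the loop.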
Example 10.2: Custom graphic layouts
In example 10.1 we introduced data from a CPAP machine. In brief, it’s hard to tell exactly what’s being recorded in the data set, but it seems to be related to the pattern of breathing. Measurements are taken five times a second, leading to on the order of 100,000 data points in a typical night. To get a visual sense of what a night’s b...
4130 sym R (1417 sym/4 pcs) 18 img
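One way to view a long, densely sampled series is to stack one panel per hour using `layout()`. The sketch below uses simulated data (a noisy sine wave sampled five times a second) as a stand-in; the signal, the four-hour length, and all variable names are assumptions for illustration only.

```r
set.seed(1)
hz      <- 5                       # samples per second, as in the text
n_hours <- 4                       # hypothetical recording length
n       <- hz * 3600 * n_hours     # total number of samples
time_hr <- seq_len(n) / (hz * 3600)

# Simulated "breathing" signal: about 15 cycles per minute plus noise
signal <- sin(2 * pi * 900 * time_hr) + rnorm(n, sd = 0.3)
hour   <- ceiling(time_hr)         # which hour each sample falls in

# Draw to a temporary PDF; drop pdf()/dev.off() to plot on screen
out <- tempfile(fileext = ".pdf")
pdf(out, width = 8, height = 8)
layout(matrix(seq_len(n_hours), ncol = 1))   # one row per hour
par(mar = c(2, 4, 1, 1))
for (h in seq_len(n_hours)) {
  idx <- which(hour == h)
  plot(time_hr[idx], signal[idx], type = "l",
       xlab = "", ylab = paste("Hour", h))
}
dev.off()
```

Passing a matrix with repeated cell numbers to `layout()` lets a single panel span several rows or columns, which is where custom layouts become more flexible than `par(mfrow = ...)`.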
Example 10.3: Enhanced scatterplot with marginal histograms
Back in example 8.41 we showed how to make a graphic combining a scatterplot with histograms of each variable. A commenter suggested we change the R graphic to allow post-hoc plotting of, for example, lowess lines. In addition, there are further refinements to be made. In this R-only entry, we’ll make the figure more flexible and a bit more ...
4010 sym R (1941 sym/4 pcs) 20 img
Example 10.4: Multiple comparisons and confidence limits
A colleague is a devotee of confidence intervals. To him, CIs have the magical property that they are immune to the multiple comparison problem. In other words, he feels it’s OK to look at a bunch of 95% CIs and focus on the ones that appear to exclude the null. This is despite knowing well the one-to-one relationship between 95% CIs that exclu...
4249 sym R (1695 sym/6 pcs) 18 img
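A quick simulation can illustrate why a set of 95% CIs is not immune to multiplicity. The sketch below assumes 20 independent normal samples of size 30, all with true mean 0; these settings are arbitrary choices, not taken from the post.

```r
set.seed(42)
n_sims <- 2000   # simulated "studies"
k      <- 20     # CIs examined per study
n      <- 30     # observations behind each CI

any_excluded <- replicate(n_sims, {
  x    <- matrix(rnorm(n * k), nrow = n)    # every true mean is 0
  m    <- colMeans(x)
  s    <- apply(x, 2, sd)
  half <- qt(0.975, df = n - 1) * s / sqrt(n)
  any(abs(m) > half)    # does at least one 95% CI exclude 0?
})

# Simulated share of studies with at least one "significant" CI
mean(any_excluded)

# Theoretical value under independence, about 0.64
1 - 0.95^k
```

Each interval covers the truth 95% of the time on its own, but picking out the ones that exclude the null from a batch of 20 produces a false lead in well over half of the simulated studies.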
Example 10.5: Convert a character-valued categorical variable to numeric
In some settings it may be necessary to recode a categorical variable with character values into a variable with numeric values. For example, the matching macro we discussed in example 7.35 will only match on numeric variables. One way to convert character variables to numeric values is to determine which values exist, then write a possibly lon...
3280 sym R (1887 sym/4 pcs) 16 img
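In R, one common shortcut (not necessarily the approach taken in the full post) is to pass the character variable through `factor()` and keep the integer codes. The variable names and values below are hypothetical.

```r
# Hypothetical character-valued categorical variable, with a missing value
state <- c("MA", "NY", "CA", "MA", "CA", NA)

# factor() finds the distinct values; the integer codes follow the
# sorted order of the levels
state_f   <- factor(state)
state_num <- as.integer(state_f)
state_num

# Keep a lookup table so the numeric codes can be mapped back to labels
lookup <- data.frame(code  = seq_along(levels(state_f)),
                     label = levels(state_f))
lookup
```

Missing values stay missing, and the lookup table avoids having to write out a long series of conditional assignments by hand.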
Example 10.6: Should Poisson regression ever be used? Negative binomial vs. Poisson regression
In practice, we often find that count data are not well modeled by Poisson regression, though Poisson models are often presented as the natural approach for such data. In contrast, the negative binomial regression model is much more flexible and is therefore likely to fit better, if the data are not Poisson. In example 8.30 we comp...
7112 sym R (3956 sym/7 pcs) 14 img
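To make the comparison concrete, here is a hedged sketch that fits both models to simulated overdispersed counts. The data-generating values are arbitrary assumptions, and `glm.nb()` comes from the MASS package, which is normally distributed with R.

```r
library(MASS)   # provides glm.nb()

set.seed(10)
n  <- 500
x  <- rnorm(n)
mu <- exp(0.5 + 0.5 * x)

# Negative binomial counts: variance exceeds the mean (overdispersion)
y <- rnbinom(n, size = 1.5, mu = mu)

pois_fit <- glm(y ~ x, family = poisson)
nb_fit   <- glm.nb(y ~ x)

# Lower AIC suggests the better-fitting model
AIC(pois_fit, nb_fit)

# Poisson standard errors tend to be too small when data are overdispersed
cbind(poisson = coef(summary(pois_fit))[, "Std. Error"],
      negbin  = coef(summary(nb_fit))[, "Std. Error"])
```

When the data really are Poisson, the two fits should be close, so the practical cost of defaulting to the negative binomial model is usually small.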
Example 10.7: Fisher vs. Pearson
In the early days of the discipline of statistics, R.A. Fisher argued with great vehemence against Egon Pearson (and Jerzy Neyman) over the foundational notions supporting statistical inference. The personal invective recorded is somewhat amusing and also reminds us how very puerile even very smart people can be. Today, we’ll compare Fisher’...
7435 sym R (2702 sym/7 pcs) 18 img
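As a small illustration of running tests from both camps on the same data, the sketch below applies base R's `fisher.test()` and `chisq.test()` to one 2x2 table. The counts are hypothetical and chosen only to keep the example small; the full post may compare the two quite differently.

```r
# Hypothetical 2x2 table of counts
tab <- matrix(c(8, 2,
                1, 5),
              nrow = 2, byrow = TRUE,
              dimnames = list(group   = c("A", "B"),
                              outcome = c("yes", "no")))

# Fisher's exact test
p_fisher <- fisher.test(tab)$p.value

# Pearson chi-square test without continuity correction; R warns that
# the approximation may be poor with counts this small, so the warning
# is suppressed here
p_pearson <- suppressWarnings(chisq.test(tab, correct = FALSE)$p.value)

c(fisher = p_fisher, pearson = p_pearson)
```

With small expected cell counts the two p-values can differ noticeably, which is one reason the choice between them gets debated.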
Example 10.8: The upper 95% CI is 3.69
Apologies for the long and unannounced break, the longest since we started blogging three and a half years ago. I was writing a 2-day course for SAS users to learn R. Contact me if you’re interested. And Nick and I are beginning work on the second edition of our book; look for it in the fall. Please let us know if you have ...
3546 sym R (805 sym/3 pcs) 18 img