Publications by Ken Kleinman

Example 7.25: compare draws with distribution

05.03.2010

In Example 7.24, we demonstrated a Metropolis-Hastings algorithm for generating observations from awkward distributions. In such settings it is desirable to assess the quality of draws by comparing them with the target distribution. Recall that the distribution function is f(y) = c e^(-y^4) (1+|y|)^3. The constant c was not needed to gen...
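To overlay the target density on a histogram of draws, the constant c has to be known. A minimal sketch, assuming plain numerical integration is acceptable: the helper names `kernel`, `simpson`, and `density` are illustrative and do not come from the post, and the integration range of [-4, 4] is an assumption justified by how quickly e^(-y^4) decays.

```python
import math

def kernel(y):
    """Unnormalized target from the text: e^(-y^4) * (1 + |y|)^3."""
    return math.exp(-y ** 4) * (1 + abs(y)) ** 3

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumption: the kernel is negligible beyond |y| = 4, so a finite range suffices.
# The range is symmetric and n is even, so the kink at y = 0 falls on a panel boundary.
c = 1 / simpson(kernel, -4.0, 4.0)

def density(y):
    """Normalized density f(y) = c * e^(-y^4) * (1 + |y|)^3."""
    return c * kernel(y)
```

Evaluating `density` on a grid gives a curve that can be drawn over a density-scaled histogram of the Metropolis-Hastings draws.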


Example 7.26: probability question

08.03.2010

Here’s a surprising problem, from the xkcd blog. Suppose I choose two (different) real numbers, by any process I choose. Then I select one at random (p = .5) to show Nick. Nick must guess whether the other is smaller or larger. Being right 50% of the time is easy. Can he do better? Of course, it wouldn’t be an interesting questio...
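A problem like this is easy to explore by simulation. The sketch below assumes one well-known strategy, which may or may not be the one the post goes on to describe: Nick draws a random threshold and guesses that the hidden number is larger whenever the shown number falls below it. The standard normal threshold and the function names `play` and `win_rate` are assumptions made for illustration.

```python
import random

def play(a, b, rng):
    """One round: show a or b at random; Nick guesses using a random threshold."""
    shown, hidden = (a, b) if rng.random() < 0.5 else (b, a)
    threshold = rng.gauss(0.0, 1.0)  # assumed threshold distribution
    guess_hidden_is_larger = shown < threshold
    return guess_hidden_is_larger == (hidden > shown)

def win_rate(a, b, rounds=20000, seed=1):
    """Fraction of rounds Nick wins for a fixed pair of numbers."""
    rng = random.Random(seed)
    return sum(play(a, b, rng) for _ in range(rounds)) / rounds
```

Under this strategy Nick is right whenever the threshold lands between the two numbers and right half the time otherwise, so `win_rate(-0.5, 0.5)` should sit clearly above one half while `win_rate(100.0, 101.0)` should stay close to it.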


Example 7.27: probability question reconsidered

15.03.2010

In Example 7.26, we considered a problem, from the xkcd blog: Suppose I choose two (different) real numbers, by any process I choose. Then I select one at random (p = .5) to show Nick. Nick must guess whether the other is smaller or larger. Being right 50% of the time is easy. Can he do better? Randall Munroe offers a solution which ...


Example 7.28: Bubble plots

22.03.2010

A bubble plot is a means of displaying 3 variables in a scatterplot. The z dimension is presented in the size of the plot symbol, typically a circle. The area or radius of the circle plotted is proportional to the value of the third variable. This can be a very effective data presentation method. For example, consider Andrew Gelma...
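The choice between area and radius scaling changes how large the big circles look. A minimal sketch of both scalings, assuming non-negative values and a hypothetical helper `bubble_radii` that is not part of the post:

```python
import math

def bubble_radii(values, max_radius=1.0, by="area"):
    """Scale non-negative values to circle radii; the largest value gets max_radius."""
    top = max(values)
    if by == "area":
        # Area proportional to value means radius proportional to sqrt(value).
        return [max_radius * math.sqrt(v / top) for v in values]
    # Otherwise the radius itself is proportional to the value.
    return [max_radius * v / top for v in values]
```

With values 1 and 4, area scaling gives radii in a 1:2 ratio, while radius scaling gives 1:4, which makes the larger circle cover 16 times the area.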


Example 7.29: Bubble plots colored by a fourth variable

27.03.2010

In Example 7.28, we generated a bubble plot showing the relationship among CESD, age, and number of drinks, for women. An anonymous commenter asked whether it would be possible to color the circles according to gender. In the comments, we showed simple code for this in R and hinted at a SAS solution for two colors. Here we show in ...


Example 7.30: Simulate censored survival data

30.03.2010

To simulate survival data with censoring, we need to model the hazard functions for both time to event and time to censoring. We simulate both event times from a Weibull distribution with a shape parameter of 1 (this is equivalent to an exponential random variable). The event time has a Weibull scale parameter of 0.002 times a linea...
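The basic mechanics can be sketched as follows: draw an event time and a censoring time, keep the smaller one, and record which came first. This is a simplified illustration only. It omits covariates entirely, and the rates 0.2 and 0.1 and the name `simulate_censored` are hypothetical placeholders, not the post's settings.

```python
import random

def simulate_censored(n, event_rate=0.2, censor_rate=0.1, seed=42):
    """Return n (observed_time, status) pairs; status is 1 for an event, 0 if censored."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t_event = rng.expovariate(event_rate)    # exponential = Weibull with shape 1
        t_censor = rng.expovariate(censor_rate)
        observed = min(t_event, t_censor)        # we only see whichever comes first
        status = 1 if t_event <= t_censor else 0
        rows.append((observed, status))
    return rows
```

With independent exponential times, the expected share of observed events is event_rate / (event_rate + censor_rate), which offers a quick sanity check on the simulated data.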


Example 7.31: Contour plot of BMI by weight and height

05.04.2010

A contour plot is a simple way to plot a surface in two dimensions. Lines with a constant Z value are plotted on the X-Y plane. Typical uses include weather maps displaying “isobars” (lines of constant pressure), and maps displaying lines of constant elevation useful in, e.g., hiking. Unusual examples include maps of constant tra...
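For BMI, each contour line has a closed form, because BMI is weight divided by height squared. A small sketch, assuming metric units (kilograms and meters); the post may use different units, and the function names here are illustrative.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2

def contour_weights(level, heights_m):
    """Weights (kg) at which BMI equals `level` for each height: one contour line."""
    return [level * h ** 2 for h in heights_m]
```

Plotting `contour_weights(25, heights)` against `heights` traces the BMI = 25 line on the height-weight plane; repeating this for several levels gives the full set of contours.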


Example 7.33: Specifying fonts in graphics

19.04.2010

For interactive data analysis, the default fonts used by SAS and R are acceptable, if not beautiful. However, for publication, it may be important to manipulate the fonts. For example, it would be desirable for the fonts in legends, axis labels, or other text printed in plots to approximate the typeface used in the rest of the text....


Example 7.34: Propensity scores and causal inference from observational studies

26.04.2010

Propensity scores can be used to help make causal interpretation of observational data more plausible, by adjusting for other factors that may be responsible for differences between groups. Heuristically, we estimate the probability of exposure, rather than randomize exposure, as we’d ideally prefer to do. The estimated probability o...


Example 7.35: Propensity score matching

03.05.2010

As discussed in Example 7.34, it’s sometimes preferable to match on propensity scores, rather than adjust for them as a covariate. SAS: We use a suite of macros written by Jon Kosanke and Erik Bergstralh at the Mayo Clinic. The dist macro calculates the pairwise distances between observations, while the vmatch macro makes matches base...
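To make the idea of matching on distances concrete, here is a minimal sketch of greedy one-to-one nearest-neighbor matching with a caliper. It is a simple stand-in and is not the algorithm used by the dist or vmatch macros; the function name `greedy_match` and the caliper of 0.05 are assumptions for illustration.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Match each treated unit to its nearest unused control within the caliper.

    treated, controls: dicts mapping unit id -> estimated propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls not yet used in a match
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Nearest remaining control by absolute difference in propensity score.
        c_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs
```

Greedy matching depends on the order in which treated units are processed, which is one reason optimal matching procedures are sometimes preferred.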
