Publications by Nick Horton

Example 7.1: Create a Fibonacci sequence

12.06.2009

The Fibonacci numbers have many mathematical relationships and have been discovered repeatedly in nature. They are constructed as the sum of the previous two values, initialized with the values 1 and 1. A PDF of this example is available here.

SAS

In SAS, we use the lag function (section 1.4.17, p. 22) to retrieve the last value.

data fibo; do i = 1 ...
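The construction described in this excerpt (each value is the sum of the previous two, starting from 1 and 1) can be sketched in R as follows. This is a minimal illustration, not the SAS code from the post, which is truncated here; the function name `fibonacci` and its argument `n` are assumptions chosen for the example.

```r
# Build the first n Fibonacci numbers, initialized with 1 and 1.
# `fibonacci` and `n` are hypothetical names, not taken from the post.
fibonacci <- function(n) {
  stopifnot(n >= 1)
  fib <- numeric(n)
  fib[seq_len(min(n, 2))] <- 1      # the two starting values
  if (n > 2) {
    for (i in 3:n) {
      fib[i] <- fib[i - 1] + fib[i - 2]  # sum of the previous two values
    }
  }
  fib
}

fibonacci(10)
# 1 1 2 3 5 8 13 21 34 55
```

The vector is preallocated with `numeric(n)` so that the loop fills it in place rather than growing it on each iteration.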


Book excerpts now posted

18.07.2009

We’ve posted excerpts from the book on the book website. The excerpts include Chapter 3 (regression and ANOVA) in its entirety. This demonstrates how the entries (the generic descriptions of software functions) and the worked examples reinforce each other. We’ve also posted selected entries from each of the other chapters, desc...


packages and CRANtastic

24.08.2009

Additional functionality in R is added through packages, which consist of libraries of bundled functions, datasets, examples and help files that can be downloaded from CRAN (the Comprehensive R Archive Network). The function install.packages() or the windowing interface under Packages and Data (Mac) or Packages (Windows) is used to d...


Example 7.12: Calculate and plot a running average

17.09.2009

The Law of Large Numbers concerns the stability of the mean as sample sizes increase. This is an important topic in mathematical statistics. The convergence (or lack thereof, for certain distributions) can easily be visualized in SAS and R (see also Horton, Qian and Brown, 2004). Assume that X1, X2, …, Xn are independent and ident...
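A running average of this kind can be sketched in R with `cumsum()`. The example below is an illustration under stated assumptions rather than the code from the full post: the helper name `running_mean`, the seed, the sample size, and the choice of the standard normal (finite mean) and the Cauchy (no finite mean) as the two contrasting distributions are all ours.

```r
# Running (cumulative) mean: element k is the mean of x[1], ..., x[k].
# `running_mean` is a hypothetical helper name, not taken from the post.
running_mean <- function(x) {
  cumsum(x) / seq_along(x)
}

set.seed(2009)  # arbitrary seed, used only for reproducibility
n <- 1000
normal_avg <- running_mean(rnorm(n))    # finite mean: the path stabilizes
cauchy_avg <- running_mean(rcauchy(n))  # no finite mean: the path keeps jumping

# Draw both paths, but only when run in an interactive session.
if (interactive()) {
  plot(seq_len(n), cauchy_avg, type = "l", lty = 2,
       xlab = "sample size", ylab = "running average")
  lines(seq_len(n), normal_avg)
  abline(h = 0, col = "gray")
}
```

Dividing the cumulative sum by `seq_along(x)` gives every partial mean in one vectorized pass, which avoids recomputing `mean()` for each sample size.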


Example 7.13: Read a file with two lines per observation

24.09.2009

In example 7.6 we showed how to retrieve the Amazon sales rank of a book. A cron job on one of our machines grabs the sales rank hourly. We’d like to use this data to satisfy our curiosity about when and how often a book sells. A complication is that the result of the cron job is a file with the time of the sales rank retrieval on ...


Example 7.16: assess robustness of permutation test to violations of exchangeability assumption

24.10.2009

Permutation tests (section 2.4.3) are a form of resampling-based inference that can be used to compare two groups. A simple univariate two-group permutation test requires that the group labels for the observations are exchangeable under the null hypothesis of equal distributions, but allows relaxation of specific distributional assum...
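The basic two-group permutation test described in this excerpt can be sketched in R as below. This is a minimal illustration, not the robustness study from the full post: the function name `perm_test`, its arguments, the default number of permutations, and the use of the difference in means as the test statistic are all assumptions made for the example.

```r
# Two-sided, two-group permutation test on the difference in means.
# Under the null hypothesis the group labels are exchangeable, so the
# pooled values are reshuffled and split into groups of the original sizes.
# `perm_test`, `x`, `y` and `n_perm` are hypothetical names.
perm_test <- function(x, y, n_perm = 5000) {
  observed <- mean(x) - mean(y)
  pooled <- c(x, y)
  first <- seq_len(length(x))
  perm_stats <- replicate(n_perm, {
    shuffled <- sample(pooled)                    # random relabelling
    mean(shuffled[first]) - mean(shuffled[-first])
  })
  # Proportion of relabellings at least as extreme as the observed value.
  mean(abs(perm_stats) >= abs(observed))
}

set.seed(1)  # arbitrary seed, used only for reproducibility
group_a <- rnorm(30)
group_b <- rnorm(30, mean = 1)
perm_test(group_a, group_b)
```

The returned value is a Monte Carlo estimate of the p-value, so it varies slightly with the seed and with `n_perm`.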


Example 7.17: The Smith College diploma problem

12.11.2009

Smith College is a residential women’s liberal arts college in Northampton, MA that is steeped in tradition. One such tradition is to give each student at graduation a diploma at random (or more accurately, in a haphazard fashion). At the end of the ceremony, a diploma circle is formed, and students pass the diplomas that they rece...


Example 7.22: the Knapsack problem

13.01.2010

The website http://rosettacode.org/wiki/Knapsack_Problem describes a fanciful trip by a traveler to Shangri La. They can take as many as they want of three valuable items, as long as they fit in a knapsack. The knapsack will hold no more than 25 weight units, and no more than 25 volume units. The problem is to maximize the value of...


Example 7.24: Sampling from a pathological distribution

01.03.2010

Evans and Rosenthal consider ways to sample from a distribution with density given by:

f(y) = c e^(-y^4) (1+|y|)^3

where c is a normalizing constant and y is defined on the whole real line. Use of the probability integral transform (section 1.10.8) is not feasible in this setting, given the complexity of inverting the cumulative density ...
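When the cumulative distribution cannot be inverted, one generic alternative is a random-walk Metropolis sampler, which needs the density only up to the constant c. The R sketch below illustrates that technique for the density above; it is not necessarily the approach taken in the full post, and the names `log_target` and `metropolis`, the starting value, and the proposal standard deviation are assumptions made for the example.

```r
# Log of the unnormalized density e^(-y^4) (1 + |y|)^3; the constant c
# cancels in the Metropolis acceptance ratio, so it is never needed.
log_target <- function(y) {
  -y^4 + 3 * log(1 + abs(y))
}

# Random-walk Metropolis sampler with a normal proposal.
# `metropolis`, `start` and `step_sd` are hypothetical names and defaults.
metropolis <- function(n_iter, start = 0, step_sd = 1) {
  draws <- numeric(n_iter)
  current <- start
  for (i in seq_len(n_iter)) {
    proposal <- current + rnorm(1, sd = step_sd)
    # Accept with probability min(1, f(proposal) / f(current)).
    if (log(runif(1)) < log_target(proposal) - log_target(current)) {
      current <- proposal
    }
    draws[i] <- current  # a rejected proposal repeats the current value
  }
  draws
}

set.seed(2010)  # arbitrary seed, used only for reproducibility
draws <- metropolis(5000)
```

Successive draws are correlated, so in practice one would inspect the chain and possibly discard an initial burn-in period before summarizing it.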


Augmented support for complex survey designs in R

03.03.2010

We’ll get back to code examples later this week, but wanted to let you know about an R package with updated functionality in the meantime. The appropriate analysis of sample surveys requires incorporation of complex design features, including stratification, clustering, weights, and finite population correction. These can be addres...
