Publications by Neil Gunther

Playing with Primes in R (Part II)

17.06.2010

Popping Part III off the stack (where I ended up unexpectedly discovering that the primes and primlist functions are broken in the schoolmath package on CRAN), let’s see what prime numbers look like when computed correctly in R. To do this, I’ve had to roll my own prime number generating function. Personalizing primes in R. For what...

11592 sym R (2249 sym/11 pcs) 4 img

Linear Modeling in R and the Hubble Bubble

22.06.2010

Here is a scatter plot with the coordinate labels deliberately omitted. (Figure 1.) Do you see any trends? How would you model these data? It just so happens that this scatterplot is arguably the most famous scatterplot in history. One aficionado, writing more than forty years after its publication, commented skeptically [1]: “[The] data...

7402 sym R (1780 sym/1 pcs) 10 img

Prime Parallels for Load Balancing

05.07.2010

Having finally popped the stack on computing prime numbers with R in Part II and Part III, we are now in a position to discuss their relevance for computational scalability. My original intent was to show how poor partitioning of a workload can defeat the linear scalability expected when full parallelism is otherwise attainable, i.e....

5709 sym R (1036 sym/4 pcs) 4 img 10 tbl

Go Guerrill… R on Your Data in August

05.07.2010

Only one month to go! Register now for the Guerrilla Data Analysis Techniques (GDAT) class to be held during the week of August 9-13, 2010. The focus will be on using R and PDQ-R for computer performance analysis and capacity planning. (Click on the image for details.) For those of you coming from international locations, here is a...

838 sym 2 img

Gone Guerrill_ R on Our Data

16.08.2010

Here’s a summary of some things we learnt about applying R to computer performance and capacity planning data in the GDAT Class last week. Neural nets pkg nnet applied to CPU performance data in the Ripley and Venables book (see Section 8.10). How to do stacked plots that Jim calls “spark plots.” Jim told us that ggplot has a...

1408 sym

Excel Errors and Other Numerical Nightmares

25.08.2010

Although I use Excel all the time, and I strongly encourage my students to use it for performance analysis and CaP, I was forced to include a warranty disclaimer in my GCaP book because I discovered a serious numerical error while writing Appendix B. There, my intention was just to show that Excel gives essentially the same results as...

3649 sym R (166 sym/1 pcs) 4 img

Where to Start with PDQ?

30.08.2010

Once you’ve downloaded PDQ with a view to solving your performance-related questions, the next step is getting started using it. Why not have some fun with blocks? Fun-ctional blocks, that is. Since all digital computers and network systems can be considered as a collection of functional blocks and these blocks often contain buffers...

2782 sym R (1186 sym/2 pcs) 2 img

Confidence Bands for Universal Scalability Models

07.09.2010

In the recent GDAT class, confidence intervals (CI) for performance data were discussed. Their generalization to confidence bands (CB) for scalability projections using the USL model also came up informally. I showed a prototype plot but it was an ugly hack. Later requests from GDAT attendees to apply CBs to their own data meant I ha...

3916 sym R (2622 sym/3 pcs) 4 img

Reporting Standard Errors for USL Coefficients

13.11.2010

In a recent Guerrilla CaP Group discussion, Baron S. wrote:
....
BS> Using gnuplot against the dataset I gave, I get
BS>    sigma   0.0207163 +/- 0.001323 (6.385%)
BS>    kappa   0.000861226 +/- 5.414e-05 (6.287%)
The Gnuplot output includes the errors for each of the universal scalability law (USL) coefficients. A question ab...

2233 sym R (2333 sym/6 pcs)

Applying PDQ in R to Load Testing

19.05.2011

PDQ is a library of functions that helps you to express and solve performance questions about computer systems using the abstraction of queues. The queueing paradigm is a natural choice because, whether big (a web site) or small (a laptop), all computer systems can be represented as a network or circuit of buffers and a buffer is a ty...

4151 sym R (5849 sym/8 pcs) 4 img