Publications by Wesley

Waiting in One Line or Multiple Lines

23.09.2013

Whenever I go to the grocery store it always seems to be a lesson in statistics. I get the things I need to buy and then I try to select the checkout register that will minimize the amount of time I have to wait. Inevitably, I select the one line where there is some sort of problem and I just sit there and wait and wait. I will often mark ...

3385 sym R (3036 sym/1 pcs) 8 img

The Uncertainty of Predictions

02.10.2013

There are many kinds of intervals in statistics. To name a few of the common ones: confidence intervals, prediction intervals, credible intervals, and tolerance intervals. Each is useful and serves its own purpose. I’ve recently been working on a couple of projects that involve making predictions from a regression model and I’ve been...

3157 sym R (1462 sym/1 pcs) 12 img

That’s Smooth

10.10.2013

Someone asked me the other day how to take a scatterplot and draw something other than a straight line through the graph using Excel. Yes, it can be done in Excel and it’s really quite simple, but there are some limitations when using the stock Excel dialog screens. So it is probably in one’s best interest to use a higher quality statis...

7085 sym 20 img

Random Sequence of Heads and Tails: For R Users

10.10.2013

Rick Wicklin made a post on the SAS blog today on how to tell whether a sequence of coin flips is random. I figured it was only fair to port the SAS IML code over to R. As in Rick Wicklin’s example, this is the Wald-Wolfowitz test for randomness. I tried to match his code line-for-line. flips = matrix(c('H','T','T','H','H','H','T',...

707 sym R (1086 sym/1 pcs)
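
For readers who want to see the mechanics of the Wald-Wolfowitz runs test mentioned in the excerpt above, here is a minimal Python sketch using the usual normal approximation. It is not the author's R port or Rick Wicklin's SAS IML code; the function name `runs_test` and the flip sequence are made up for illustration.

```python
import math

def runs_test(seq):
    """Wald-Wolfowitz runs test for a two-symbol sequence (normal approximation)."""
    symbols = sorted(set(seq))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")

    n1 = sum(1 for s in seq if s == symbols[0])  # count of the first symbol
    n2 = len(seq) - n1                           # count of the second symbol
    n = n1 + n2

    # A new run starts at every position where the symbol changes.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

    # Mean and variance of the number of runs under the randomness hypothesis.
    mean = 2.0 * n1 * n2 / n + 1.0
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if var <= 0:
        raise ValueError("sequence is too short for the normal approximation")

    z = (runs - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return runs, mean, var, z, p_value

# Hypothetical sequence, chosen only to illustrate the calculation.
flips = "HHHHTTTTHH"
runs, mean, var, z, p_value = runs_test(flips)
print(runs, round(mean, 3), round(z, 3), round(p_value, 4))
```

The normal approximation is rough for a sequence this short; an exact test based on the distribution of runs would be preferable for small samples.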

Beta Distribution and the NJ U.S. Senate Election

14.10.2013

The beta distribution is a highly flexible distribution that applies to many situations and environments. It works particularly well for quantities expressed as percentages. The upcoming New Jersey U.S. Senate election on Wednesday fits that criterion quite well. So here I applied the beta distribution to some pre-election polls where the numbers were ob...

5304 sym 4 img
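
One common way to apply a beta distribution to a poll percentage is a beta-binomial update, sketched below in Python. This is an illustration under stated assumptions, not necessarily the method used in the post: the poll counts (540 of 1,000 respondents), the uniform Beta(1, 1) prior, and the helper names `beta_posterior` and `prob_share_above` are all hypothetical.

```python
import random

def beta_posterior(successes, failures, prior_a=1.0, prior_b=1.0):
    """Return the Beta(a, b) posterior parameters after a binomial sample."""
    return prior_a + successes, prior_b + failures

def prob_share_above(a, b, threshold=0.5, draws=20000, seed=0):
    """Monte Carlo estimate of P(share > threshold) under Beta(a, b)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = sum(1 for _ in range(draws) if rng.betavariate(a, b) > threshold)
    return hits / draws

# Hypothetical two-way poll: 540 respondents for candidate A, 460 for candidate B.
a, b = beta_posterior(540, 460)
posterior_mean = a / (a + b)   # expected vote share for candidate A
p_lead = prob_share_above(a, b)
print(round(posterior_mean, 4), p_lead)
```

The posterior mean summarizes the expected share, while the Monte Carlo estimate approximates the probability that the candidate's true share exceeds 50% given this single poll.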

Tracking the 2013 Hurricane Season

21.10.2013

With hurricane season coming to an end, it’s only appropriate to do a brief summary of the activity this year. It’s been a surprisingly low-key season as far as hurricanes are concerned. There have been only a few hurricanes, and the barometric pressure of any hurricane this season has not even come close to that of Hurricane Sandy (which broke...

1671 sym R (2461 sym/1 pcs) 4 img

Spatial Clustering With Equal Sizes

04.11.2013

This is a problem I have encountered many times, where the goal is to take a sample of spatial locations and apply constraints to the clustering algorithm. In addition to a predetermined number of clusters K, the number of elements within each cluster needs to be held constant. An application of this algorithm is when one needs to geographical...

4148 sym R (5717 sym/1 pcs) 10 img 1 tbl

Some Options for Testing Tables

18.11.2013

Contingency tables are a very good way to summarize discrete data. They are quite easy to construct and reasonably easy to understand. However, there are many nuances with tables, and care should be taken when drawing conclusions from the data. Here are just a few thoughts on the topic. Dealing with sparse data: On one end of the spectrum t...

5428 sym 8 img 4 tbl

Probabilities and P-Values

02.12.2013

P-values seem to be the bane of a statistician’s existence. I’ve seen situations where entire narratives are written without p-values and only the effects are provided. A p-value can also be used as a data reduction tool, but ultimately it reduces the world to a binary system: yes/no, accept/reject. Not only that but the binary threshold is determine...

1905 sym Python (3003 sym/1 pcs) 2 img

Connecting TOAD For MySQL, MySQL Workbench, and R to Amazon AWS EC2 Using SSH Tunneling

07.01.2014

I often use Amazon EC2 to store and retrieve data when I need either additional storage or higher computing capacity. In this tutorial I’ll share how to connect to a MySQL database so that one can retrieve the data and do the analysis. I tend to use either TOAD for MySQL or MySQL Workbench to run and test queries against a MySQL database...

5898 sym 18 img