Publications by Wesley
Waiting in One Line or Multiple Lines
Whenever I go to the grocery store, it always seems to be a lesson in statistics. I get the things I need to buy and then try to select the checkout register that will decrease the amount of time I have to wait. Inevitably, I select the one line where there is some sort of problem, and I just sit there and wait and wait. I will often mark ...
3385 sym R (3036 sym/1 pcs) 8 img
The Uncertainty of Predictions
There are many kinds of intervals in statistics. To name a few of the common ones: confidence intervals, prediction intervals, credible intervals, and tolerance intervals. Each is useful and serves its own purpose. I’ve recently been working on a couple of projects that involve making predictions from a regression model and I’ve been...
3157 sym R (1462 sym/1 pcs) 12 img
That’s Smooth
Someone asked me the other day how to take a scatterplot and draw something other than a straight line through the graph using Excel. Yes, it can be done in Excel and it’s really quite simple, but there are some limitations when using the stock Excel dialog screens. So it is probably in one’s best interest to use a higher-quality statis...
7085 sym 20 img
Random Sequence of Heads and Tails: For R Users
Rick Wicklin on the SAS blog made a post today on how to tell whether a sequence of coin flips is random. I figured it was only fair to port the SAS IML code over to R. Just as Rick Wicklin did in his example, this is the Wald-Wolfowitz test for randomness. I tried to match his code line for line. flips = matrix(c('H','T','T','H','H','H','T',...
707 sym R (1086 sym/1 pcs)
Beta Distribution and the NJ U.S. Senate Election
The beta distribution is a highly flexible distribution that applies to many situations and environments. It applies well when the data are percentages. The upcoming New Jersey U.S. Senate election on Wednesday fits that criterion quite well. So here I applied the beta distribution to some pre-election polls where the numbers were ob...
5304 sym 4 img
Tracking the 2013 Hurricane Season
With it being the end of hurricane season, it’s only appropriate to do a brief summary of the activity this year. It’s been a surprisingly low-key season as far as hurricanes are concerned. There have been only a few hurricanes, and the barometric pressure of any hurricane this season has not even come close to Hurricane Sandy (which broke...
1671 sym R (2461 sym/1 pcs) 4 img
Spatial Clustering With Equal Sizes
This is a problem I have encountered many times, where the goal is to take a sample of spatial locations and apply constraints to the algorithm. In addition to a predetermined number of clusters, K, the number of elements within each cluster needs to be held fixed. An application of this algorithm is when one needs to geographical...
4148 sym R (5717 sym/1 pcs) 10 img 1 tbl
Some Options for Testing Tables
Contingency tables are a very good way to summarize discrete data. They are quite easy to construct and reasonably easy to understand. However, there are many nuances with tables, and care should be taken when drawing conclusions from the data. Here are just a few thoughts on the topic. Dealing with sparse data: On one end of the spectrum t...
5428 sym 8 img 4 tbl
Probabilities and P-Values
P-values seem to be the bane of a statistician’s existence. I’ve seen situations where entire narratives are written without p-values and report only the effects. P-values can also be used as a data reduction tool, but ultimately they reduce the world into a binary system: yes/no, accept/reject. Not only that, but the binary threshold is determine...
1905 sym Python (3003 sym/1 pcs) 2 img
Connecting TOAD For MySQL, MySQL Workbench, and R to Amazon AWS EC2 Using SSH Tunneling
I often use Amazon EC2 to store and retrieve data when I need either additional storage or higher computing capacity. In this tutorial I’ll share how to connect to a MySQL database so that one can retrieve the data and do the analysis. I tend to use either TOAD for MySQL or MySQL Workbench to run and test queries against a MySQL database. ...
5898 sym 18 img