Publications by Cory Lesmeister
NFL week 3 update
With another NFL week down, we are starting to see separation between the contenders and the “better luck next year” teams. Any paid TV mouthpiece worth their salt will tell you it is a quarterback-driven league. Driven indeed. In the last post, I dabbled in the simple code of a correlation heat map. Now, I realize I may have led the floc...
1929 sym 6 img
Let’s have a "party" and tear this place "rpart"!
For many problems, classification and regression trees can be a simple and elegant solution, assuming you know their well-documented strengths and weaknesses. I first explored their use several years ago with JMP, which is easy to use. If you do not have JMP Pro, you will not be able to use the more advanced techniques (ensemble m...
3480 sym 18 img
Ensemble Methods, part 1
Last week I dabbled in building classification trees with the party and rpart packages. Now, I want to put together a series where I can apply those basic trees along with advanced techniques like bagging, boosting and random forest. Additionally, I’ve taken the leap and have started using the RStudio interface. I must say I...
5221 sym 8 img
Ensemble, Part2 (Bootstrap Aggregation)
Part 1 consisted of building a classification tree with the “party” package. I will now use “ipred” to examine the same data with a bagging (bootstrap aggregation) algorithm.
> library(ipred)
> train_bag = bagging(class ~ ., data=train, coob=T)
> train_bag
Bagging classification trees with 25 bootstrap replications
Call: bagging...
2299 sym 2 img
Part 3: Random Forests and Model Selection Considerations
I want to wrap this series up on the breast cancer data set and move on to other topics. Here I will include the random forest technique and evaluate all three modeling techniques together, including the conditional inference tree and bootstrap aggregation. I was going to include several other techniques here, but have decided to ...
4008 sym 2 img
Spurious Regression of Time Series
spu.ri.ous, adjective: not genuine, sincere, or authentic; based on false ideas or bad reasoning (http://www.merriam-webster.com/dictionary/spurious). When it comes to analysis of time series, just because you can doesn’t mean you should, particularly with regard to regression. I...
1874 sym Python (2645 sym/9 pcs) 8 img
An Inconvenient Statistic
As I sit here waiting on more frigid temperatures subsequent to another 10 inches of snow, suffering from metastatic cabin fever, I can’t help but ponder what I can do to examine global warming/climate change. Well, as luck would have it, R has the tools to explore this controversy. Using two packages, vars and forecast, I will see...
9605 sym 6 img
The "Fighting Sioux" Surge
Even a casual fan of North Dakota Hockey will notice that in the era of Coach Dave Hakstol, the team seems to perform better in the second half of the season than the first. For the rabid fans of the team like myself, it has become a horrible and interesting fact of life. By mid-December we are inevitably calling for Hakstol’s h...
4399 sym 2 img
MoneyPuck – Best subsets regression of NHL teams
Spring is at hand, and it is a time for renewal, March Madness and settling scores in the NHL. There are many scores to be settled: Flyers vs. Penguins, Blackhawks vs. Red Wings, Leafs vs. Habs and pretty much everyone else vs. the Bruins. Like any fire and brimstone hockey fan, clutching my old testament (a well-worn VHS copy of S...
4976 sym 10 img
Mythbusting – Dr. Copper
Image by Justin Reznick. “An economist is an expert who will know tomorrow why the things he predicted yesterday didn't happen today.” Laurence J. Peter (author and creator of the Peter Principle). If you were paying attention to financial sites last month, you probably noticed a number of articles on “Dr. Copper”. Here is just a small s...
5980 sym Python (4452 sym/12 pcs) 16 img