Publications by heuristicandrew
SAS: “The query requires remerging summary statistics back with the original data”
Coming from a background writing SQL code directly for “real” RDBMS (Microsoft SQL Server, MySQL, and SQLite), I was initially confused when SAS would give me the following ‘note’ for a simple summary PROC SQL query: proc sql; create table undel_monthly as select year(date) as year, month(d...
Delete rows from R data frame
Deleting rows from a data frame in R is easy by combining simple operations. Let’s say you are working with the built-in data set airquality and need to remove rows where the ozone is NA (also called null, blank or missing). The method is conceptually different from a SQL database that has a dedicated […]
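A minimal sketch of the idea described above, assuming the built-in airquality data set and its Ozone column. The names `keep` and `cleaned` are illustrative and not taken from the original post.

```r
# airquality ships with base R (package datasets)
data(airquality)

# Logical vector: TRUE for rows where Ozone is present
keep <- !is.na(airquality$Ozone)

# Keep only those rows; the empty slot after the comma selects all columns
cleaned <- airquality[keep, ]

# Compare the row counts before and after
nrow(airquality)
nrow(cleaned)
```

Rather than issuing a dedicated delete statement, this builds a logical index and assigns the subset to a new data frame, leaving the original untouched.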
“Outlook cannot open this item.” and tasks missing
Recently Microsoft Office Outlook 2007 started giving me the vague error message “Outlook cannot open this item. The item may be damaged.” The message would appear randomly throughout the day. Sometimes five error message boxes would be stacked on top of each other. OK, but which item? What kind of item? Is it an email, appointment, or task?...
Plot ROC curve and lift chart in R
This tutorial with real R code demonstrates how to create a predictive model using cforest (a random forest implementation built on conditional inference trees, following Breiman’s random forests) from the package party, evaluate the predictive model on a separate set of data, and then plot the performance using ROC curves and a lift chart. These charts are useful for evaluating model performance in data minin...
Compare performance of machine learning classifiers in R
This tutorial demonstrates to the R novice how to create five machine learning models for classification and compare the performance graphically with ROC curves in one chart. For a simpler introduction, start with Plot ROC curve and lift chart in R. # load the mlbench package which has the BreastCancer data set require(mlbench) # if [...
Error : .onLoad failed in ‘loadNamespace’ for ‘RWeka’
After installing Weka/RWeka in R, you may get this error if you try to load RWeka in the same session: require(RWeka) Cannot create Java virtual machine (-4) Error : .onLoad failed in 'loadNamespace' for 'RWeka' Solution: Just close R and re-open it. Cause: Apparently the installation requires some initialization. Tested on R 2.10.1 on Windows...
R: Memory usage statistics by variable
Do you need a way to find out which individual variables in R consume the most memory? # create dummy variables for demonstration x...
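The excerpt cuts off before the post's own code, so the following is only a sketch of one way to answer the question, using base R's `ls()` and `object.size()`. The helper name `object_sizes` and the demonstration variables are hypothetical, not the author's.

```r
# Dummy variables for demonstration (hypothetical names)
small_vec <- 1:10
big_vec <- rnorm(1e5)
some_df <- data.frame(a = 1:1000, b = runif(1000))

# Return the size in bytes of every object in an environment,
# largest first. By default it inspects the caller's environment.
object_sizes <- function(env = parent.frame()) {
  obj_names <- ls(envir = env)
  sizes <- vapply(
    obj_names,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

object_sizes()
```

Called at the top level, this lists the objects in the workspace (including the helper itself) ordered by the memory that `object.size()` estimates for each.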
Setting the HTML title tag in SAS ODS (the right way)
In our department and various places on the Intertubes, SAS programmers set the HTML title tag (which sets the title in web browsers and on search engines) in ODS using the headtext option: ods html headtext="<title>My great report</title>" /* wrong! */ file="foo.html"; This may work in some situations, but it’s ugly and wrong. To see why,...
Weighting model fit with ctree in party
Conditional inference trees (ctree) in package party allow weighting, which is useful when one classification outcome is more important than another. Useful examples are not difficult to imagine: in a marketing direct mailing, a false positive (non-response) costs just paper and postage (say, $0.50) while a true positive (response) ma...
Validating credit card numbers in SAS
Major credit card issuing networks (including Visa, MasterCard, Discover, and American Express) allow simple credit card number validation using the Luhn Algorithm (also called the “modulus 10” or “mod 10” algorithm). The following code demonstrates an implementation in SAS. The code also validates the credit card number by length and b...
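The post's SAS implementation is not included in this excerpt. As a rough illustration of the Luhn check itself, here is a sketch in R rather than SAS; the function name `luhn_valid` is hypothetical, and it covers only the mod 10 checksum, not the length or prefix checks the post mentions.

```r
# Luhn ("mod 10") checksum. Pass the number as a character string
# so that long numbers are not mangled by numeric formatting.
luhn_valid <- function(number) {
  # Keep digits only, so spaces and dashes are tolerated
  digits_chr <- gsub("[^0-9]", "", as.character(number))
  if (nchar(digits_chr) == 0) return(FALSE)
  digits <- as.integer(strsplit(digits_chr, "")[[1]])

  # Work from the rightmost digit; every second digit gets doubled
  digits <- rev(digits)
  idx <- seq_along(digits) %% 2 == 0
  doubled <- digits[idx] * 2

  # A doubled value above 9 contributes the sum of its two digits,
  # which equals the value minus 9
  doubled[doubled > 9] <- doubled[doubled > 9] - 9

  total <- sum(digits[!idx]) + sum(doubled)
  total %% 10 == 0
}

luhn_valid("4111 1111 1111 1111")
luhn_valid("4111 1111 1111 1112")
```

The first call uses a widely published test number that passes the checksum; the second changes the final digit, so the checksum fails.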