Publications by wrathematics
Autoplot: Graphical Methods with ggplot2
Background: As of ggplot2 0.9.0, released in March 2012, there is a new generic function, autoplot. This uses R’s S3 methods (which are essentially OOP for babies) to let you have some simple overloading of functions. I’m not going to get deep into OOP, because honestly we don’t need to. The idea is very simple. If I say “I’m sending ...
5030 sym R (7220 sym/15 pcs) 6 img
Some Quirks of the R Language
R is my favorite programming language. It’s just so useful for getting work done. Sometimes people will complain that R is a difficult language. To me, this begs the questions: difficult for what? And for whom? I personally think R is just about the easiest thing in the world for prototyping. Meaning if you want to quickly crank o...
5754 sym R (810 sym/7 pcs)
R at 12,000 Cores
I am very happy to introduce a new set of packages that has just hit the CRAN. We are calling it the Programming with Big Data in R Project, or pbdR for short (or as I like to jokingly refer to it, ‘pretty bad for dyslexics’). You can find out more about the pbdR project at http://r-pbd.org/. The packages are a natural programming framework t...
6644 sym
pbdR Updates – Distributed lm.fit() and More
Over the weekend, we updated all of the pbdR packages currently available on the CRAN. The updates include tons of internal housecleaning as well as many new features. Notably, pbdBASE_0.1-1 and pbdDMAT_0.1-1 were released, which contain lm.fit() methods. This function in particular has been available at my github for over a month, but didn’...
6077 sym
Intentionally Writing Obtuse Code
Sometimes intentionally writing bad code can be a lot of fun. Now here, when I say “bad”, I mean something that’s functional but completely incoherent to anything but the machine. There are even competitions for this kind of thing, but I only consider myself a dabbler in this dark art. Thankfully, it’s often pretty easy to make obtuse cod...
3417 sym R (1816 sym/6 pcs)
Rules for Naming Objects in R
Naming Rules in R: How are objects allowed to be named in R? As it turns out, this is a very different question from how should objects be named. This isn’t about style conventions, camelCase, dots.versus_underscores, or anything like that; this is about what is strictly possible. I do a lot of outreach to HPC people who are starting to get an in...
4504 sym R (201 sym/8 pcs)
How to Make a Bad Password with R
I have a lot of projects that will take ages to finish (some are in such poor shape that I tuck them away in private repositories, so no one can see my shame). So sometimes it’s nice to just take a weekend and crank out something start to finish, even if it’s dumb and no one cares about it and fewer people want it. Which brings us to the ...
4520 sym R (306 sym/2 pcs) 2 img
Searching an R Function’s Source Code
This is not nearly as interesting as it might first sound, but every function in R contains R code; this is true of core R code as well as extension packages. Sometimes the R code is just a very shallow wrapper around some compiled code, such as in sum() and is.null(). Other times, as in lm.fit(), there is a vast expanse of R code. It’s easy ...
1704 sym R (4115 sym/6 pcs)
Modern Applied Statistics in R’lyeh
So you’ve probably heard of King James Programming; if not, you should check it out because it’s great. A quick summary is that someone took the King James Bible and Sussman’s Structure and Interpretation of Computer Programs (SICP) and used an n-gram babbler to generate new sentences that combine the texts in amusing ways. The generator it...
3831 sym R (63 sym/1 pcs) 2 img
Advanced R Profiling with pbdPAPI
R has some extremely useful utilities for profiling, such as system.time(), Rprof(), the often overlooked tracemem(), and the rbenchmark package. But if you want more than just simple timings of code execution, you will mostly have to look elsewhere. One of the best sources for profiling data is hardware performance counters, available in most mo...
5922 sym R (1086 sym/9 pcs) 2 img