Publications by richierocks
Pareto plot party!
A Pareto plot is an enhanced bar chart. It is useful for deciding which bars in your bar chart are important. To see this, take a look at some made-up DVD sales data. set.seed(1234) dvd_names <- c("Toy Tales 3", "The Dusk Saga: Black Out", "Urban Coitus 2", "Dragon Training for Dummies", "Germination", "Fe Man 2", "Harold The Wizard", "Emb...
2097 sym R (1703 sym/3 pcs) 22 img
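As a rough illustration of the idea behind a Pareto plot, here is a minimal base R sketch: sort the bars from largest to smallest, then overlay the cumulative percentage. The `sales` values are hypothetical placeholders, not the DVD data from the post, and this is not necessarily how the post builds its plot.

```r
# Hypothetical sales figures (placeholders, not the post's data)
sales <- c(A = 50, B = 30, C = 15, D = 5)

# Sort the bars from largest to smallest
sorted <- sort(sales, decreasing = TRUE)

# Cumulative share of the total, as a percentage
cum_pct <- 100 * cumsum(sorted) / sum(sorted)

# Draw the bars; barplot() returns the x positions of the bar midpoints
mids <- barplot(sorted, ylim = c(0, sum(sorted)), ylab = "Sales")

# Overlay the cumulative line, rescaled so that 100% sits at the total
lines(mids, cum_pct / 100 * sum(sorted), type = "b")

# Add a percentage axis on the right-hand side
axis(
  4,
  at     = seq(0, sum(sorted), length.out = 5),
  labels = paste0(seq(0, 100, by = 25), "%")
)
```

The bars to the left of wherever the cumulative line flattens out are the ones that account for most of the total.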
Really useful bits of code that are missing from R
There are some pieces of code that are so simple and obvious that they really ought to be included in base R somewhere. Geometric mean and standard deviation – a staple for anyone who deals with lognormally distributed data. geomean <- function(x, na.rm = FALSE, trim = 0, ...) { exp(mean(log(x, ...), na.rm = na.rm, trim = trim, ...)) } geosd...
1229 sym R (437 sym/4 pcs) 16 img
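A short usage sketch for the `geomean` function quoted in the excerpt above. The definition is copied from the excerpt; the input vectors are made up for illustration.

```r
# Definition as quoted in the excerpt
geomean <- function(x, na.rm = FALSE, trim = 0, ...) {
  exp(mean(log(x, ...), na.rm = na.rm, trim = trim, ...))
}

# The geometric mean of 1, 10 and 100 is the cube root of their product
x <- c(1, 10, 100)
gm <- geomean(x)

# Missing values are dropped when na.rm = TRUE
gm_na <- geomean(c(1, NA, 100), na.rm = TRUE)

print(gm)
print(gm_na)
```

Both calls should give a value of 10, up to floating point error.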
Introducing the Lowry Plot
Here at the Health and Safety Laboratory* we’re big fans of physiologically-based pharmacokinetic (PBPK) models (say that 10 times fast) for predicting concentrations of chemicals around your body based upon an exposure. These models take the form of a big system of ODEs. Because they contain many equations and consequently many parameters (ma...
4703 sym R (3442 sym/3 pcs) 18 img
When 1 * x != x
Trying to dimly recall things from my maths degree, it seems that in most contexts the whole point of the number one is that it is a multiplicative identity. That is, for any x in your set, 1 * x is equal to x. It turns out that when you move to floating point numbers, in some programming languages, this isn’t always true. In R, try the follo...
2516 sym R (58 sym/2 pcs) 22 img
Bad kitty!
The cat function bugs me a little. There are two quirks in particular that I find irritating on the occasions that I use it. Firstly, almost everything that I want displayed onscreen, I want on its own line. > cat("cat messes up my command prompt position") cat messes up my command prompt position> So it would be really nice if the function appen...
1673 sym R (584 sym/4 pcs) 16 img
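One way to address the first quirk is a thin wrapper that always finishes with a newline. This is only a sketch: `cat_line` is a hypothetical name, and the post's own solution is not visible in the truncated excerpt.

```r
# Hypothetical wrapper: behaves like cat(), then ends the line
cat_line <- function(..., sep = " ") {
  cat(..., sep = sep)
  cat("\n")
}

# The prompt now returns on a fresh line
cat_line("cat no longer messes up my command prompt position")
```

Issuing the newline in a separate call leaves `sep` free to control only the spacing between arguments, as it does in `cat` itself.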
A legitimate use for the stupidest variable name ever
The help page for make.names describes how to make a valid variable name in R: A syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as ‘".2way"’ are not valid, and neither are the reserved words. What it doe...
2100 sym R (202 sym/3 pcs) 18 img
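To see the rules quoted from the help page in action, a small sketch with `make.names`. The candidate names are made up for illustration.

```r
# Candidate names: one valid, three that break the rules
candidates <- c("x", ".2way", "if", "my var")

# make.names() repairs anything that is not syntactically valid
fixed <- make.names(candidates)

# A name is already valid exactly when make.names() leaves it untouched
is_valid <- fixed == candidates

print(data.frame(candidate = candidates, fixed = fixed, is_valid = is_valid))
```

Here ".2way" gains an "X" prefix, the reserved word "if" gains a trailing dot, and the space in "my var" becomes a dot.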
Legendary Plots
I was recently pointed in the direction of a thermal comfort model by the engineering company Arup (p27–28 of this pdf). Figure 3 at the top of p28 caught my attention. It’s mostly a nice graph; there’s not too much junk in it. One thing that struck me was that there is an awful lot of information in the legend, and that I found it impos...
2828 sym R (1582 sym/4 pcs) 22 img 1 tbl
Friday Function: setInternet2
Corporate IT networks are a pain for programmers. Ideally, when programming, you want the freedom to download, install and run any software that you want. Unfortunately, in the interests of security, many programmers find themselves a little restricted at the office. (I’m sure that many network admins will protest that the situation works bo...
1639 sym 16 img
Non-standard assignment with getSymbols
I recently came across a rather interesting investment blog, Timely Portfolio. I have a certain soft spot for that sort of thing, because using my data analysis skills to make a fortune is casually on my to-do list. This blog makes regular use of a function getSymbols in the quantmod package. The power and simplicity of the function is fantast...
2345 sym R (443 sym/4 pcs) 16 img
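The kind of assignment the title refers to can be mimicked without quantmod or a network connection. In the sketch below, `load_symbol` is a hypothetical stand-in (not a quantmod function) that creates a variable in the caller's environment named after its argument, rather than returning the value.

```r
# Hypothetical stand-in for a function that assigns by side effect
load_symbol <- function(name, envir = parent.frame()) {
  value <- seq_len(3)                 # placeholder data, not real prices
  assign(name, value, envir = envir)  # create a variable called `name`
  invisible(name)                     # hand back the name, not the data
}

# No `<-` here, yet a variable called XYZ appears afterwards
load_symbol("XYZ")
print(XYZ)
```

With the real getSymbols, passing `auto.assign = FALSE` asks for the data to be returned in the usual way instead.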
supercalifragilisticexpialidocious = 1
I notice that the latest version of R has upped the maximum length of variable names from 256 characters to a whopping 10 000! (See ?name.) It makes the 63 character limit in MATLAB look rather pitiful by comparison. Come on MathWorks! Let’s have the ability to be stupidly verbose in our variable naming!
751 sym 16 img
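A quick check of the raised limit, assuming a version of R recent enough to include it (and to provide `strrep`). The 300-character name is an arbitrary choice that sits above the old 256-character ceiling.

```r
# Build a name longer than the old 256-character limit
long_name <- strrep("a", 300)

# Create a variable with that name, then read it back
assign(long_name, 1)
value <- get(long_name)

print(nchar(long_name))
print(value)
```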