Publications by Derek-Jones

Variations in the literal representation of Pi

11.03.2010

The numbers system I am developing attempts to match numeric literals contained in a file against a database of interesting numbers. One of the things I did to quickly build a reasonably sized database of reliable values was to extract numeric literals from a few well-known programs that I thought I could trust. R is a widely used statistical pa...

3596 sym R (479 sym/1 pcs) 16 img

Thinking in R: vectors

04.09.2010

I have been using the R language/environment for five years or so, whenever I needed to do some statistical calculations, but until recently I was only a casual user. A few months ago I started to use R in more depth and made a conscious decision to try to do things the ‘R way’. While it is claimed that R is a functional language wit...

3469 sym R (230 sym/3 pcs) 2 img

Has the seed that gets software development out of the stone-age been sown?

25.12.2010

A big puzzle for archaeologists is why stone age culture lasted as long as it did (from approximately 2.5 million years ago until the start of the copper age around 6.3 thousand years ago). Given the range of innovation rates seen in various cultures throughout human history, a much shorter stone age is to be expected. A recent paper proposes ...

3389 sym 2 img

Empirical software engineering is five years old

31.03.2011

Science and engineering are built on theoretical models that are tested against measurements of ‘reality’. Until around 10 years ago there was very little software engineering ‘reality’ publicly available; companies rarely made source available and were generally unforthcoming about any bugs that had been discovered. What happened aroun...

3433 sym 2 img

Quality comparison of floating-point maths libraries

10.04.2011

What is the best way to compare the quality of floating-point math libraries (e.g., sin, cos and log)? The traditional approach for evaluating the quality of an algorithm implementing a mathematical function is based on mathematics; methods have been developed to calculate the maximum error between the calculated and the actual value. The answe...

3904 sym R (687 sym/4 pcs) 6 img

Unused function parameters

08.05.2011

I have started redoing the source code measurements that appear in my C book, this time using a lot more source, upgraded versions of existing tools, plus some new tools such as Coccinelle and R. The intent is to make the code and data available in a form that is easy for others to use (I am hoping that one or more people will measure the same c...

4595 sym 20 img

Searching for inaccurate literals in R

29.05.2011

In creating the numbers tool I wanted to be able to do two things: 1) obtain information about what source did by matching the numeric literals it contained against a database of ‘interesting’ values (now with over 14,000 entries), and 2) flag possibly incorrect numeric literals (e.g., 3.1459265 when 3.14159265 had been intended in core/Helix....

5176 sym R (910 sym/1 pcs) 2 img
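
The second goal in the excerpt above, flagging a literal such as 3.1459265 that is probably a mistyped 3.14159265, can be illustrated with a minimal sketch. This is not the numbers tool's actual method. It assumes a simple edit-distance comparison on the literal's text. The names `KNOWN_VALUES`, `edit_distance` and `flag_suspect` are hypothetical, and the two-entry dictionary stands in for the real database.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical stand-in for the database of 'interesting' values.
KNOWN_VALUES = {"3.14159265": "pi", "2.71828183": "e"}


def flag_suspect(literal: str, max_edits: int = 2):
    """Return (name, known_literal) pairs that the literal nearly, but not exactly, matches."""
    if literal in KNOWN_VALUES:
        return []  # exact match: nothing suspicious
    return [(name, known)
            for known, name in KNOWN_VALUES.items()
            if edit_distance(literal, known) <= max_edits]


print(flag_suspect("3.1459265"))
```

Comparing text rather than numeric value is a deliberate choice in this sketch: a dropped or transposed digit can change the value by more than any sensible numeric tolerance while leaving the text only one or two edits away from the intended literal.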

Halstead’s metrics and flat-Earthers are still with us

18.08.2011

I recently discovered a fascinating series of technical reports from the 1970s in the Purdue University e-Pubs archive that shine a surprising light on what are now known as the Halstead metrics. The first surprises came from Halstead’s A Software Physics Analysis of Akiyama’s Debugging Data; surprising in the size of the data set used (nine ...

4396 sym 2 img

Learning R as a language

29.11.2011

Books written to teach a general-purpose programming language are usually organized according to the features of the language, and examples often show how a particular language feature is interpreted by a compiler. Books about domain-specific languages are usually organized in a way that makes sense in the corresponding application domain and exa...

3318 sym 2 img

Initial impressions of RangeLab

30.12.2011

I was rummaging around in the source of R looking for trouble, as one does, when I came across what I believed to be a less than optimally accurate floating-point algorithm (function R_pos_di in src/main/arithmetic.c). Analyzing the accuracy of floating-point code is notoriously difficult and those having the required skills tend to concentrate ...

3097 sym R (748 sym/4 pcs) 2 img