Publications by Derek-Jones

Parsing R code: Freedom of expression is not always a good idea

28.02.2012

With my growing interest in R it was inevitable that I would end up writing a parser for it. The fact that the language is relatively small (the add-on packages do the serious work) hastened the event because it did not look like much work; famous last words. I knew about R’s design and implementation being strongly influenced by the world vi...

9037 sym R (557 sym/2 pcs) 4 img

Go faster R for Google’s summer of code 2012

28.03.2012

The R Foundation has been accepted for Google’s summer of code and I thought I would suggest a few ideas for projects. My interests are in optimization and source code analysis, so obviously the suggestions involve these topics. There are an infinite number of possible optimizations that can be applied to code (well, at least more than the num...

4942 sym

An academic programming language paper about R

27.04.2012

The R language has passed another milestone: a paper aimed at the academic programming language community (or at least one section of this community) has been written about it, Evaluating the Design of the R Language by Morandat, Hill, Osvald and Vitek. Hardly earth-shattering news, but it may have some impact on how R is viewed by nonusers of t...

6009 sym 2 img

Incompetence borne of excessive cleverness

29.04.2012

I have just got back from the 24 hour Data Science Global Hackathon; I was an on-site participant at Hub Westminster in London (thanks to Carlos and his team for doing such a great job looking after us all {around 50 turned up from the 100 who registered; the percentage was similar in other cities around the world}). Participants had to be regis...

7293 sym

EU rules that computer languages cannot be copyrighted

02.05.2012

The European Court of Justice has published its decision in SAS v WPL; the title of the press release says it all: “The functionality of a computer program and the programming language cannot be protected by copyright”. To summarise the background, World Programming Ltd developed a system that was capable of emulating the input/output behavio...

3207 sym 2 img

Background to my book project “Empirical Software Engineering with R”

21.06.2012

This post provides background information that can be referenced by future posts. For the last 18 months I have been working in fits and starts on a book that has the working title “Empirical Software Engineering with R”. The idea is to provide broad coverage of software engineering issues from an empirical perspective (i.e., the discussion ...

3364 sym

Impact of hardware characteristics on detectable fault behavior

29.06.2012

Preface. This is the first of what I hope will be many posts analysing experimental data that will eventually end up in my empirical software engineering with R book (this experiment was chosen because it happens to be the one I am currently working on; having just switched to using Asciidoc I have a backlog of editing to do on previously writt...

10089 sym R (101 sym/1 pcs) 2 img 1 tbl

Success does not require understanding

23.07.2012

I took part in the second Data Science London Hackathon last weekend (also my second hackathon) and it was a very different experience compared to the first hackathon. Once again Carlos and his team really looked after us. The data was released 24 hours before the competition started and even though I had spent less than half an hour looking at...

6599 sym

My no loops in R hair shirt

27.07.2012

Being professionally involved with analyzing source code, I get to work with a much larger number of programming languages than most people. There is a huge difference between knowing the intricate details of the semantics of a language and being able to fluently program in a language like a native developer. There are languages whose semantics I ...

4446 sym R (920 sym/4 pcs)

Descriptive statistics of some Agile feature characteristics

02.09.2012

The purpose of software engineering research is to figure out how software development works so that the software industry can improve its quality/timeliness (i.e., lower costs and improved customer satisfaction). Research is hampered by the fact that companies are not usually willing to make public good quality data about the details of their s...

24882 sym R (1475 sym/9 pcs) 32 img 11 tbl