Publications by Karsten W.

Reproducible blogging

10.07.2011

The posts on this fact-based blog very often contain diagrams and data tables. To enable you to reproduce the results and insights, I include the computations as computer code. Most blog posts I write are markdown text combined (or woven) with computer code written in the R language. I created a small package mdtools that puts the tools togethe...

907 sym

Regional differences on what drives CO2 emissions

20.07.2011

If you are investigating changes in CO2 emissions, then you might ask: Where do the changes occur? Well, here is the answer. The staircase plots show the contributing factors to CO2 emissions for each continent. population refers to population effects, gdp_pcap refers to income per capita, energy_intensity refers to energy used per dollar added ...

1164 sym 2 img

Heating costs

28.12.2011

In 2010, my heating costs exceeded my advance payments by about 25%. This motivated me to decompose the costs to see what drove the changes. Here is the result: The numbers refer to euros. Read from right to left: 2010 was a cold year (+102 EUR), but gas consumption in this house was relatively low (–89 EUR). Also, running costs and gas price were ...

1551 sym 2 img

How much is a shower?

29.12.2011

After looking at my heating expenses, I turned to the costs for water heating. For some time, I looked at my water meter before and after taking a shower or a bath. Quite often, I forgot one or the other measurement, but I collected about 40 observations. Here is what they look like: The data suggest that for a shower, it takes between 17 and 26.5...

2657 sym 2 img

Tracking my expenses

08.01.2012

One New Year's resolution I made last year was to understand where my money goes. From previous experiments I know that expense tracking has to be as simple as possible. My approach is to: Use my cash card as often as possible. This automatically tracks the date and some information on the vendor. Use Twitter to track my cash expenses. This supplemen...

1766 sym 2 img

Categorizing my expenses

28.01.2012

In order to analyse my expenses, a classification scheme is necessary. I need to identify categories that are meaningful to me. I decided to go with the “Classification of Individual Consumption by Purpose” (COICOP), for three reasons: It is made by people who have thought more about consumption classification than I ever will. It is feasible ...

1784 sym 2 img

Berlin’s children

04.02.2012

A few years ago, a newspaper claimed that the block I live in, Prenzlauer Berg in Berlin, is the most fertile region in Europe. It was a hoax, as this (German) newspaper article points out. (The article has become quite famous because it coined the term Bionade Biedermeier to describe the lifestyle in this area.) However, there are more children i...

1498 sym 2 img

Working with strings

10.04.2012

R has a lot of string functions; many of them can be found with ls("package:base", pattern="str"). Additionally, there are add-on packages such as stringr, gsubfn and brew that enhance R's string processing capabilities. As a statistical language and environment, R has an edge compared to other programming languages when it comes to text mining alg...

3356 sym R (2131 sym/4 pcs)

A wrapper for R’s data() function

19.06.2012

The workflow for statistical analyses is discussed in several places. Often, it is recommended to: never change the raw data, but transform it; keep your analysis reproducible; separate functions and data; use the R package system as an organizing structure. In some recent projects I tried an S4 class approach for this workflow, which I want to present and ...

974 sym R (987 sym/1 pcs)

Querying DBpedia from R

24.06.2012

DBpedia is an extract of structured information from Wikipedia. The structured data can be retrieved using an SQL-like query language for RDF called SPARQL. There is already an R package for this kind of query, named SPARQL. There is an S4 class Dbpedia, part of my datamart package, that aims to support the creation of predefined parameterized quer...

1152 sym R (733 sym/1 pcs)