Publications by Econometrics and Free Software

Getting data from pdfs using the pdftools package

09.06.2018

Data is often trapped inside PDFs, but thankfully there are ways to extract it. A very nice package for this task is pdftools (GitHub link), and this blog post will describe some basic functionality from that package. First, let’s find some PDFs that contain interesting data. For this post, I’m using the diabe...

4574 sym R (14721 sym/14 pcs) 6 img

Forecasting my weight with R

23.06.2018

I’ve been measuring my weight almost daily for almost 2 years now; I actually started earlier, but not as consistently. The goal of this blog post is to get re-acquainted with time series; I haven’t had the opportunity to work with time series for a long time now and I have seen that quite a few packages that deal with time series have been ...

6508 sym R (9948 sym/30 pcs) 8 img

Missing data imputation and instrumental variables regression: the tidy approach

30.06.2018

In this blog post I will discuss missing data imputation and instrumental variables regression. This is based on a short presentation I will give at my job. You can find the data used here on this website: http://eclr.humanities.manchester.ac.uk/index.php/IV_in_R The data used is from Wooldridge’s book, Introductory Econometrics: A Modern Approach. You c...

7156 sym R (8573 sym/24 pcs) 8 img

Dealing with heteroskedasticity; regression with robust standard errors using R

07.07.2018

First of all, is it heteroskedasticity or heteroscedasticity? According to McCulloch (1985), heteroskedasticity is the proper spelling, because when transliterating Greek words, scientists use the Latin letter k in place of the Greek letter κ (kappa). κ sometimes is transliterated as the Latin letter c, but only when these words entered the Eng...

5619 sym R (12863 sym/21 pcs) 6 img

The year of the GNU+Linux desktop is upon us: using user ratings of Steam Play compatibility to play around with regex and the tidyverse

07.09.2018

I’ve been using GNU+Linux distros for about 10 years now, and settled on openSUSE as my main operating system around 3 years ago, perhaps even more. If you’re a gamer, you might have heard about SteamOS and how more and more games are available on GNU+Linux. I don’t really care about games; I play the occasional one (currently Tangled...

3577 sym R (7919 sym/17 pcs) 2 img

Going from a human readable Excel file to a machine-readable csv with {tidyxl}

10.09.2018

I won’t write a very long introduction; we all know that Excel is ubiquitous in business, and that it has a lot of very nice features, especially for business practitioners that do not know any programming. However, when people use Excel for purposes it was not designed for, it can be a hassle. Often, people use Excel as a reporting tool, which...

6087 sym R (3362 sym/10 pcs) 6 img

How Luxembourguish residents spend their time: a small {flexdashboard} demo using the Time use survey data

13.09.2018

In a previous blog post I showed how you could use the {tidyxl} package to go from a human readable Excel Workbook to a tidy data set (or flat file, as they are also called). Some people then contributed their solutions, which is always something I really enjoy when it happens. This way, I also get to learn things! @expersso proposed a solut...

2736 sym R (4460 sym/3 pcs) 2 img

Exporting editable plots from R to Excel: making ggplot2 purrr with officer

04.10.2018

I was recently confronted with the following problem: creating hundreds of plots that could still be edited by our client. What this meant was that I needed to export the graphs to Excel or Powerpoint or some other such tool that was familiar to the client, and not export the plots directly to pdf or png as I would normally do. I still wanted to us...

3174 sym R (3029 sym/10 pcs) 6 img

Getting the data from the Luxembourguish elections out of Excel

20.10.2018

In this blog post, similar to a previous blog post, I am going to show you how we can go from an Excel workbook that contains data to a flat file. I will take advantage of the structure of the tables inside the Excel sheets by writing a function that extracts the tables and then mapping it to each sheet! Last week, October 14th, Luxembourguish nat...

8203 sym R (10026 sym/22 pcs) 6 img

Maps with pie charts on top of each administrative division: an example with Luxembourg’s elections data

26.10.2018

Abstract: You can find the data used in this blog post here: https://github.com/b-rodrigues/elections_lux This is a follow-up to a previous blog post where I extracted data of the 2018 Luxembourguish elections from Excel Workbooks. Now that I have the data, I will create a map of Luxembourg by commune, with pie charts of the results on top of each...

5686 sym R (12639 sym/30 pcs) 24 img