Publications by Econometrics and Free Software

Pivoting data frames just got easier thanks to `pivot_wide()` and `pivot_long()`

19.03.2019

There’s a lot going on in the development version of {tidyr}. New functions for pivoting data frames, pivot_wide() and pivot_long(), are coming and will replace the current functions, spread() and gather(). spread() and gather() will remain in the package, though: You may have heard a rumour that gather/spread are going away. This is simply not ...

4380 sym R (14127 sym/16 pcs) 4 img

Get text from pdfs or images using OCR: a tutorial with {tesseract} and {magick}

30.03.2019

In this blog post I’m going to show you how you can extract text from scanned pdf files, or pdf files where no text recognition was performed. (For pdfs where text recognition was performed, you can read my other blog post). The pdf I’m going to use can be downloaded from here. It’s a poem titled D’Léierchen (Dem Léiweckerche säi Lidd...

4692 sym R (4364 sym/15 pcs) 12 img

Historical newspaper scraping with {tesseract} and R

06.04.2019

I have been playing around with historical newspaper data for some months now. The “obvious” type of analysis to do is NLP, but there is also a lot of numerical data inside historical newspapers. For instance, you can find these tables that show the market prices of the day in L’Indépendance Luxembourgeoise: I wanted to see how easy ...

7506 sym R (14266 sym/17 pcs) 28 img

Fast food, causality and R packages, part 1

27.04.2019

I am currently working on a package for the R programming language; its initial goal was to simply distribute the data used in the Card and Krueger 1994 paper that you can read here (PDF warning). The gist of the paper is to try to answer the following question: Do increases in minimum wages reduce employment? According to Card and Krueger’s pa...

5836 sym R (1786 sym/10 pcs) 12 img

Fast food, causality and R packages, part 2

03.05.2019

I am currently working on a package for the R programming language; its initial goal was to simply distribute the data used in the Card and Krueger 1994 paper that you can read here (PDF warning). However, I decided that I would add code to perform diff-in-diff. In my previous blog post I showed how to set up the structure of your new package. In...

5066 sym R (7978 sym/13 pcs) 6 img
