Publications by Roel M. Hogervorst
Cleaning up and combining data, a dataset for practice
tldr: I created an open dataset for the explicit practice of data munging. Feel free to use it in assignments, but do mention where you got it from (CC-by-4.0). Also unicorns are awesome. Find the dataset at: https://github.com/RMHogervorst/unicorns_on_unicycles Data munging / cleaning / engineering At work I was working with two Excel files th...
3786 sym
Reading in an epub (ebook) file with the pubcrawl package
In this tutorial I show how to read an epub file (for instance, from the ebook collection on your computer) into R with the pubcrawl package. I will show the reading-in part (one line of code) and some other actions you might want to perform on text files before they are ready for text analysis. After you read in your epub...
7948 sym R (13720 sym/23 pcs) 12 img
Arthur blinked, Ford shrugs, but Zaphod leapt; text as graph
Can we make the computer say something about characters in a book? In this piece I will search for the names of characters and the words around those names in books. What can we learn about a character from text analysis? Of course it’s also just another excuse for me to read the Hitchhiker’s series! I will break down the text into chunks of two...
6619 sym R (17120 sym/19 pcs) 24 img
Make more useless packages!
You should make more useless packages. To be more specific: make packages that are useful to you, but might be useless to others. Because building silly stuff is fun and sets the bar low for you to play and learn. I’m a big fan of Simone Giertz (see all the gifs in this post). Simone is known as the ‘Queen of Shitty Robots’ and has a yout...
4222 sym 6 img
Use `purrr` to feed four cats
In this example we will show you how to go from a ‘for loop’ to purrr. Use this as a cheatsheet when you want to replace your for loops. Imagine having four cats: four real cats who need food, care and love to live a happy life. They are starting to meow, so it’s time to feed them. Our real life al...
5748 sym R (5924 sym/8 pcs) 2 img
interactive ggplot with tooltip using plotly
A quick random R thing that I use a lot, learned recently, and want you to know too. In this post I’ll show you how to make a quick interactive plot with ggplot and plotly, so that values are displayed when you hover your mouse over them. Why would you want this? If you are exploring the data, you want some quick insights into which values are wh...
1346 sym R (5111 sym/4 pcs) 2 img
Tweeting wikidata info
In this explainer I walk you through the steps I took to create a Twitter bot that tweets daily about people who died on that date. I created a script that queries Wikidata, takes that information and creates a sentence. That sentence is then tweeted. For example: A tweet I literally just sent out from the docker container I hope you are has ex...
5526 sym R (4163 sym/5 pcs) 4 img
Running an R script on heroku
In this post I will show you how to run an R script on Heroku every day. This is a continuation of my previous post on tweeting a death from Wikidata. Why would I want to run a script on Heroku? It is extremely simple: you don’t need to spin up a machine in the cloud on AWS, Google, Azure or Nerdalize. You can just run the script and it works....
3066 sym R (322 sym/2 pcs) 4 img
Graphing My Daily Phone Use
How many times do I look at my phone? I set up a small program on my phone to count the screen activations and log them to a file. In this post I show what went wrong and how to plot the results. The data I set up a small program on my phone that counts how many times I use my phone every day (to be specific, it counts the times the screen has bee...
2599 sym R (1365 sym/5 pcs) 4 img
Quick post – detect and fix this ggplot2 antipattern
Recently one of my coworkers showed me a ggplot and although it is not wrong, it is also not ideal. Here is the TL;DR: Whenever you find yourself adding multiple geom_* to show different groups, reshape your data. In software engineering there are things called antipatterns, ways of programming that lead you into potential trouble. This is one ...
2423 sym R (5648 sym/6 pcs) 10 img