Publications by Paul van der Laken

Why Gordon Shotwell uses R

06.01.2020

This blog post by Gordon Shotwell has come across my Twitter feed a couple of times now, so I thought I’d share it here: blog.shotwell.ca/posts/why_i_use_r In it, Gordon presents his reasons for using R, describing R’s four unique selling points, and outlining a discussion full of perfectly quotable thoughts and opinions. Do have a look at the original ...

3963 sym 2 img

Learn Julia for Data Science

10.02.2020

Most data scientists favor Python as a programming language these days. However, there’s also still a large group of data scientists coming from a statistics, econometrics, or social science background and therefore favoring R, the programming language they learned at university. Now there’s a new kid on the block: Julia. Via Medium Advantages & Disadva...

2152 sym 4 img 1 tbl

Simulating data with Bayesian networks, by Daniel Oehm

11.02.2020

Daniel Oehm wrote this interesting blog about how to simulate realistic data using a Bayesian network. Bayesian networks are a type of probabilistic graphical model that uses Bayesian inference for probability computations. Bayesian networks aim to model conditional dependence, and therefore causation, by representing conditional dependence by ...

3092 sym 6 img

Solutions to working with small sample sizes

10.03.2020

Both in science and business, we often experience difficulties collecting enough data to test our hypotheses, either because target groups are small or hard to access, or because data collection entails prohibitive costs. Such obstacles may result in data sets that are too small for the complexity of the statistical model needed to answer the qu...

1750 sym 2 img

paletteer: Hundreds of color palettes in R

17.03.2020

Looking for just the right colors for your data visualization? I often cover tools to pick color palettes on my website (e.g. here, here, or here) and also host a comprehensive list of color packages in my R programming resources overview. However, paletteer is by far my favorite package for customizing your colors in R! The paletteer package o...

1379 sym R (218 sym/1 pcs) 2 img 1 tbl

How to standardize group colors in data visualizations in R

20.03.2020

One best practice in visualization is to make your color scheme consistent across figures. For instance, if you’re making multiple plots of the same dataset (say, a group of 5 companies), you want each company to have the same, consistent coloring across all these plots. R has some great data visualization capabilities. Particularly the g...

4194 sym R (5374 sym/18 pcs) 30 img

Visualizing decision tree partition and decision boundaries

31.03.2020

Grant McDermott developed this new R package I had thought of: parttree. parttree includes a set of simple functions for visualizing decision tree partitions in R with ggplot2. The package is not yet on CRAN, but can be installed from GitHub using: # install.packages("remotes") remotes::install_github("grantmcdermott/parttree") Using the familiar...

1469 sym R (680 sym/2 pcs) 2 img

Curated Regular Expression Resources

07.04.2020

Regular expressions (often abbreviated to regex) really are a power tool any programmer should know. Regex was and is one of the things I most liked learning, as it provides you with immediate, godlike powers that can speed up your (data science) workflow tenfold. I’ve covered many regex-related topics on this blog already, but thought I’d combine t...

1924 sym 12 img

Simulating and visualizing the Monty Hall problem in Python & R

14.04.2020

I recently visited a data science meetup where one of the speakers spoke about playing out the Monty Hall problem with his kids. The Monty Hall problem is a probability puzzle, based on the American television game show Let’s Make a Deal and its host, Monty Hall: You’re given the choice of three doors. Behind one door sits a prize:...

4729 sym R (12620 sym/4 pcs) 14 img

Free Springer Books during COVID19

24.04.2020

Book publisher Springer just released over 400 book titles that can be downloaded free of charge following the coronavirus outbreak. Here’s the full overview: https://link.springer.com/search?facet-content-type=%22Book%22&package=mat-covid19_textbooks&facet-language=%22En%22&sortOrder=newestFirst&showAll=true Most of these books will normally ...

1756 sym 2 img