Publications by Onno
Introducing ACEA water level challenge
About 2 weeks ago, yes, right around the New Year, I was browsing Kaggle just for fun. It made me remember how much fun it actually is to play around with random data. Not only that, but very often there is a cool purpose too. One of my New Year goals is to have a little bit more fun with data again, so it became a quick 1-2 and I have been diving int...
6580 sym R (3057 sym/3 pcs) 2 img
Predicting blue Gold, ACEA Kaggle challenge
Blog 2: Data preparation and research question. About 2 weeks ago, yes, right around the New Year, I was browsing Kaggle just for fun. It made me remember how much fun it actually is to play around with random data. Not only that, but very often there is a cool purpose too. One of my New Year goals is to have a little bit more fun with data again, so i...
9003 sym R (3054 sym/1 pcs)
Modelling water levels, taking care of hindsight bias with Caret
This is blog 3 of my endeavors for the currently ongoing Kaggle challenge posted by ACEA. A short introduction of the challenge is below. What I am trying to do with these blogs is not to create perfect tutorials that contain perfectly revised code, but rather document my own learning process and findings at any moment. So hopefully you enjoy som...
8031 sym R (4776 sym/2 pcs) 4 img
ACEA Smart Water Analytics Competition; Final Model overview
This is blog 4 of my endeavors for the currently ongoing Kaggle challenge posted by ACEA. A short introduction of the challenge is below. What I am trying to do with these blogs is not to create perfect tutorials that contain perfectly revised code, but rather document my own project process and findings at any moment. This blog shows the ACEA Sm...
6051 sym R (6242 sym/1 pcs)
Code Review Checklist R Code Edition Top 3
Great code review is one of the most underrated skills a Data Scientist can have. In this blog I will share the top 3 items on my code review checklist, specifically for R code. In our data science team we regularly do code review to make sure our code is up to the standard it needs to be. The top 3 elements mentioned in this blog should in my opinion be included i...
7473 sym 2 img
Code Review Example R Code Caret
Today we will go through a practical code review example. I will analyze some code chunks I used during the Kaggle ACEA challenge. The code chunks come from a function I used to run my final model using Caret in R. Code review is a very important step, and I recommend doing it after any project. It helps condense your code and make i...
9183 sym R (8124 sym/6 pcs) 2 img
Designing data driven decision making; Kaggle ColeRidge
There is an interesting challenge running on Kaggle at the moment. It has been designed in cooperation with the Coleridge Initiative (https://coleridgeinitiative.org/). This initiative is established at New York University, and its goal is to facilitate data-driven decision making by governments. In the challenge we get to optimize automated ...
7145 sym R (1904 sym/1 pcs) 4 img