Publications by Florian Teschner
Analyzing Job Postings; A Cross Country Comparison
Writing my first post about the German data scientist job market, I was surprised to find such a low number of open positions. The scripts are easy to extend to a cross-country comparison. To simplify the analysis, this time I focused on the Indeed website only. As a baseline, I scanned the websites for the number of open positions which contain ...
2259 sym R (123 sym/2 pcs) 6 img
Image Recognition and Face Detection
Image recognition and face detection have been around for some years. However, usage and adoption were limited by quality and ease of development. With the release of Microsoft’s Project Oxford, the accessibility of such tools has massively improved. Their simple-to-use REST API provides an excellent opportunity for the average developer to...
1925 sym 4 img
Who is going down? Bundesliga Betting Odds
An essential part of typical office talk in Germany is soccer and the Bundesliga. One of the current key questions is: which team will be relegated? The two local teams (SV Darmstadt and Eintracht Frankfurt) are (hot) candidates. While I love the banter, let’s be data-driven and have a close look at the current odds. I wrote a small R...
2309 sym 4 img
Artificial Intelligence in the News?
In a previous post, I expressed the feeling that artificial intelligence, data science and big data are currently “hot” in the news. One of my favorite news outlets (“Die Zeit”) has an open data policy in the sense that they have a public API. I thought it would be worth checking my feeling. I should mention that “Die Zeit” is not very t...
1551 sym 4 img
On Panel Sizes
With elections coming up in the US and in Germany, polling is big news. One thing that strikes me as sorely missing from the debate is how inaccurate a single poll is. Moreover, one never reads about the uncertainty around a single poll. What is the range of expected outcomes, or at least a confidence interval? I am by no means an expert...
4391 sym R (167 sym/6 pcs) 6 img
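To give a feel for the uncertainty the excerpt above asks about, here is a minimal sketch of the textbook normal-approximation interval for a single poll. It assumes simple random sampling, and it is not taken from the post itself: the function names, the 40% share and the panel size of 1,000 are hypothetical values chosen for illustration.

```python
import math


def poll_margin_of_error(share, sample_size, z=1.96):
    """Half-width of the normal-approximation interval for a polled share.

    share       -- observed proportion, between 0 and 1 (hypothetical input)
    sample_size -- number of respondents, assumed to be a simple random sample
    z           -- normal quantile; 1.96 corresponds to roughly 95% coverage
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    standard_error = math.sqrt(share * (1.0 - share) / sample_size)
    return z * standard_error


def poll_interval(share, sample_size, z=1.96):
    """Lower and upper bound of the interval, clipped to the range [0, 1]."""
    margin = poll_margin_of_error(share, sample_size, z)
    return max(0.0, share - margin), min(1.0, share + margin)


if __name__ == "__main__":
    # Hypothetical poll: a party at 40% among 1,000 respondents.
    low, high = poll_interval(0.40, 1000)
    print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the margin shrinks only with the square root of the panel size, so halving it takes four times as many respondents.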
Revisiting Data-driven Marketing
One of the key trends in the advertising industry is (digital) data-driven marketing. The whole thing starts with massive, passive data collection. No matter which website we visit or which app we use, we leave a digital footprint. These footprints are compiled for individual users and form so-called user profiles. A set of similar profiles is th...
3191 sym 2 img 1 tbl
Revisiting Data-driven Marketing, part II
In the last post, I discussed how the current digital measurement approach is biased towards targeted ad buying. The key reason is that ad effectiveness is calculated on a cost per order/conversion basis. As particular user segments (those addressed with digital targeting) have a high base purchase probability, the segment looks more responsi...
2286 sym 2 img
Revisiting Data-driven Marketing, part III
In the last two posts (1, 2), I discussed how measurement and false metrics drive optimization towards low-hanging fruit and in the end degrade ad effectiveness. I would like to follow up with a short example of how the issue extends into the paid search (e.g. Google AdWords) channel. Search traffic is split up into two parts: the organic, ...
3573 sym 4 img
Web-Scraping JavaScript rendered Sites
Gathering data from the web is one of the key tasks in generating easy data-driven insights into various topics. Thanks to the fantastic rvest R package, web scraping is pretty straightforward. It basically works like this: go to a website, find the right items using the selector gadget and plug the element path into your R code. There are...
2799 sym
From Image Recognition to Brand Logo Detection
I previously wrote a short review of Microsoft’s image recognition and face detection API. A couple of weeks ago, Google announced their vision API, which provides some similar features. Even though there is no R package or code to dive into this API and their API documentation is rather sparse, I thought it could be fun and inspiring to give it a try. ...
2932 sym 4 img 2 tbl