Publications by Florian Teschner

Analyzing Job Postings; A Cross Country Comparison

28.12.2015

While writing my first post about the German data scientist job market, I was surprised to find so few open positions. The scripts are easy to extend to a cross-country comparison. To simplify the analysis, this time I focused on the Indeed website only. As a baseline, I scanned the websites for the number of open positions which contain ...

2259 sym R (123 sym/2 pcs) 6 img

Image Recognition and Face Detection

30.12.2015

Image recognition and face detection have been around for some years. However, usage and adoption were limited by quality and ease of development. With the release of Microsoft’s Project Oxford, the accessibility of such tools has massively improved. Their simple-to-use REST API provides an excellent opportunity for the average developer to...

1925 sym 4 img

Who is going down? Bundesliga Betting Odds

08.01.2016

An essential part of the typical office talk in Germany is about soccer and the Bundesliga. One of the current key questions is: which team will be relegated? The two local teams (SV Darmstadt and Eintracht Frankfurt) are (hot) candidates. While I love the banter, let’s be data-driven and have a close look at the current odds. I wrote a small R...

2309 sym 4 img

Artificial Intelligence in the News?

16.01.2016

In a previous post, I expressed the feeling that artificial intelligence, data science and big data are currently “hot” in the news. One of my favorite news outlets (“Die Zeit”) has an open data policy in the sense that they have a public API. I thought it would be worth checking my feeling. I should mention that “Die Zeit” is not very t...

1551 sym 4 img

On Panel Sizes

22.01.2016

With elections coming up in the US and in Germany, polling is big news. One thing that strikes me as sorely missing in the debate is how inaccurate a single poll is. Moreover, one never reads about the uncertainty around a single poll. What is the range of expected outcomes or at least a confidence interval? I am by no means an expert...

4391 sym R (167 sym/6 pcs) 6 img

Revisiting Data-driven Marketing

27.02.2016

One of the key trends in the advertising industry is (digital) data-driven marketing. The whole thing starts with massive, passive data collection. No matter which website we visit or which app we use, we leave a digital footprint. These footprints are compiled for individual users and form so-called user profiles. A set of similar profiles is th...

3191 sym 2 img 1 tbl

Revisiting Data-driven Marketing, part II

01.03.2016

In the last post, I discussed how the current digital measurement approach is biased towards targeted ad buying. The key reason is that ad effectiveness is calculated on a cost per order/conversion basis. As particular user segments (which are addressed with digital targeting) have a high base purchase probability, the segment looks more responsi...

2286 sym 2 img

Revisiting Data-driven Marketing, part III

06.03.2016

In the last two posts (1, 2), I tried to discuss how measurement and false metrics drive optimization towards low-hanging fruit and in the end degrade ad effectiveness. I would like to follow up with a short example of how the issue extends into the paid search (e.g. Google AdWords) channel. Search traffic is split up into two parts: the organic, ...

3573 sym 4 img

Web-Scraping JavaScript rendered Sites

26.03.2016

Gathering data from the web is one of the key tasks in generating easy data-driven insights into various topics. Thanks to the fantastic rvest R package, web scraping is pretty straightforward. It basically works like this: go to a website, find the right items using SelectorGadget, and plug the element path into your R code. There are...

2799 sym

From Image Recognition to Brand Logo Detection

27.03.2016

I previously wrote a short review of Microsoft’s image recognition and face detection API. A couple of weeks ago, Google announced their Vision API, which provides some similar features. Even though there is no R package or code to dive into this API and their API documentation is rather sparse, I thought it could be fun and inspiring to give it a try. ...

2932 sym 4 img 2 tbl