Publications by Florian Teschner
Transfer Learning with augmented Data for Logo Detection
Over the last few months, I have worked on brand logo detection in R with Keras, starting with a model built from scratch, then adding more data and using a pretrained model. The goal is to build a (deep) neural net that is able to identify brand logos in images. Just to recall, the dataset is a combination of the Flickr27-dataset, with 270 images of 27 classes and...
2855 sym R (2959 sym/1 pcs) 2 img
IMDB Genre Classification using Deep Learning
The Internet Movie Database (IMDb) is a great source of information about movies. Keras provides access to part of the cleaned dataset (e.g. for sentiment classification). While sentiment classification is an interesting topic, I wanted to see if it is possible to identify a movie’s genre from its description. The image illustrates the...
2624 sym R (2027 sym/2 pcs) 4 img
Deep learning and the German Data Science Job Market
Almost 2 years ago, I wrote a short post on the German data science market by analysing open positions on the key job platforms: Stepstone, Monster and Indeed. Recently I received requests to update the post with fresh data. So here it comes. To add a slightly different spin, I thought it would be interesting to see how widespread and prevalent th...
2586 sym 8 img
Content Evaluation: What is the Value of Social Media?
Like most bloggers, I check my analytics stats on a regular basis. What I do not really look at is social shares. For most blog posts, traffic follows a clear pattern: 2 days of increased traffic, followed by a steady decrease to the base traffic volume. The amplitude varies massively depending on how interesting/viral the post was. However, in ...
2810 sym R (438 sym/2 pcs) 4 img 1 tbl
Wrapping Access to Web-Services in R-functions.
One of the great features of R is the possibility to quickly access web-services. While some companies make it a habit and policy to document their APIs, there is still a large chunk of undocumented but great web-services that help the regular data scientist. In the following short post, I will show how we can turn a simple web-service into a nice R...
2184 sym R (3129 sym/4 pcs) 2 img
Exploring Embeddings for Categorical Variables with Keras
In order to stay up to date, I try to follow Jeremy Howard on a regular basis. In one of his recent videos, he shows how to use embeddings for categorical variables (e.g. weekdays). First off: what are embeddings? An embedding is a mapping of a categorical vector into a continuous n-dimensional space. The idea is to represent a categorical represen...
3492 sym R (2340 sym/4 pcs) 4 img
Concatenate Embeddings for Categorical Variables with Keras
In my last post, I explored how to use embeddings to represent categorical variables. Furthermore, I showed how to extract the embedding weights to use them in another model. While the concept of embedding representation has been used in NLP for quite some time, the idea to represent categorical variables with embeddings appeared just recently ...
3733 sym R (5177 sym/5 pcs) 2 img
tfestimators – Package: Embeddings for Categorical Variables
In my last posts (here and here) I explored how to use embeddings to represent categorical variables. Furthermore, I showed how to represent categorical variables with embeddings and add other variables to create a more complex model. Both posts focused on the Keras (R) functionality. I concluded that it feels artificial to represent categorical v...
3165 sym R (2353 sym/4 pcs) 2 img
Cluster Analysis – Naming Pattern in the last Century.
A while back I got interested in (baby) first names. It is interesting to follow how friends name their newborns. There are various naming “strategies”: a) pick a name that has been used in previous generations but is now uncommon (e.g. in Germany that would be Oskar), b) pick a biblical name that never ages such as Johannes or David, or c) use ...
4100 sym R (1490 sym/8 pcs) 8 img