Publications by tlfvincent

How Netflix reverse-engineered Hollywood

14.01.2014

I was recently made aware of this great article by Alexis C. Madrigal, senior editor at The Atlantic. Although the underlying analysis he performs is relatively straightforward, what impresses me more is that he actually thought of doing it…and went through with it! It is relatively long for an article but well worth reading, in my opinion. http://ww...

975 sym 4 img

Reading files in JSON format – a comparison between R and Python

18.01.2014

A file format that I am seeing more and more often is the JSON (JavaScript Object Notation) format. JSON is an open standard format in human-readable form that is used to transmit data between servers and web applications. Below is a typical example of data in JSON format. {"votes":  {   "funny": 0,   "useful": 7,   "cool": 0  },  "us...

1506 sym R (1082 sym/2 pcs) 4 img
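
For the JSON entry above, here is a minimal Python sketch of reading records of that shape with the standard `json` module. The `votes` object mirrors the part of the sample that is visible in the excerpt; the `stars` field, the second record, and the helper name `parse_records` are hypothetical, added only for illustration, and this is not necessarily the approach the post itself takes.

```python
import json

# Hypothetical input: one JSON object per line. Only the "votes" object of
# the first record comes from the excerpt; everything else is invented.
raw_lines = [
    '{"votes": {"funny": 0, "useful": 7, "cool": 0}, "stars": 4}',
    '{"votes": {"funny": 2, "useful": 1, "cool": 3}, "stars": 5}',
]


def parse_records(lines):
    """Parse one JSON object per line, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


records = parse_records(raw_lines)

# Nested JSON objects become nested dicts, so fields are reached by key.
useful_total = sum(record["votes"]["useful"] for record in records)
print(useful_total)
```

Parsing line by line like this assumes each line holds a complete object; a file containing a single JSON array would instead be read in one call with `json.load`.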

Mapping the taste profile of Scottish whiskeys

26.01.2014

Recently, I came across this interesting blog post http://blog.revolutionanalytics.com/2013/12/k-means-clustering-86-single-malt-scotch-whiskies.html by Luba Gloukhov on the Revolutions blog. This post initially caught my attention because of the originality of the dataset: 86 Scottish whiskeys rated on a scale of 0-4 in 12 different taste...

2013 sym 12 img

Restaurant Inspection Results

03.02.2014

Living in NYC is not good for one’s cooking skills. There are just too many mouth-watering options out there that always convince me to eat out rather than stand in line for two hours at Trader Joe’s. Also, this means that my fridge always has room for life essentials such as beer, sriracha (aka the juice of gods) and liquor mixers. Crazy lines...

2183 sym 12 img 1 tbl

World tourism and country expenditure

13.02.2014

I’ve recently come across the https://www.undata-api.org/ website, which makes available all the great data that has been gathered by the UN. There are literally a thousand different datasets one could analyze, and I intend to do just that, but for some reason I opted to look at some of the world tourism data they have collected. Perhaps thi...

4046 sym R (56 sym/1 pcs) 10 img

New York crime rates

23.02.2014

While browsing through different sites, I randomly came across the ominous-sounding disaster center website. There is a fair amount of data that could be analyzed there, but my attention was caught by an entry stating that they had just updated the “1965 to 2012 State Crime Pages”. From there, I chose the completely biased option of analyzing...

1435 sym 6 img

Using APIs in Python: a quick example

04.03.2014

Python has an extremely intuitive and straightforward way of dealing with APIs, and makes it simple for people like you or me to access and retrieve information from databases. Before I quickly describe how to use APIs in Python, maybe we should begin with: What is an API? API (Application Programming Interface): An API is a software intermedi...

2062 sym Python (664 sym/1 pcs) 4 img
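
As a rough illustration of the API entry above, the sketch below uses only the Python standard library to build a query URL and decode a JSON response. The endpoint `https://api.example.com/search`, the parameter names, and the helper names `build_url` and `fetch_json` are all hypothetical; the post may well use a different library or service.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base endpoint."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body into Python objects."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and parameters, shown only to illustrate the flow.
url = build_url("https://api.example.com/search", {"q": "gun", "page": 1})
print(url)

# The request itself is left commented out because the endpoint is made up:
# data = fetch_json(url)
```

Keeping URL construction separate from the network call makes the former easy to check without touching a real service.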

Dynamic arrays in R, Python and Julia

18.03.2014

Although I have a heavy background in statistics (and therefore am primarily an R user), I find that my overall knowledge of computer science is generally lacking. So I have recently delved deeper into learning about data structures, their associated ADTs and how they can be implemented. At the same time, I am using this as an opportu...

3401 sym R (396 sym/3 pcs) 8 img
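
To make the dynamic-array idea in the entry above concrete, here is a minimal Python sketch of an append-only array that doubles its capacity when full. The class name `DynamicArray`, the starting capacity of 1, and the doubling factor are assumptions chosen for illustration, not a reproduction of the R, Python or Julia code in the post.

```python
class DynamicArray:
    """Append-only array that doubles its backing storage when full."""

    def __init__(self):
        self._capacity = 1                    # slots currently allocated
        self._size = 0                        # slots actually in use
        self._data = [None] * self._capacity  # fixed-size backing store

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._data[index]

    def append(self, value):
        # Grow before writing if every allocated slot is already used.
        if self._size == self._capacity:
            self._resize(2 * self._capacity)
        self._data[self._size] = value
        self._size += 1

    def _resize(self, new_capacity):
        # Allocate a larger store and copy the existing elements across.
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity


arr = DynamicArray()
for i in range(10):
    arr.append(i)
print(len(arr), arr[9])
```

Doubling means the occasional copy is paid for by the many cheap appends that precede it, so appends run in amortized constant time.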

Which states are the most concerned by gun crime?

24.03.2014

I recently discovered the Capitol Words API and have had some fun playing around with it. One of the categories in the API allows you to search for the words spoken by the senators of each state in the USA, and I was interested in finding out the number of times the word “gun” was recorded on a state bill between January 2012 and Decembe...

2637 sym 12 img

President Approval Ratings from Roosevelt to Obama

29.03.2014

I have been watching the awesome Netflix show “House of Cards” and have been fascinated by the devious schemes that Underwood is constantly plotting. The show often mentions approval ratings, and it got me wondering what Obama’s ratings currently were, and those of all past US presidents for that matter. However, I didn’t have much chance find...

1928 sym 8 img