Publications by Sascha W.
Let’s go! (and Disclaimer)
Let’s get that started … Fun stuff with R coming soon! Disclaimer beforehand: The analyses I’ll present are not meant to be taken too seriously in a scientific way. I just wanna show what you can do with R as a programming language, basic statistics and different kinds of visualizations. Also, I’m not going to proofread my posts several time...
Soccer is all about money (?) – Part 1: Getting the Data
Teams with more money always win, right? At least they have a bigger chance of having success in their national championship. Let’s look into that… NOTE: If you are not interested in the details of programming with R or getting data from the internet with it, you may want to skip this post and have a look at Part 2. First, we have to ge...
Soccer is all about money (?) – Part 2: Simple analyses
Alright, now we have all the data we need in one dataframe. To make this code work, I assume you ran the code from Part 1. We need the dataframe big.tab. All results presented here are based on the data from 18/10/2012. You can run the analysis with current data, or I can do it at some point later in the season. Let’s plot some stuff...
Soccer is all about money (?) – Part 3: More plots & analyses
Let’s play around a bit more with the dataset we built in Part 1 of this series. Now we are going to compare data from more championships in Europe. Let’s check out the first divisions from the following countries:
– Germany (1. Bundesliga)
– England (Premier League)
– Spain (Primera División)
– Italy (Serie A)
– France (Leagu...
Going to the Movies…
Today, let us have a look at movies. The Internet Movie Database (IMDb) has some data dumps available on their website. It’s a subset of the information available on the IMDb site, but it’s more than enough. I will spare you my code to convert these data dumps into R dataframes, because the code is boring and complicated (unfortunat...
Josh vs. himself (or: Firefly > all)
For Jan… I’ve got no data for “S.H.I.E.L.D.” 🙁 Maybe, but just maybe, “Firefly” gets the way-too-early-cancelled bonus from the voting community. To leave a comment for the author, please follow the link and comment on their blog: Rcrastinate. R-bloggers.com offers daily e-mail updates about R news and tutorials ...
Creating PDFs and websites with the "knitr" package
Just a quick note: I came across the R package “knitr”, which enables you to generate PDF files by mixing LaTeX and R code in one document. The result looks very nice and is great for creating documentation, manuals and so on. I find knitr much easier to use than the quite popular Sweave (but I guess this has to do with personal pref...
Fun stuff with subtitles or "The Tarantino Threshold"
Fortunately, there is a page called www.opensubtitles.org, where you can get subtitle (.SRT) files for virtually every movie. Now let’s see what we can do with these. SRT files are in plain text format (human readable) and can thus be read quite easily with R. First thing we need is a reading function for an SRT file. This function i...
"The Dude" takes the Tarantino threshold
Just as a quick reply to a friend of mine who suggested testing the swearing capabilities of The Dude: As you can see, “The Big Lebowski” (2.79 % swear words) takes the Tarantino threshold (0.98 %) easily, but it’s no match against “Reservoir Dogs” (3.28 %). Indeed, it’s just below “Pulp Fiction” (2.83 ...
R-bloggers
As long as I can’t find the time to post my newest adventuRes, why don’t you check out the great collection of other R-blogs on the web: www.r-bloggers.com Have fun!