Publications by Rolf Fredheim

Experiments in python and d3 from R: GDELT made easy

29.04.2013

Originally posted on the author's blog, Quantifying Memory.

400 sym

Fun simulating Wimbledon in R and Python

04.07.2013

R and Python have different strengths. There’s little you can do in R you absolutely can’t do in Python and vice versa, but there’s a lot of stuff that’s really annoying in one and nice and simple in the other. I’m sure simulations can be run in R, but it seems frightfully tricky. Recently I wrote a simple tennis simulator in Python, ...

4532 sym R (2210 sym/5 pcs) 8 img 1 tbl
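The simulator from the Wimbledon post is not reproduced in this listing. As a rough illustration of the idea only, here is a minimal Python sketch that simulates a single game of tennis. The function names and the assumption that every point is won by the server with a fixed, independent probability are hypothetical and are not taken from the post.

```python
import random


def play_game(p_server, rng):
    """Simulate one game; return True if the server wins it.

    Assumes each point is won by the server with probability p_server,
    independently of all other points (a simplification).
    """
    server, receiver = 0, 0
    while True:
        if rng.random() < p_server:
            server += 1
        else:
            receiver += 1
        # A game is won with at least four points and a two-point lead.
        if server >= 4 and server - receiver >= 2:
            return True
        if receiver >= 4 and receiver - server >= 2:
            return False


def estimate_hold_rate(p_server, n_games=10_000, seed=1):
    """Estimate how often the server wins a game, by repeated simulation."""
    rng = random.Random(seed)
    wins = sum(play_game(p_server, rng) for _ in range(n_games))
    return wins / n_games
```

Calling `estimate_hold_rate(0.6)` gives the simulated share of games won by a server who wins 60% of points. Sets and matches could be built by wrapping `play_game` in further loops.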

Scaling up text processing and Shutting up R: Topic modelling and MALLET

29.10.2013

In this post I show how a combination of MALLET, Python, and data.table means we can analyse quite Big data in R, even though R itself buckles when confronted by textual data. Topic modelling is great fun. Using topic modelling I have been able to separate articles about the ‘Kremlin’ as a) a building, b) an international actor, c) the advers...

6473 sym 4 img

Databases for text analysis: archive and access texts using SQL

07.11.2013

This post is a collection of scripts I’ve found useful for integrating a SQL database into more complex applications. SQL allows quickish access to largish repositories of text (I wrote about this at some length here), and is a good starting point for taking textual analysis beyond thousands of texts. I timed Python to be thirteen t...

5824 sym
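The scripts from the database post are not included in this listing. As a hedged sketch of the general pattern (archive texts in a SQL table, then pull back only the rows you need), the following uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and sample rows are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; a real archive
# would point at a file or a database server instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, source TEXT, body TEXT)"
)

# Hypothetical sample texts.
rows = [
    ("paper_a", "The Kremlin issued a statement on trade."),
    ("paper_b", "Tourists queued outside the Kremlin walls."),
    ("paper_a", "Wimbledon begins next week."),
]
conn.executemany("INSERT INTO articles (source, body) VALUES (?, ?)", rows)
conn.commit()


def search(conn, term):
    """Return (id, source, body) rows whose body contains the search term."""
    cur = conn.execute(
        "SELECT id, source, body FROM articles WHERE body LIKE ?",
        (f"%{term}%",),
    )
    return cur.fetchall()
```

Passing the search term as a bound parameter, rather than pasting it into the SQL string, avoids quoting problems when the term itself contains apostrophes.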

Visualising Structure in Topic Models

11.11.2013

How exactly should we visualise topic models to get an overview of how topics relate to each other? This post is a brief lit review of that debate – I realise the subject matter is sooo last year. I also present my chosen solution to the dilemma: I use dendrograms to position topics, and add a network visualisation using an arcplot to expose lin...

12823 sym 10 img

Plugging hierarchical data from R into d3

20.11.2013

Here I show how to convert tabulated data into a JSON format that can be used in d3 graphics. The motivation for this was an attempt at getting an overview of topic models (link). Illustrations like the one to the right are very attractive; my motivation to learn how to make them was that the radial layout sometimes saves a lot of space – in my...

7179 sym R (444 sym/1 pcs) 6 img
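The post above does this conversion in R, and its code is not shown in this listing. Purely as an illustration of the idea, here is a Python sketch that turns flat rows into nested JSON. It assumes each row lists the levels of the hierarchy followed by a numeric value, and it uses the "name" / "children" / "size" keys familiar from d3's hierarchical example data. The function name and the sample rows are hypothetical.

```python
import json


def rows_to_tree(rows, root_name="root"):
    """Nest flat (level_1, ..., level_n, value) rows into a tree of dicts."""
    root = {"name": root_name, "children": []}
    for row in rows:
        *path, value = row
        node = root
        for label in path:
            children = node.setdefault("children", [])
            # Reuse the child with this label if it exists, else create it.
            match = next((c for c in children if c["name"] == label), None)
            if match is None:
                match = {"name": label}
                children.append(match)
            node = match
        # The last label on the path is a leaf and carries the value.
        node["size"] = value
    return root


# Hypothetical tabulated data: category, topic, number of articles.
rows = [
    ("politics", "kremlin", 12),
    ("politics", "elections", 7),
    ("sport", "tennis", 5),
]
tree = rows_to_tree(rows)
json_text = json.dumps(tree, indent=2)
```

The resulting `json_text` can be written to a file and loaded by a d3 script. Whether the key names need changing depends on the particular d3 layout used.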

Web-Scraping: the Basics

19.02.2014

Slides from the first session of my course about web scraping through R: Web scraping for the humanities and social sciences. Includes an introduction to the paste function, working with URLs, functions and loops. Putting it all together we fetch data in JSON format about Wikipedia page views from http://stats.grok.se/ Solutions here: Do...

794 sym

Web Scraping part2: Digging deeper

25.02.2014

Slides from the second web scraping through R session: Web scraping for the humanities and social sciences. In which we make sure we are comfortable with functions, before looking at XPath queries to download data from newspaper articles. Examples including BBC news and Guardian comments. Download the .Rpres file to use in RStudio here. A r...

804 sym

Web Scraping: Scaling up Digital Data Collection

05.03.2014

The latest slides from web scraping through R: Web scraping for the humanities and social sciences. Slides from the first session here. Slides from the second session here. This week we look in greater detail at scaling up digital data-collection: coercing scraper output into dataframes, how to download files (along with a cursory look at t...

991 sym

Web Scraping: working with APIs

12.03.2014

APIs present researchers with a diverse set of data sources through a standardised access mechanism: send a pasted-together HTTP request, receive JSON or XML in return. Today we tap into a range of APIs to get comfortable sending queries and processing responses. These are the slides from the final class in Web Scraping through R: Web...

1801 sym
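The class above works in R, and its slides are not part of this listing. As a small illustration of the "paste together a request, parse the JSON reply" pattern, here is a Python sketch using only the standard library. The endpoint URL, the query parameters, and the shape of the sample response are all invented. No request is actually sent; the reply is a hard-coded string standing in for what an API might return.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, **params):
    """Paste a base URL and query parameters into one request URL."""
    return base_url + "?" + urlencode(params)


def extract_titles(json_text):
    """Pull the 'title' field out of each item in a JSON response."""
    payload = json.loads(json_text)
    return [item["title"] for item in payload["results"]]


# Hypothetical endpoint and parameters.
url = build_query_url("https://api.example.org/search", q="web scraping", page=1)

# Stand-in for the body of an HTTP response from that endpoint.
sample_response = '{"results": [{"title": "First hit"}, {"title": "Second hit"}]}'
titles = extract_titles(sample_response)
```

Using `urlencode` rather than plain string concatenation takes care of escaping spaces and other special characters in the query.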