Publications by Tony Breyal

Code Optimization: One R Problem, Ten Solutions – Now Eleven!

02.11.2011

Earlier this year I came across a rather interesting page about optimisation in R from rwiki. The goal was to find the most efficient code to produce strings which follow the pattern below given a single integer input n: # n = 4 [1] "i001.002" "i001.003" "i001.004" "i002.003" "i002.004" "i003.004" # n = 5 [1] "i001.002" "i001.003" "i001.004" "i0...
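The pattern in the excerpt above can be reproduced with a short base R sketch. This is only an illustration and not necessarily one of the solutions compared in the post: the function name `generate_pairs` is hypothetical, and it assumes `n` is a single integer of at least 2 and small enough for three-digit zero padding.

```r
# Hypothetical helper: build "iAAA.BBB" labels for every pair A < B in 1..n.
generate_pairs <- function(n) {
  stopifnot(length(n) == 1, n >= 2)
  # combn() returns a 2-row matrix; each column is one pair (A, B) with A < B
  idx <- combn(n, 2)
  # zero-pad both members of each pair to three digits
  sprintf("i%03d.%03d", idx[1, ], idx[2, ])
}

generate_pairs(4)
# [1] "i001.002" "i001.003" "i001.004" "i002.003" "i002.004" "i003.004"
```

Both `combn()` and `sprintf()` are vectorised base R functions, so no explicit loop is needed. Whether this is the fastest approach is exactly the kind of question the post sets out to benchmark.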

3289 sym R (2842 sym/6 pcs) 22 img

Code Optimization: One R Problem, Eleven Solutions – Now Thirteen!

07.11.2011

Following up from my previous post “Code Optimisation: One R Problem, Ten Solutions – Now Eleven!”, I figured out a twelfth solution after writing that blog post. Furthermore, halfway through writing this blog post I figured out a thirteenth solution too. As a recap, the problem is taken from rwiki where the goal is to find the most effic...

2489 sym R (2428 sym/6 pcs) 20 img

Web Scraping Google URLs

07.11.2011

Google slightly changed the html code it uses for hyperlinks on search pages last Thursday, thus causing one of my scripts to stop working. Thankfully, this is easily solved in R thanks to the XML package and the power and simplicity of XPath expressions: # load packages library(RCurl) library(XML) get_google_page_urls <- function(u) { # read ...

871 sym R (1133 sym/1 pcs) 18 img

Web Scraping Google Scholar (Partial Success)

08.11.2011

I wanted to scrape the information returned by a Google Scholar web search into an R data frame as a quick XPath exercise. The following will successfully extract the ‘title’, ‘url’, ‘publication’ and ‘description’. If any of these fields are not available, as in the case of a citation, the corresponding cell in ...

1231 sym R (3309 sym/3 pcs)

Web Scraping Google Scholar: Part 2 (Complete Success)

08.11.2011

This is a follow-up to a post I uploaded earlier today about web scraping data off Google Scholar. In that post I was frustrated because I’m not smart enough to use xpathSApply to get the kind of results I wanted. However, fast-forward to the evening: whilst having dinner with a friend, she told me, as a passing remark, how she had finally figured ...

1946 sym R (4605 sym/3 pcs) 18 img

Facebook Graph API Explorer with R

10.11.2011

I wanted to play around with the Facebook Graph API using the Graph API Explorer page as a coding exercise. This facility allows one to use the API with a temporary authorisation token. Now, I don’t know how to make an R package for the proper API where you have to register for an API key and do some OAuth stuff because that is above my cu...

2258 sym R (6657 sym/4 pcs) 18 img

Web Scraping Yahoo Search Page via XPath

10.11.2011

Seeing as I’m on a bit of an XPath kick of late, I figured I’d continue scraping search results, but this time from Yahoo.com. Rolling my own version of xpathSApply to handle NULL elements seems to have done the trick and so far it’s been relatively easy to do the scraping. I’ve created an R function which will scrape information from...

2304 sym R (5908 sym/2 pcs) 18 img

Web Scraping Google+ via XPath

11.11.2011

Google+ just opened up to allow brands, groups, and organizations to create their very own public Pages on the site. This didn’t bother me too much, but I’ve been hearing a lot about Google+ lately so figured it might be fun to set up an XPath scraper to extract information from each post of a status update page. I was originally going to do o...

2124 sym R (1095 sym/3 pcs) 18 img

GScholarXScraper: Hacking the GScholarScraper function with XPath

13.11.2011

Kay Cichini recently wrote a word-cloud R function called GScholarScraper on his blog which, when given a search string, will scrape the associated search results returned by Google Scholar, across pages, and then produce a word-cloud visualisation. This was of interest to me because around the same time I posted an independent Google Scholar scr...

5636 sym 20 img

fgui: Automatically Creating Widgets for Arguments of a Function – A Quick Example

16.11.2011

Here’s something I came across by accident: an R package called fgui, which can automatically create a widget just by passing it a function with parameters, e.g.: # load packages require(fgui) # add two numbers together and return the value add <- function(x1, x2) { return(x1 + x2) } # execute function through GUI y <- guiv(a...

758 sym R (180 sym/1 pcs) 6 img