Publications by Kay Cichini

Apprentice Piece with Lattice Graphs

28.02.2012

Lattice graphs can be quite tedious. I don’t use them too often, and when I need them I usually have to dig through the archives for the parameter details. The example presented here may serve as a welcome template for the usage of panel functions, panel ordering, drawing of lattice keys, etc. You can download the example data HERE. (Also, check thi...

799 sym R (1102 sym/1 pcs) 4 img

R-Function to Read Data from Google Docs Spreadsheets

13.03.2012

I used this idea posted on Stack Overflow to plug together a function for reading data from Google Docs spreadsheets into R. google_ss { if (is.na(gid)) {stop("\nWorksheetnumber (gid) is missing\n")} if (is.na(key)) {stop("\nDocumentkey (key) is missing\n")} require(RCurl) url "&single=true&gid=...

1205 sym 4 img

Creating a Stratified Random Sample of a Dataframe

14.03.2012

Expanding on a question on Stack Overflow I’ll show how to make a stratified random sample of a certain size: d <- expand.grid(id = 1:35000, stratum = letters[1:10]) p = 0.1 dsample <- data.frame() system.time( for(i in levels(d$stratum)) { dsub <- subset(d, d$stratum == i) B = ceiling(nrow(dsub) * p) dsub <- dsub[sample(1:...

518 sym R (368 sym/1 pcs) 2 img
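
As a rough illustration of the stratified sampling entry above, the same idea can be sketched without growing a data frame inside a loop. This is only a sketch under stated assumptions, not the post's own code: the data set is shrunk to 1000 rows per stratum, the seed is fixed for reproducibility, and the helper names `idx_by_stratum` and `sampled_idx` are made up for this example.

```r
# Stratified random sample: draw the same fraction p from every stratum.
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Smaller than the 35000 ids per stratum used in the post (assumption)
d <- expand.grid(id = 1:1000, stratum = letters[1:10])
p <- 0.1

# Split the row indices by stratum, then sample ceiling(n * p) rows in each
idx_by_stratum <- split(seq_len(nrow(d)), d$stratum)
sampled_idx <- unlist(lapply(idx_by_stratum, function(idx) {
  n_draw <- ceiling(length(idx) * p)
  idx[sample.int(length(idx), n_draw)]
}), use.names = FALSE)

dsample <- d[sampled_idx, ]
table(dsample$stratum)  # number of sampled rows per stratum
```

Sampling row indices and subsetting once avoids repeatedly binding rows to a growing data frame, which tends to be the slow part of the loop-based approach.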

Custom Summary Stats as Dataframe or List

24.03.2012

On Stack Overflow I found this useful example of how to apply custom statistics to a dataframe and return the results as a list or dataframe: somedata <- data.frame(year = rep(c(1990, 1995, 2000, 2005, 2010), times = 3), country = rep(c("US", "Brazil", "Asia"), each = 5), ...

546 sym R (776 sym/1 pcs) 2 img
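
One possible way to get custom statistics per group as either a list or a dataframe, using base R only, is sketched below. The excerpt above is cut off, so the `value` column, the function `my_stats`, and the statistics it computes are assumptions for illustration and not the code from the original post.

```r
# Example data: year and country follow the excerpt; `value` is hypothetical
somedata <- data.frame(
  year    = rep(c(1990, 1995, 2000, 2005, 2010), times = 3),
  country = rep(c("US", "Brazil", "Asia"), each = 5),
  value   = c(1:5, 11:15, 21:25)
)

# Hypothetical custom statistics: any function returning a named vector works
my_stats <- function(x) c(mean = mean(x), min = min(x), max = max(x))

# Result as a named list, one element per country
stats_list <- lapply(split(somedata$value, somedata$country), my_stats)

# Same result as a dataframe, one row per country
stats_df <- as.data.frame(do.call(rbind, stats_list))
stats_df
```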

Classification Trees and Spatial Autocorrelation

25.03.2012

I’m currently trying to model species presence / absence data (N = 523) that were collected over a geographic area and are possibly spatially autocorrelated. Samples come from preferential sites (elevation > 1200 m above sea level, obligatory presence of permanent waterbodies, etc). My main goal is to infer the environmental factors determining the occurrence ...

1748 sym R (1324 sym/1 pcs) 6 img

How to Extract Citation from a Body of Text

26.03.2012

Say you have a text and you want to retrieve the cited names and years of publication. You wouldn’t want to do this by hand, would you? Try the following approach (the text sample comes from THIS freely available publication): library(stringr) (txt <- readLines("http://dl.dropbox.com/u/68286640/Test_Doc.txt")) [1] "1 Introduction"...

633 sym R (12684 sym/1 pcs) 2 img
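
A minimal sketch of the citation-extraction idea is given below. It is not the post's code: it uses base R's `regmatches()` and `gregexpr()` instead of stringr, the sample sentences are invented, and the pattern only covers simple author-year forms ("Name 2001", "Name and Name 1998", "Name et al. (2005)").

```r
# Hypothetical sample text (not taken from the publication used in the post)
txt <- c(
  "Spatial patterns were described earlier (Smith 2001; Jones and Miller 1998).",
  "According to Brown et al. (2005), the effect is weak."
)

# Capitalised surname, optionally followed by "et al." or "and Surname",
# then a space, an optional opening parenthesis and a four-digit year
pattern <- "[A-Z][a-z]+( et al\\.| and [A-Z][a-z]+)? \\(?[0-9]{4}"

# gregexpr() finds all matches per line; regmatches() extracts them
citations <- unlist(regmatches(txt, gregexpr(pattern, txt)))

# Drop the parenthesis that may sit between the name and the year
citations <- gsub("(", "", citations, fixed = TRUE)
citations
```

Real texts contain many more citation styles (initials, multiple years, hyphenated names), so the pattern would need extending for anything beyond a toy input.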

Applying Same Changes to Multiple Dataframes

28.03.2012

How to apply the same changes to several dataframes and save them to CSV: # a dataframe a <- data.frame(x = 1:3, y = 4:6) # make a list of several dataframes, then apply function (change column names, e.g.): my.list <- list(a, a) my.list <- lapply(my.list, function(x) {names(x) <- c("a", "b") ; return(x)}) # save dfs to csv with simi...

662 sym R (1125 sym/2 pcs) 2 img
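
The excerpt above is cut off at the saving step, so the following is only one way that step could look, not the post's own solution. The list names `df1` and `df2`, the use of `tempdir()` as output directory, and the variables `out_dir` and `out_files` are assumptions for this sketch.

```r
# A dataframe, as in the excerpt
a <- data.frame(x = 1:3, y = 4:6)

# A named list of dataframes; the names df1/df2 are hypothetical
my.list <- list(df1 = a, df2 = a)

# Apply the same change (new column names) to every dataframe
my.list <- lapply(my.list, function(x) {
  names(x) <- c("a", "b")
  x
})

# Write each dataframe to <name>.csv in a temporary directory (assumption)
out_dir <- tempdir()
out_files <- file.path(out_dir, paste0(names(my.list), ".csv"))
invisible(Map(function(df, f) write.csv(df, f, row.names = FALSE),
              my.list, out_files))
```

Deriving the file names from the list names keeps each CSV traceable to the dataframe it came from.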

Playing with XML-Package: Get No. of Google Search Hits with R

30.03.2012

GoogleHits <- function(input) { require(XML) require(stringr) require(RCurl) url <- paste("https://www.google.com/search?q=\"", input, "\"", sep = "") CAINFO = paste(system.file(package="RCurl"), "/CurlSSL/ca-bundle.crt", sep = "") script <- getURL(url, followlocation = TRUE, cainfo = CAINFO) doc <...

506 sym R (673 sym/2 pcs) 2 img

A Little Web Scraping Exercise with XML-Package

05.04.2012

Some months ago I posted an example of how to get the links of the contributing blogs on the R-Bloggers site. I used readLines() and did some string processing using regular expressions. With package XML this can be drastically shortened, as shown here: # get blogger urls with XML: library(RCurl) library(XML) script <- getURL("www.r-blogge...

892 sym R (646 sym/2 pcs)

R-Bloggers’ Web-Presence

06.04.2012

We love them, we hate them: RANKINGS! Rankings are an inevitable tool to keep the human rat race going. In this regard I’ll pick up my last two posts (HERE & HERE) and have some fun by using them to analyse R-Bloggers’ web presence. I will use the number of hits in Google Search as an indicator. I searched for URLs like this: http...

1072 sym R (3024 sym/1 pcs) 1 tbl