Publications by Kay Cichini
Apprentice Piece with Lattice Graphs
Lattice graphs can be quite tedious. I don’t use them too often, and when I need them I usually have to dig through the archives for the parameter details. The example presented here may serve as a welcome template for the use of panel functions, panel ordering, the drawing of lattice keys, etc. You can download the example data HERE. (Also, check thi...
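The post's example data is not reproduced here, so the following is only a minimal sketch with simulated data of the features the post mentions: a custom panel function, panel ordering via `as.table`, and a key drawn with the `key` argument.

```r
library(lattice)

set.seed(1)
d <- data.frame(x = rep(1:10, 2),
                y = rnorm(20),
                g = rep(c("A", "B"), each = 10))

p <- xyplot(y ~ x | g, data = d,
            # order panels left-to-right, top-to-bottom:
            as.table = TRUE,
            # custom panel function: reference line plus points
            panel = function(x, y, ...) {
              panel.abline(h = 0, lty = 2)
              panel.xyplot(x, y, ...)
            },
            # a simple hand-built key above the plot
            key = list(space = "top",
                       points = list(pch = 1),
                       text = list("simulated y")))
print(p)
```

The `panel` argument is where most of lattice's flexibility (and tedium) lives; `as.table = TRUE` is the usual fix when panels appear in an unexpected order.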
R-Function to Read Data from Google Docs Spreadsheets
I used this idea posted on Stack Overflow to plug together a function for reading data from Google Docs spreadsheets into R. google_ss <- function(key = NA, gid = NA) { if (is.na(gid)) {stop("\nWorksheetnumber (gid) is missing\n")} if (is.na(key)) {stop("\nDocumentkey (key) is missing\n")} require(RCurl) url <- paste("&single=true&gid=...
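A sketch in the spirit of the truncated function above. Note that Google's export endpoint has changed several times since the post was written, so the URL template below is an assumption and may need adjusting; the argument checks mirror the original.

```r
# Sketch: read a public Google spreadsheet as CSV.
# The export URL format is an assumption (modern docs.google.com style),
# not necessarily the one the original post used.
google_ss <- function(key = NA, gid = 0) {
  if (is.na(key)) stop("\nDocumentkey (key) is missing\n")
  url <- paste0("https://docs.google.com/spreadsheets/d/", key,
                "/export?format=csv&gid=", gid)
  read.csv(url, stringsAsFactors = FALSE)
}
```

The sheet must be shared publicly ("anyone with the link") for the unauthenticated CSV export to work.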
Creating a Stratified Random Sample of a Dataframe
Expanding on a question on Stack Overflow I’ll show how to make a stratified random sample of a certain size: d <- expand.grid(id = 1:35000, stratum = letters[1:10]) p = 0.1 dsample <- data.frame() system.time( for(i in levels(d$stratum)) { dsub <- subset(d, d$stratum == i) B = ceiling(nrow(dsub) * p) dsub <- dsub[sample(1:...
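The loop above is truncated; a self-contained version of the same idea (sample 10 % within each stratum, then recombine) can be written with `split()` and `lapply()` instead of the explicit `for` loop:

```r
# Stratified random sample: take a fraction p of rows from each stratum.
set.seed(42)
d <- expand.grid(id = 1:1000, stratum = letters[1:10])
p <- 0.1

dsample <- do.call(rbind, lapply(split(d, d$stratum), function(dsub) {
  B <- ceiling(nrow(dsub) * p)      # sample size for this stratum
  dsub[sample(nrow(dsub), B), ]     # draw B rows without replacement
}))

table(dsample$stratum)              # equal counts per stratum
```

`split()`/`lapply()` avoids growing `dsample` row by row inside a loop, which is the slow part of the `system.time()`-wrapped version sketched in the excerpt.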
Custom Summary Stats as Dataframe or List
On Stack Overflow I found this useful example of how to apply custom statistics to a dataframe and return the results as a list or dataframe: somedata <- data.frame( year = rep(c(1990, 1995, 2000, 2005, 2010), times = 3), country = rep(c("US", "Brazil", "Asia"), each = 5), ...
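The excerpt is cut off, so here is a complete sketch of the technique: a custom statistics function applied per group, returned once as a list and once as a dataframe. The `value` column is invented to stand in for the truncated data.

```r
# Rebuild the truncated example data (the 'value' column is an assumption):
set.seed(1)
somedata <- data.frame(
  year    = rep(c(1990, 1995, 2000, 2005, 2010), times = 3),
  country = rep(c("US", "Brazil", "Asia"), each = 5),
  value   = rnorm(15))

# custom summary statistics
mystats <- function(x) c(mean = mean(x), sd = sd(x), n = length(x))

# result as a list, one element per country:
res_list <- lapply(split(somedata$value, somedata$country), mystats)

# result as a dataframe:
res_df <- do.call(rbind, res_list)
res_df <- data.frame(country = rownames(res_df), res_df, row.names = NULL)
res_df
```

`do.call(rbind, …)` is the usual bridge from a per-group list back to a flat dataframe.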
Classification Trees and Spatial Autocorrelation
I’m currently trying to model species presence/absence data (N = 523) that were collected over a geographic area and are possibly spatially autocorrelated. Samples come from preferential sites (sea level > 1200 m, obligatory presence of permanent waterbodies, etc). My main goal is to infer the environmental factors determining the occurrence ...
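The post's code is not shown, so the following is only a minimal sketch of the classification-tree step with simulated presence/absence data; `rpart` stands in for whatever tree implementation the post used, and the environmental predictors and their effect on presence are invented for illustration.

```r
library(rpart)

set.seed(1)
n <- 523
# invented environmental predictors:
env <- data.frame(elevation      = runif(n, 1200, 2500),
                  waterbody_dist = runif(n, 0, 500))
# presence made more likely at lower elevation (arbitrary assumption):
presence <- rbinom(n, 1, plogis(3 - env$elevation / 500))

# classification tree for presence/absence
fit <- rpart(factor(presence) ~ elevation + waterbody_dist,
             data = env, method = "class")
fit
```

With real, spatially autocorrelated data one would additionally inspect the residuals for spatial structure (e.g. via a correlogram of residuals against inter-site distance) before trusting the environmental splits.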
How to Extract Citation from a Body of Text
Say you have a text and you want to retrieve the cited names and years of publication. You wouldn’t want to do this by hand, would you? Try the following approach (the text sample comes from THIS freely available publication): library(stringr) (txt <- readLines("http://dl.dropbox.com/u/68286640/Test_Doc.txt")) [1] "1 Introduction"...
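The post uses `stringr` and a Dropbox-hosted text file that may no longer exist; a base-R sketch of the same technique on an inline sample shows the idea without the download. The sample text and the pattern are simplified assumptions — they only cover "Author (Year)", "(Author Year)", "et al." and two-author forms:

```r
# invented sample text:
txt <- "As shown by Smith (2001) and later work (Jones et al. 2005),
the effect is robust (Miller and Brown 1999)."

# crude citation pattern: Capitalised name, optional 'et al.' or
# second author, then a 19xx/20xx year, parens optional
pat <- "[A-Z][A-Za-z]+( et al\\.| and [A-Z][A-Za-z]+)? \\(?(19|20)[0-9]{2}\\)?"

cites <- regmatches(txt, gregexpr(pat, txt))[[1]]
cites
```

`regmatches()` with `gregexpr()` is the base-R counterpart of `stringr::str_extract_all()`; for real bibliographies the pattern would need many more cases (page numbers, semicolon-separated lists, "e.g.", ampersands).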
Applying Same Changes to Multiple Dataframes
How to apply the same changes to several dataframes and save them to CSV: # a dataframe a <- data.frame(x = 1:3, y = 4:6) # make a list of several dataframes, then apply a function (change column names, e.g.): my.list <- list(a, a) my.list <- lapply(my.list, function(x) {names(x) <- c("a", "b") ; return(x)}) # save dfs to csv with simi...
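The CSV-writing part of the excerpt is cut off; a complete round trip of the same idea, writing each dataframe from the list to its own file (a temporary directory is used here so the example cleans up after itself):

```r
# a dataframe, listed twice:
a <- data.frame(x = 1:3, y = 4:6)
my.list <- list(a, a)

# apply the same change to every dataframe in the list:
my.list <- lapply(my.list, function(x) { names(x) <- c("a", "b"); x })

# save each dataframe to a similarly named CSV:
out <- file.path(tempdir(), paste0("df_", seq_along(my.list), ".csv"))
invisible(mapply(write.csv, my.list, out,
                 MoreArgs = list(row.names = FALSE)))

file.exists(out)
```

`mapply()` walks the list of dataframes and the vector of file names in parallel, which replaces a second loop over indices.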
Playing with XML-Package: Get No. of Google Search Hits with R
GoogleHits <- function(input) { require(XML) require(stringr) require(RCurl) url <- paste("https://www.google.com/search?q=\"", input, "\"", sep = "") CAINFO = paste(system.file(package="RCurl"), "/CurlSSL/ca-bundle.crt", sep = "") script <- getURL(url, followlocation = TRUE, cainfo = CAINFO) doc <...
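The function above is truncated; a restated sketch follows. Be warned that Google's result markup and scraping tolerance have changed many times since the post, so the `resultStats` XPath below is the historical one from that era and will very likely need updating (or fail outright) today:

```r
# Sketch only: count Google search hits for an exact phrase.
# The XPath and URL details reflect the old post, not today's Google.
GoogleHits <- function(input) {
  require(XML)
  require(RCurl)
  url  <- paste0("https://www.google.com/search?q=%22",
                 URLencode(input, reserved = TRUE), "%22")
  page <- getURL(url, followlocation = TRUE)
  doc  <- htmlParse(page)
  # historically the hit count sat in <div id="resultStats">:
  res  <- xpathSApply(doc, "//div[@id='resultStats']", xmlValue)
  as.numeric(gsub("[^0-9]", "", res))
}
```

The function is defined but deliberately not called here, since the call requires network access and a cooperative Google.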
A Little Web Scraping Exercise with XML-Package
Some months ago I posted an example of how to get the links of the contributing blogs on the R-bloggers site. I used readLines() and did some string processing with regular expressions. With package XML this can be drastically shortened – see this: # get blogger urls with XML: library(RCurl) library(XML) script <- getURL("www.r-blogge...
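The same XML technique can be demonstrated offline on a literal HTML snippet, so it runs without hitting r-bloggers.com (the snippet and URLs below are invented):

```r
library(XML)

# invented stand-in for the fetched page:
html <- '<html><body>
  <a href="http://example-blog-one.com">Blog One</a>
  <a href="http://example-blog-two.net">Blog Two</a>
</body></html>'

doc  <- htmlParse(html)
# one XPath expression replaces all the regex string processing:
urls <- xpathSApply(doc, "//a/@href")
urls
```

This is the core of the shortening the post describes: `htmlParse()` plus a single XPath query in place of `readLines()` and hand-rolled regular expressions.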
R-Bloggers’ Web-Presence
We love them, we hate them: RANKINGS! Rankings are an inevitable tool to keep the human rat race going. In this regard I’ll pick up my last two posts (HERE & HERE) and have some fun with them, using them to analyse R-bloggers’ web presence. I will use the number of hits in Google Search as an indicator. I searched for URLs like this: http...