Publications by Deciphering life: One bit at a time :: R

Portable, personal packages

15.10.2013

ProgrammingR had an interesting post recently about keeping a set of frequently used R functions as a gist on GitHub, and sourcing that file at the beginning of R analysis scripts. There is nothing inherently wrong with this, but it does end up cluttering the user workspace, and there is no real documentation on t...

1852 sym

R, RStudio, and release and dev Bioconductor

16.10.2013

I have one Bioconductor package that I am currently responsible for. Each bi-annual release of Bioconductor requires testing and squashing errors, warnings and bugs in a given package. Doing this means being able to work with multiple versions of R and multiple versions of Bioconductor libraries on a s...

2253 sym R (385 sym/4 pcs)

Open VS Closed Analysis Languages

21.10.2013

TL;DR: I think data scientists should choose to learn open languages such as R and Python because they are open in the sense that anyone can obtain them, use them and modify them for free, and this has led to large, robust groups of users, making it more likely that packages exist that you can use, and others can...

5218 sym

Pre-calculating large tables of values

22.10.2013

I'm currently working on a project where we want to know, based on a Euclidean distance measure, what is the probability that a value is a match to another value, i.e., given an actual value and a theoretical value from calculation, what is the probability that they are the same? This can be calculated...
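The general idea of pre-calculating a table can be sketched as follows. This is only an illustration, not the approach taken in the post: `match_prob` is a hypothetical stand-in for the real (presumably expensive) probability calculation, and the grid range and step size are assumptions chosen for the example.

```r
# Hypothetical stand-in for an expensive probability calculation on a
# distance d; the real function from the post is not shown here.
match_prob <- function(d) exp(-d^2)

# Pre-calculate the probabilities once over a grid of distances
# (range and step are assumptions for this sketch).
grid <- seq(0, 5, by = 0.01)
prob_table <- match_prob(grid)

# Look up values by linear interpolation in the table;
# rule = 2 clamps distances outside the grid to the nearest end value.
lookup_prob <- function(d) {
  approx(grid, prob_table, xout = d, rule = 2)$y
}

lookup_prob(c(0.5, 1.25, 10))
```

The table trades memory and a small interpolation error for speed: each lookup costs an interpolation rather than a full recalculation.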

3886 sym R (3179 sym/10 pcs) 6 img

Using Samatha

07.12.2013

So I decided to try out David Springate's Samatha package for statically building this blog. If you decide to use it, be warned: right now it is a little rough around the edges and needs some work. David created his own blog using it, and has been using that as a test case. Unfortunately, that has meant that there are bugs that have...

2092 sym R (262 sym/3 pcs)

R Interface for Teaching

07.12.2013

Kaitlin Thaney asked on Twitter last week about using Ramnath Vaidyanathan's new interactive R notebook for teaching: "Liking the look of interactive R notebook by @ramnath_vaidya. Any success stories in using to teach? http://t.co/wmVuFM2Rst (HT @_inundata)" (Kaitlin Thaney, @kaythaney, July 9, 2013). Now, to be clear ...

4697 sym

Tim Hortons Density

08.12.2013

Inspired by this post, I wanted to examine the locations and density of Tim Hortons restaurants in Canada. Using Stats Canada data, each census tract is queried on Foursquare for Tims locations. Setup: options(stringsAsFactors = F); require(timmysDensity); require(plyr); require(maps); require(ggplot2); require(geosphere). Statistic...

4546 sym R (5674 sym/13 pcs) 4 img

Storing package data in custom environments

09.12.2013

If you do R package development, sometimes you want to be able to store variables specific to your package without cluttering up the user's workspace. One way to do this is by modifying the global options. This is done by the packages grDevices and parallel. Sometimes this doesn't seem to work quite right (...
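A minimal sketch of the custom-environment idea named in the title, under stated assumptions: the names `.pkg_env`, `set_pkg_value` and `get_pkg_value` are hypothetical and introduced only for illustration, and in a real package the environment would be created at the top level of a file in the package's R/ directory.

```r
# Private environment for package-level state (hypothetical name).
# parent = emptyenv() keeps lookups from falling through to other scopes.
.pkg_env <- new.env(parent = emptyenv())

# Hypothetical setter: stores a value under `name` in the private environment.
set_pkg_value <- function(name, value) {
  assign(name, value, envir = .pkg_env)
  invisible(value)
}

# Hypothetical getter: returns the stored value, or `default` if unset.
get_pkg_value <- function(name, default = NULL) {
  if (exists(name, envir = .pkg_env, inherits = FALSE)) {
    get(name, envir = .pkg_env, inherits = FALSE)
  } else {
    default
  }
}

set_pkg_value("cache_dir", tempdir())
get_pkg_value("cache_dir")
get_pkg_value("not_set", default = "fallback")
```

Because the values live in `.pkg_env` rather than in the global environment or in `options()`, nothing is added to the user's workspace.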

2239 sym

Hive Plots using R and Cytoscape

09.12.2013

I found out about HivePlots this past summer, and although I thought they looked incredibly useful and awesome, I didn't have a personal use for them at the time, and therefore put off doing anything with them. That recently changed when I encountered some particularly nasty hairballs of force-directed graphs. Unfortunately, the HiveR packa...

5597 sym R (6749 sym/14 pcs) 4 img

Writing papers using R Markdown

10.12.2013

I have been watching the activity in RStudio and knitr for a while, and have even been using Rmd (R Markdown) files in my own work as a way to easily provide commentary on an actual dataset analysis. Yihui has proposed writing papers in markdown and posting them to a blog as a way to host a statistics journal, and ...

7669 sym R (2784 sym/11 pcs) 2 img 1 tbl