Publications by Vinh Nguyen

serialize or turn a large parallel R job into smaller chunks for use with SGE

16.06.2011

I use the snow package in R with OpenMPI and SGE quite often for my simulation studies; I’ve outlined in the past how this can be done. The ease of these methods makes it tempting to just specify the maximum number of cores available every time. However, unless you own a dedicated cluster, you are most likely sharing the reso...
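The excerpt is cut off before the method is shown, so the following is only a minimal sketch of the general idea in the title: splitting one large run into smaller, independently submittable chunks. It is written in Python rather than R, and the function name `chunk_ranges`, the replicate count, and the chunk size are all hypothetical, not taken from the post.

```python
def chunk_ranges(n_tasks, chunk_size):
    """Split task indices 1..n_tasks into consecutive (first, last) ranges.

    Hypothetical helper for illustration; not the author's code.
    """
    if n_tasks < 0 or chunk_size < 1:
        raise ValueError("n_tasks must be >= 0 and chunk_size must be >= 1")
    return [
        (first, min(first + chunk_size - 1, n_tasks))
        for first in range(1, n_tasks + 1, chunk_size)
    ]


# Assumed study size: 1000 simulation replicates, at most 300 per submitted job.
chunks = chunk_ranges(1000, 300)
for job_id, (first, last) in enumerate(chunks, start=1):
    # Each line stands for one smaller job that a scheduler could run on its own.
    print(f"job {job_id}: replicates {first}-{last}")
```

Each range could then be handed to a separate, smaller job request, so no single job needs to hold the maximum number of cores for the whole run.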

2544 sym R (2266 sym/2 pcs)

My own programming style convention for most languages

01.07.2011

I write code mainly in R and, from time to time, in C, C++, SAS, bash, Python, and Perl. There are style guides out there that help make your code more consistent and readable to yourself and others. Here is a style guide for C++, here is Google’s style guide for R, and here is Hadley Wickham’s guide for R. For R, I agree more with Google...

2941 sym

R from source

11.07.2011

The following are notes for myself. I like to use the bleeding edge version of R:

    svn checkout https://svn.r-project.org/R/trunk/ r-devel
    cd r-devel
    ./tools/rsync-recommended
    ## use the following to update sources:
    svn update
    ## pre-reqs
    sudo apt-get build-dep r-base
    #sudo apt-get install gcc g++ gfortran libreadline-dev libx11-dev xorg-dev #...

489 sym R (334 sym/1 pcs)

Build multiarch R (32 bit and 64 bit) on Debian/Ubuntu

11.08.2011

I have the 64 bit version of R compiled from source on my Ubuntu laptop. I recently needed a 32 bit build of R, since a package I had to compile and use only works in 32 bit. I thought it would be readily available on Ubuntu, since both 32 bit and 64 bit versions of R are shipped with the Windows and Mac OS X installers. I tried figuring out h...

1435 sym Python (840 sym/2 pcs)

Build 32 bit R on 64 bit Ubuntu by utilizing chroot

30.03.2012

In the past, I’ve described how one could build multiarch (64 bit and 32 bit) versions of R on a 64 bit Ubuntu machine. The method based on this thread no longer works as of R 2.13 or 2.14, I believe. A few months ago, someone on #R over on freenode (I forget who) advised me to try the chroot route (see this also). I recently...

1318 sym Python (1575 sym/1 pcs)

Better decision tree graphics for rpart via party and partykit

29.05.2012

I’ve been using Graphviz to create better decision tree graphics “by hand” for rpart objects created in R (final tree). I stumbled on this post that shows how one could convert an rpart object to a party object via the as.party function in partykit to utilize the plot functions in party. It looks quite nice. I might have to do addition...

941 sym R (939 sym/1 pcs)

Guide to accessing MS SQL Server and MySQL server on Mac OS X

06.04.2013

Native GUI client access to MS-SQL and MySQL
We can use Oracle SQL Developer with the jTDS driver to access Microsoft SQL Server. Note: jTDS version 1.3.0 did not work for me; I had to use version 1.2.6. Detailed instructions can be found here. We can use MySQL Workbench to access MySQL server. Setup is straightforward.
Overview of OD...

4229 sym R (2074 sym/6 pcs)

Delimited file where delimiter clashes with data values

01.08.2013

A comma-separated values (CSV) file is a typical way to store tabular/rectangular data. If a data cell contains a comma, then that cell is typically wrapped in quotes. However, what if a data cell contains both a comma and a quotation mark? To avoid such scenarios, it is typically wise to use a delimiter that has a low chance of sh...
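As a small illustration of the clash described above, here is a sketch using Python's standard csv module. The sample record and the choice of "|" as the alternative delimiter are assumptions made for the example, not taken from the post.

```python
import csv
import io

# Hypothetical record: the second field contains both a comma and quotation marks.
record = ["1", 'She said "hi", then left']

# Option 1: standard CSV quoting. The writer wraps the field in quotes
# and doubles each embedded quotation mark.
buf = io.StringIO()
csv.writer(buf).writerow(record)
quoted_line = buf.getvalue().rstrip("\r\n")
print(quoted_line)

# Reading the line back with the same conventions recovers the original fields.
assert next(csv.reader(io.StringIO(quoted_line))) == record

# Option 2: a delimiter that does not occur in the data ("|" is an assumption).
delimiter = "|"
assert all(delimiter not in field for field in record)
piped_line = delimiter.join(record)
print(piped_line)

# With quoting disabled, quotation marks are read as ordinary characters.
reader = csv.reader(
    io.StringIO(piped_line), delimiter=delimiter, quoting=csv.QUOTE_NONE
)
assert next(reader) == record
```

The second option keeps each line readable without any escaping, but it only works as long as the chosen delimiter never appears in the data, which is why the sketch checks for it before joining.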

1858 sym