Publications by JD Long

Starting an EC2 Machine Then Setting Up a Socks Proxy… From R!

16.07.2010

I do some work from home, some work from an office in Chicago and some work on the road. It’s not uncommon for me to want to tunnel all my web traffic through a VPN tunnel. In one of my previous blog posts I alluded to using Amazon EC2 as a way to get around your corporate IT mind control voyeurs service providers. This tunneling method is one ...

3763 sym R (52 sym/1 pcs) 2 img

Stochastic Simulation With Copulas in R

30.08.2010

A friend of mine gave me a call last week and was wondering if I had a little R code that could illustrate how to do a Cholesky decomposition. He ultimately wanted to build a Monte Carlo model with correlated variables. I pointed him to a number of packages that do Cholesky decomp but then I recommended he consider just using a Gaussian Copula a...

2059 sym 2 img

Even Simpler Multivariate Correlated Simulations

31.08.2010

So after yesterday’s post on Simple Simulation using Copulas I got a very nice email that basically begged the question, “Dude, why are you making this so hard?” The author pointed out that if what I really want is a Gaussian correlation structure for Gaussian distributions then I could simply use the mvrnorm() function from the MASS packag...

1884 sym 2 img

Third, and Hopefully Final, Post on Correlated Random Normal Generation (Cholesky Edition)

02.09.2010

André-Louis Cholesky is my homeboy. When I did a brief post three days ago I had no plans on writing two more posts on correlated random number generation. But I’ve gotten a couple of emails, a few comments, and some Twitter feedback. In response to my first post, Gappy calls me out and says, “the way mensches do multivariate (log)normal var...

2164 sym R (624 sym/1 pcs) 2 img

Principal Component Analysis (PCA) vs Ordinary Least Squares (OLS): A Visual Explanation

16.09.2010

Over at stats.stackexchange.com recently, a really interesting question was raised about principal component analysis (PCA). The gist was “Thanks to my college class I can do the math, but what does it MEAN?” I have felt like this a number of times in my life. Many of my classes were so focused on the technical implementations that they kinda missed the s...

4045 sym 12 img

Connecting to SQL Server from R using RJDBC

22.09.2010

A few months ago I switched my laptop from Windows to Ubuntu Linux. I had been connecting to my corporate SQL Server database using RODBC on Windows so I attempted to get ODBC connectivity up and running on Ubuntu. ODBC on Ubuntu turned into an exercise in futility. I spent many hours over many days and never was able to connect from R on Ubuntu ...

2259 sym R (647 sym/2 pcs) 2 img

Controlling Amazon Web Services using rJava and the AWS Java SDK

30.11.2010

I’ve been messing around with using Amazon Web Services for a while. I’ve had some projects where I wanted to upload files to S3 or fire off EMR jobs. I’ve been controlling AWS services using a hodgepodge of command line tools and the R system() function to call the tools from the command line. This has some real disadvantages, however. Us...

2728 sym 2 img

Where the heck has JD been?

22.03.2011

It’s been pointed out to me that I haven’t had any blog posts in a while. It’s true. I’m fairly slack. But in the last few months I’ve changed jobs (same firm, new role), written an R abstraction on top of Hadoop, been to China, and managed to stay married. While that sounds pretty awesome, I’m nothing compared to Hideaki Akaiwa. And ...

1037 sym 2 img

Fast Two Way Sync in Ubuntu!

09.04.2011

I love the portability of a laptop. I have a 45 min train ride twice a day and I fly a little too, so having my work with me on my laptop is very important. But I hate doing long running analytics on my laptop when I’m in the office because it bogs down my laptop and all those videos on The Superficial get all jerky and stuff. I get around this...

7674 sym 6 img

Details of two-way sync between two Ubuntu machines

18.04.2011

In a previous post I discussed my frustrations with trying to get Dropbox or Spideroak to perform BOTH encrypted remote backup AND fast two-way file syncing. This is the detail of how I set up two machines, both running Ubuntu 10.10, to perform two-way sync where a file change on either machine will result in that change being replicated on the o...

5442 sym 2 img