Publications by JD Long
Data Analysis Workflow… Part 1 of Infinity
One of the many things that I sit around pondering when I should be doing productive things is the idea of analytical workflow. I have only worked with one analytical guru who I felt really gave thought and structure to workflow and its impact on analyist productivity. When I talk about workflow I mean the whole process from the time the analytic...
3321 sym
R in The Windy City
In honor of me moving to Chicago, the powers who abide have decided to hold the first annual “R/Finance conference for applied finance using R” conference in Chicago this year. The dates are April 24-25, 2009. R/Finance 2009: Applied Finance with R To those who made the decision on location, I’m pleased but slightly embarrassed that you let...
1003 sym
Choosing an SQL Engine for Analytics
I’ve been struggling for a while on which database to use for my working data. I used to use MS Access quite a lot. The problems with MS Access include but are not limited to: 2 GB file size limit, at least historically Versions change with each edition of MS Office Sort of tough to write SQL scripts Very little automation, ie compression, bac...
3846 sym 4 img
Twitter from R… Sure, why not!
So I have started following the #RStats tag in twitter. Prior to a week ago I had never Twitterbated so I thought I would give it a go since I am not one to shy away from new technology… much. I think of Twitter like a call in radio show where I get to cut off callers when they annoy me. Well one of the interesting things I ran across was this ...
1895 sym R (406 sym/1 pcs) 2 img
Not Just Normal… Gaussian
Pretty Normal Dave, over at The Revolutions Blog, posted about the big ‘ol list of graphs created with R that are over at Wikimedia Commons. As I was scrolling through the list I recognized the standard normal distribution from the Wikipedia article on the same topic. Below is the fairly simple source code with lots of comments. Here’s the so...
795 sym R (1654 sym/1 pcs) 2 img
Who’s Tweets Do I Read… Magic R Code Says…
So one glace at my user logs shows the truth: no one gives a rat’s rump that I just quit my job; you just love you some Twitter R code. And I’m nothing but an attention whore, so come get some! So in my last ‘Twitter with R’ post I gave you some code I’d written ripped off that allowed you to update your status from R. That’s kinda co...
3864 sym 2 img
A Fast Intro to PLYR for R
I’m not dead yet! Although it has been rumored that I am. The new job is going great and I’m thrilled to be with a new firm doing interesting work alongside smart people. It makes me seem smarter by simple association. There’s been a lot going on recently in the R user community. There was an R flash mob of Stack Overflow which resulted in ...
4214 sym R (967 sym/4 pcs) 2 img
Kicking Ass with plyr
Tonight (October 29, 2009) at 5:30 PM is the Chicago R meetup at Jaks tap. Here’s more info. I’ll be making a presentation based on my earlier blog post about plyr. The presentation will only be 8 minutes long so I’ve had to pick and choose my info carefully. OK, who am I kidding? I had a couple of Schlitz (in a bottle!) for lunch over at...
1468 sym 2 img
Loading Big (ish) Data into R
So for the rest of this conversation big data == 2 Gigs. Done. Don’t give me any of this ‘that’s not big, THIS is big’ shit. There now, on with the cool stuff: This week on twitter Vince Buffalo asked about loading a 2 gig comma separated file (csv) into R (OK, he asked about tab delimited data, but I ignored that because I use mostly com...
3440 sym 2 img
Struggling with apply() in R
It’s common knowledge that I struggle wrapping my head around the apply functions in R. That is illustrated very clearly in the following discussion on Stack Overflow: Dirk’s comment is actually spot on. I’ve asked the same damn question at least 4-5 times. Only I didn’t really understand it was the same question. That’s one of the pro...
1958 sym 2 img