Publications by Joseph Rickert

The Generalized Lambda Distribution and GLDEX Package: Fitting Financial Return Data

07.10.2014

by Daniel Hanson, with contributions by Steve Su (author of the GLDEX package). Part 1 of a series. Introduction: As most readers are well aware, market return data tends to have heavier tails than a normal distribution can capture; furthermore, a normal distribution does not capture skewness either. For this reason, a four-parameter distribution ...

11695 sym 8 img

A Note on Tweedie

09.10.2014

by Joseph Rickert. In a recent post I talked about the information that can be developed by fitting a Tweedie GLM to a 143 million record version of the airlines data set. Since I started working with them about a year or so ago, I now see Tweedie models everywhere. Basically, any time I come across a histogram that looks like it might be a sample...

4105 sym 4 img

The Generalized Lambda Distribution and GLDEX Package for Fitting Financial Return Data – Part 2

14.10.2014

Part 2 of a series, by Daniel Hanson, with contributions by Steve Su (author of the GLDEX package). Recap of Part 1: In our previous article, we introduced the four-parameter Generalized Lambda Distribution (GLD) and looked at fitting a 20-year set of returns from the Wilshire 5000 Index, comparing the results of two methods, namely the Method of ...

8982 sym 8 img

A first look at Distributed R

23.10.2014

by Joseph Rickert. One of the most interesting R-related presentations at last week’s Strata Hadoop World Conference in New York City was the session on Distributed R by Sunil Venkayala and Indrajit Roy, both of HP Labs. In short, Distributed R is an open source project with the end goal of running R code in parallel on data that is distribu...

5225 sym 2 img

Type III tests and R

28.10.2014

by Terry M. Therneau, Ph.D., Faculty, Mayo Clinic. About a year ago there was a query on the R help list about how to do “type 3” tests for a Cox model, which someone wanted because SAS does it. The SAS addition looked suspicious to me, but as the author of the survival package I thought I should understand the issue more deeply. It took far long...

4946 sym

Some R Highlights from the Bay Area Data Science Camp and Unconference

30.10.2014

by Joseph Rickert. The San Francisco Bay Area Chapter of the Association for Computing Machinery (ACM) has been holding an annual Data Mining Camp and “unconference” since 2009. This year, to reflect the times, the group held a Data Science Camp and unconference, and we at Revolution Analytics were, once again, very happy to be a sponsor for th...

2477 sym 6 img

A Look at the World Values Survey

04.11.2014

by Peggy Fan, Ph.D. Candidate at Stanford's Graduate School of Education. Part of my dissertation at Stanford Graduate School of Education, International Comparative Education program, looks at the World Values Survey (WVS), a cross-national social survey that started in 1981. Since then there have been 6 waves, and the surveys include questions...

3976 sym R (1905 sym/2 pcs) 6 img

Looking into a very messy data set

06.11.2014

by Joseph Rickert. I recently had the opportunity to look at the data used for the 2009 KDD Cup competition. There are actually two sets of files that are still available from this competition. The “large” file is a series of five .csv files that, when concatenated, form a data set with 50,000 rows and 15,000 columns. The “small” file al...

3803 sym R (2934 sym/2 pcs) 4 img

3D Plots with ggplot2 and Plotly

11.11.2014

by Matt Sundquist, Plotly co-founder. Plotly is a platform for data analysis, graphing, and collaboration. You can use ggplot2, Plotly's R API, and Plotly's web app to make and share interactive plots. Now, you can also make 3D plots. Immediately below are a few examples of 3D plots. In this post we will show how to make 3D plots with ggp...

2991 sym R (965 sym/3 pcs) 10 img

A look at the igraph package

13.11.2014

by Joseph Rickert. The igraph package has become a fundamental tool for the study of graphs and their properties, the manipulation and visualization of graphs, and the statistical analysis of networks. To get an idea of just how firmly igraph has become embedded into the R package ecosystem, consider that currently igraph lists 72 reverse depends...

2948 sym R (1171 sym/2 pcs) 2 img 1 tbl