Publications by Joseph Rickert

Book review: "Doing Data Science" by Rachel Schutt and Cathy O’Neil

23.01.2014

by Joseph Rickert Every once in a while a single book comes to crystallize a new discipline. If books still have this power in the era of electronic media, “Doing Data Science: Straight Talk from the Frontline” by Rachel Schutt and Cathy O’Neil (O'Reilly, 2013) might just be the book that defines data science. “Doing Data Science”, which...

7030 sym

Quantitative Finance Applications in R – 3: Plotting xts Time Series

28.01.2014

by Daniel Hanson, QA Data Scientist, Revolution Analytics Introduction and Data Setup Last time, we included a couple of examples of plotting a single xts time series using the plot(.) function (i.e., the version of the function included in the xts package). Today, we’ll look at some quick and easy methods for plotting overlays of multiple xts time series i...

10175 sym 8 img

A First Look at rxDForest()

30.01.2014

by Joseph Rickert Last July, I blogged about rxDTree(), the RevoScaleR function for building classification and regression trees on very large data sets. As I explained then, this function is an implementation of the algorithm introduced by Ben-Haim and Yom-Tov in their 2010 paper that builds trees on histograms of data and not on the raw data i...

3999 sym 4 img

Revolution Analytics announces $999 site licenses for universities and public service organizations

04.02.2014

by Joseph Rickert Revolution Analytics is announcing three new programs today that we hope will be modest but positive contributions to data science education and public service analytics. The first new program, the Academic Institution Program (AIP), enables colleges, universities and other educational institutions to obtain a site license for Re...

3747 sym

R and the Weather

06.02.2014

by Joseph Rickert The weather is on everybody's mind these days: too much ice and snow east of the Rockies and no rain to speak of in California. Ram Narasimhan has made it a little easier for R users to keep track of what's going on and also get a historical perspective. His new R package weatherData makes it easy to download weather data from...

1058 sym R (564 sym/1 pcs) 2 img

Revolution R Enterprise in the Amazon Cloud

12.02.2014

by Oliver Vagner, Cloud Solutions Lead Architect at Revolution Analytics Today, I am pleased to announce our new offering in the Amazon Web Services Big Data Marketplace – Revolution R Enterprise 7 for AWS. Of course, if you follow this blog, then you are quite familiar with Revolution R Enterprise (RRE) and what it brings to the table with its...

3869 sym

3D Plots in R

13.02.2014

by Joseph Rickert Recently, I was trying to remember how to make a 3D scatter plot in R when it occurred to me that the documentation on how to do this is scattered all over the place. Hence, this short organizational note that you may find useful. First of all, for the benefit of newcomers, I should mention that R has three distinct graphics sy...

3736 sym R (379 sym/1 pcs) 2 img

Princeton vs. Facebook: modeling contagion

18.02.2014

by James Paul Peruvankal, Senior Program Manager at Revolution Analytics Three weeks ago, researchers at Princeton released a study, “Epidemiological modeling of online social network dynamics”, which states that Facebook might lose 80% of its users by 2015-2017. Facebook data scientists hilariously debunked the study, stating that Princeton itself would...

3447 sym 8 img

Sampling from a torus

19.02.2014

by Joseph Rickert One of the key ideas in topological data analysis is to consider a data set to be a sample from a manifold in some high dimensional topological space and then to use the tools of algebraic topology to reconstruct the manifold. It turns out that the converse problem of taking a random sample from a given topological manifold also...

2952 sym R (158 sym/1 pcs) 2 img

Quantitative Finance Applications in R – 4: Using the Generalized Lambda Distribution to Simulate Market Returns

25.02.2014

by Daniel Hanson, QA Data Scientist, Revolution Analytics Introduction As most readers are well aware, market return data tends to have heavier tails than a normal distribution can capture; furthermore, a normal distribution cannot capture skewness either. For this reason, a four parameter distribution such as the Generalized Lambda Distributio...

7612 sym 2 img