Publications by Joseph Rickert

The First NY R Conference

30.04.2015

by Joseph Rickert Last Friday and Saturday the NY R Conference briefly lit up Manhattan's Union Square neighborhood as the center of the R world. You may have caught some of the glow on Twitter. Jared Lander and volunteers from the New York Open Statistical Programming Meetup, along with the staff at Workbench (the conference venue), set the bar pre...

3333 sym 4 img

Data Science in HR

05.05.2015

by Joseph Rickert Last year in a post on interesting R topics presented at the JSM I described how data scientists in Google's human resources department were using R and predictive analytics to better understand the characteristics of its workforce.  Google may very well have done the pioneering work, but predictive analytics for HR application...

1808 sym 2 img

Digging up embedded plots

07.05.2015

by Joseph Rickert The following multi-panel graph, which graces the cover of the most recent issue of the Journal of Computational and Graphical Statistics (JCGS; Vol 24, Num 1, March 2015), is from the paper by Grolemund and Wickham entitled Visualizing Complex Data With Embedded Plots. The four plots are noteworthy for a couple of reasons:  T...

4671 sym R (4244 sym/2 pcs) 4 img

Using Azure as an R data source, Part 1

12.05.2015

by Gregory Vandenbrouck, Software Engineer at Microsoft. This post is the first in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL or Microsoft SQL Server) to an R client on Windows or Linux. We’ll start with a relatively simple case of pulling data from SQL Azure to an R client on Windows. C...

4713 sym R (1846 sym/5 pcs) 4 img

A first look at htmlwidgets

14.05.2015

by Joseph Rickert A strong case can be made that base R graphics, supplemented with either the lattice library or ggplot2 for plotting by subgroups, provide everything a statistician might need for both exploratory data analysis and for developing clear, crisp graphics for communicating results. However, it is abundantly clear that web-based graphics, driv...

3775 sym R (970 sym/2 pcs)

Fast parallel computing with Intel Phi coprocessors

19.05.2015

by Andrew Ekstrom, recovering physicist, applied mathematician, and graduate student in applied stats and systems engineering. We know that R is a great system for performing statistical analysis. The price is quite nice too 😉. As a graduate student, I need a cheap replacement for Matlab and/or Maple. Well, R can do that too. I’m running a larg...

10296 sym 2 img

First Day Highlights from the Extremely Large Databases Conference

21.05.2015

by Joseph Rickert The 8th XLDB (Extremely Large Databases) Conference opened at Stanford on Tuesday with an outstanding program. This conference has been providing leadership in the “Big Data” world since its first workshop, which was held in 2007. For example, the summary report for that year notes: “Both communities (industry and scie...

4243 sym 4 img

Situational Baseball: Analyzing Runs Potential Statistics

26.05.2015

By Mark Malter A few weeks ago, I wrote about my Baseball Stats R Shiny application, where I demonstrated how to calculate runs expectancies based on the 24 possible bases/outs states for any plate appearance. In this article, I’ll explain how I expanded on that to calculate the probability of winning the game, based on the current score/inni...

3481 sym

RevoScaleR’s Naive Bayes Classifier rxNaiveBayes()

28.05.2015

by Joseph Rickert Because of its simplicity and good performance over a wide spectrum of classification problems, the Naïve Bayes classifier ought to be on everyone's short list of machine learning algorithms. Now, with version 7.4, we have a high-performance Naïve Bayes classifier in Revolution R Enterprise too. Like all Parallel External Mem...

4148 sym R (6255 sym/9 pcs) 2 img

Using Azure as an R datasource: Part 2 – Pulling data from MySQL/MariaDB

02.06.2015

by Gregory Vandenbrouck, Software Engineer, Microsoft. This post is the second in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL or Microsoft SQL Server) to an R client on Windows or Linux. Last time we covered pulling data from SQL Azure to an R client on Windows. This time we’ll be pulling d...

4921 sym R (1700 sym/6 pcs) 4 img