Publications by Joseph Rickert
The First NY R Conference
by Joseph Rickert Last Friday and Saturday the NY R Conference briefly lit up Manhattan's Union Square neighborhood as the center of the R world. You may have caught some of the glow on Twitter. Jared Lander, volunteers from the New York Open Statistical Programming Meetup along with the staff at Workbench (the conference venue) set the bar pre...
3333 sym 4 img
Data Science in HR
by Joseph Rickert Last year, in a post on interesting R topics presented at the JSM, I described how data scientists in Google's human resources department were using R and predictive analytics to better understand the characteristics of its workforce. Google may very well have done the pioneering work, but predictive analytics for HR application...
1808 sym 2 img
Digging up embedded plots
by Joseph Rickert The following multi-panel graph, which graces the cover of the most recent issue of the Journal of Computational and Graphical Statistics (JCGS, Vol 24, Num 1, March 2015), is from the paper by Grolemund and Wickham entitled Visualizing Complex Data With Embedded Plots. The four plots are noteworthy for a couple of reasons: T...
4671 sym R (4244 sym/2 pcs) 4 img
Using Azure as an R data source, Part 1
by Gregory Vandenbrouck, Software Engineer at Microsoft This post is the first in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL, or Microsoft SQL Server) to an R client on Windows or Linux. We’ll start with a relatively simple case of pulling data from SQL Azure to an R client on Windows. C...
4713 sym R (1846 sym/5 pcs) 4 img
A first look at htmlwidgets
by Joseph Rickert A strong case can be made that base R graphics, supplemented with either the lattice library or ggplot2 for plotting by subgroups, provides everything a statistician might need for both exploratory data analysis and for developing clear, crisp graphics for communicating results. However, it is abundantly clear that web-based graphics, driv...
3775 sym R (970 sym/2 pcs)
Fast parallel computing with Intel Phi coprocessors
by Andrew Ekstrom, Recovering physicist, applied mathematician and graduate student in applied stats and systems engineering We know that R is a great system for performing statistical analysis. The price is quite nice too 😉. As a graduate student, I need a cheap replacement for Matlab and/or Maple. Well, R can do that too. I’m running a larg...
10296 sym 2 img
First Day Highlights from the Extremely Large Databases Conference
by Joseph Rickert The 8th XLDB (Extremely Large Databases) Conference opened at Stanford on Tuesday with an outstanding program. This conference has been providing leadership in the “Big Data” world since its first workshop, which was held in 2007. For example, the summary report for that year notes: “Both communities (industry and scie...
4243 sym 4 img
Situational Baseball: Analyzing Runs Potential Statistics
by Mark Malter A few weeks ago, I wrote about my Baseball Stats R Shiny application, where I demonstrated how to calculate runs expectancies based on the 24 possible bases/outs states for any plate appearance. In this article, I’ll explain how I expanded on that to calculate the probability of winning the game, based on the current score/inni...
3481 sym
RevoScaleR’s Naive Bayes Classifier rxNaiveBayes()
by Joseph Rickert Because of its simplicity and good performance over a wide spectrum of classification problems, the Naïve Bayes classifier ought to be on everyone's short list of machine learning algorithms. Now, with version 7.4, we have a high-performance Naïve Bayes classifier in Revolution R Enterprise too. Like all Parallel External Mem...
4148 sym R (6255 sym/9 pcs) 2 img
Using Azure as an R datasource: Part 2 – Pulling data from MySQL/MariaDB
by Gregory Vandenbrouck, Software Engineer, Microsoft This post is the second in a series that covers pulling data from various Windows Azure hosted storage solutions (such as MySQL, or Microsoft SQL Server) to an R client on Windows or Linux. Last time we covered pulling data from SQL Azure to an R client on Windows. This time we’ll be pulling d...
4921 sym R (1700 sym/6 pcs) 4 img