Publications by Joseph Rickert

Fun with ddR: Using Distributed Data Structures in R

08.12.2015

by Edward Ma and Vishrut Gupta (Hewlett Packard Enterprise) A few weeks ago, we revealed ddR (Distributed Data-structures in R), an exciting new project started by R-Core, Hewlett Packard Enterprise, and others that provides a fresh new set of computational primitives for distributed and parallel computing in R. The package sets the seed for what...

5604 sym 2 img

Wald’s graphical sequential inspection procedure

10.12.2015

by John Mount Ph.D., Data Scientist at Win-Vector LLC Our most recent article was a dynamic programming solution to the A/B test problem. Explicitly solving such dynamic programs is a long and tedious process, so you are well served by finding and introducing clever invariants to track (something better than just raw win-rates). This clever idea, ...

6954 sym 6 img

Trade-offs to consider when reading a large dataset into R using the RevoScaleR package

15.12.2015

by Seth Mottaghinejad, Data Scientist at Microsoft R and big data There are many R packages dedicated to letting users (or useRs if you prefer) deal with big data in R. (We will intentionally avoid using proper case for 'big data', because (1) the term has been somewhat hackneyed, and (2) for the sake of this article we can think of big data as a...

17064 sym 2 img 1 tbl

Looking forward to 2016

24.12.2015

by Joseph Rickert The following map of all of the R user groups listed in Microsoft's Local R User Group Directory is a good way to visualize the R world as we rocket into 2016. As a member of the useR!2016 planning committee, foremost in my mind right now is that in just a few months people will be coming to Stanford from all points plotted and al...

3099 sym 2 img

7th Meeting of Spanish R Users. 5-6 November 2015. Salamanca (Spain)

05.01.2016

By Virgilio Gómez Rubio, Spanish R Users Organizing Committee As every autumn since 2009, Spanish R users gathered at their annual meeting. It is organised by the Spanish R users group ‘Comunidad R-Hispano’ and took place on 5-6 November in the historic city of Salamanca. The 7th Meeting of Spanish R Users attracted more than 100 R enthusiasts and...

2964 sym 2 img

Getting Started with Markov Chains

07.01.2016

by Joseph Rickert There are a number of R packages devoted to sophisticated applications of Markov chains. These include msm and SemiMarkov for fitting multistate models to panel data, mstate for survival analysis applications, TPmsm for estimating transition probabilities for 3-state progressive disease models, heemod for applying Markov models...

7579 sym 6 img

Microsoft R Server available free to students with DreamSpark

12.01.2016

by Joseph Rickert Over the last 6 years, thousands of students and faculty have downloaded Revolution R Enterprise (RRE) from Revolution Analytics for free, making it possible for them to do statistical modeling on large data sets with the same R language used by savvy statisticians and data scientists in business and industry. In addition to t...

4069 sym 2 img

New Data Sources for R

14.01.2016

by Joseph Rickert Over the past few months, a number of new CRAN packages have appeared that make it easier for R users to gain access to curated data. Most of these provide interfaces to a RESTful API written by the data publishers while a few just wrap the data set inside the package. Some of the new packages are only very simple, one function ...

4879 sym 8 img

A gentle introduction to parallel computing in R

19.01.2016

by John Mount Ph.D., Data Scientist at Win-Vector LLC Let's talk about the use and benefits of parallel computation in R. IBM's Blue Gene/P massively parallel supercomputer (Wikipedia). “Parallel computing is a type of computation in which many calculations are carried out simultaneously.” Wikipedia quoting: Gottlieb, Allan; Almasi, George S. (1...

2986 sym 2 img

Getting Started with Markov Chains: Part 2

22.01.2016

by Joseph Rickert In a previous post, I showed how some elementary properties of discrete time Markov Chains could be calculated, mostly with functions from the markovchain package. In this post, I would like to show a little bit more of the functionality available in that package by fitting a Markov Chain to some data. In this first block of code, I...

6248 sym 4 img