Publications by Joseph Rickert

R Package ‘smbinning’: Optimal Binning for Scoring Modeling


by Herman Jopia What is Binning? Binning is the term used in scoring modeling for what is also known in Machine Learning as Discretization, the process of transforming a continuous characteristic into a finite number of intervals (the bins), which allows for a better understanding of its distribution and its relationship with a binary variable. ...

4833 sym R (730 sym/1 pcs) 8 img

Review of "Hands-On Programming with R"


by Joseph Rickert There have been well over a hundred books on R published within the last ten years. Most of these texts, with titles like “Introduction Statistics with R” or “Time Series with R”, offer the reader a way to jump right in and perform some concrete statistical analysis using R’s myriad built-in functions and extensive visua...

5944 sym

Targeted Learning R Packages for Causal Inference and Machine Learning


by Sherri Rose, Assistant Professor of Health Care Policy, Harvard Medical School Targeted learning methods build machine-learning-based estimators of parameters defined as features of the probability distribution of the data, while also providing influence-curve or bootstrap-based confidence intervals. The theory offers a general template for creati...

6156 sym R (1157 sym/1 pcs) 2 img

Coarse Grain Parallelism with foreach and rxExec


by Joseph Rickert I have written several posts about the Parallel External Memory Algorithms (PEMAs) in Revolution Analytics’ RevoScaleR package, most recently about rxBTrees(), but I haven’t said much about rxExec(). rxExec() is not itself a PEMA, but it can be used to write parallel algorithms. Pre-built PEMAs such as rxBTrees(), rxLinMod...

5129 sym R (3254 sym/5 pcs)

Exploring San Francisco with choroplethrZip


by Ari Lamstein Introduction Today I will walk through an analysis of San Francisco Zip Code Demographics using my new R package choroplethrZip. This package creates choropleth maps of US Zip Codes and connects to the US Census Bureau. A choropleth is a map that shows boundaries of regions (such as zip codes) and colors those regions according to...

4604 sym 10 img

Where are the R users?


by Joseph Rickert A recent post by David Smith included a map that shows the locations of R user groups around the world. While it is exhilarating to see how R user groups span the globe, the map does not give any idea about the size of the community at each location. The following plot, constructed from information on the websites of the groups li...

2708 sym 2 img

RPowerLabs: Electric power system virtual laboratories online


by Ben Ubah, Founder, RPowerLabs No disregard to R's colleagues, R is pioneering the creation of online virtual electric power system laboratories via RPowerLABS. RPowerLABS is a project with the vision of deploying online a vast array of highly demanded power system simulations for teaching and research using R. It started as an attempt to a...

4140 sym 10 img

R User Group Meetings this week in the Bay Area and around the world


by Joseph Rickert Tracking R user group meetings is a good way to stay informed about what's happening in the R world. On Tuesday the Bay Area useR Group (BARUG) met at AdRoll in San Francisco. It was a mini-conference with 6 talks: Bryan Galvin, our host at AdRoll (many thanks for the pizza and beer), kicked off the evening by showing how his com...

3770 sym 2 img

R for more powerful clustering


by Vidisha Vachharajani, Freelance Statistical Consultant R showcases several useful clustering tools, but the one that seems particularly powerful is the marriage of hierarchical clustering with a visual display of its results in a heatmap. The term “heatmap” is often confusing, making most wonder – which is it? A “colorful visual represen...

4455 sym R (1374 sym/2 pcs) 4 img

The new science journalism and open science


by Joseph Rickert The New York Times is quietly changing the practice of science journalism. The Tuesday, April 21, 2015 article, Ebola Lying in Wait, reports on “A growing body of scientific clues – some ambiguous, other substantive” that the Ebola virus may have lain dormant in West African rain forest for years before igniting last year's...

2848 sym R (2369 sym/1 pcs)