Publications by Henrik Bengtsson
TO STUDENTS: matrixStats for Google Summer of Code
We are pleased to announce our proposal ‘Subsetted and parallel computations in matrixStats’ for Google Summer of Code. The project is aimed at a student with experience in R and C. It runs for three months, and the student is paid 5500 USD by Google. Students from (almost) all over the world can apply. Application deadline ...
1403 sym 2 img
PERFORMANCE: Calling R_CheckUserInterrupt() every 256 iterations is actually faster than every 1,000,000 iterations
If your native code takes more than a few seconds to finish, it is a nice courtesy to the user to check for user interrupts (Ctrl-C) once in a while, say, every 1,000 or 1,000,000 iterations. The C-level API of R provides R_CheckUserInterrupt() for this (see 'Writing R Extensions' for more information on this function). Here's what the code woul...
3453 sym R (1140 sym/4 pcs) 2 img 1 tbl
Milestone: 7000 packages on CRAN
Another 1000 packages were added to CRAN, which took less than 9 months. Today (August 12, 2015), the Comprehensive R Archive Network (CRAN) package page reports: “Currently, the CRAN package repository features 7002 available packages.” While the previous 1000 packages took 355 days, going from 6000 to 7000 packages took 286 da...
1737 sym 1 img
matrixStats: Optimized subsetted matrix calculations
The matrixStats package provides highly optimized functions for computing common summaries over rows and columns of matrices. In a previous blog post, I showed that, instead of using apply(X, MARGIN=2, FUN=median), we can speed up calculations dramatically by using colMedians(X). In the most recent release (version 0.50.0), matrixStats has bee...
4687 sym R (674 sym/8 pcs) 2 img 1 tbl
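As a rough illustration of the matrixStats entry above, the sketch below compares apply(X, MARGIN=2, FUN=median) with colMedians(X), and then restricts the calculation to a subset. It assumes the matrixStats package (version 0.50.0 or later) is installed and that the subsetting is done through the `rows` and `cols` arguments; the matrix and index vectors are made up for the example.

```r
library(matrixStats)

set.seed(42)
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)  # toy data, illustration only

# Baseline: column medians via apply()
m_apply <- apply(X, MARGIN = 2, FUN = median)

# Same summary via the optimized matrixStats function
m_fast <- colMedians(X)

# Subsetted calculation: restrict to some rows and columns without
# first creating the intermediate matrix X[rows, cols] ourselves
rows <- 1:10
cols <- c(2, 4)
m_sub <- colMedians(X, rows = rows, cols = cols)

# Reference result computed on an explicitly subsetted copy
m_ref <- colMedians(X[rows, cols])
```

The two subsetted results should agree; the difference is only whether the intermediate copy is made in R code.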
A Future for R: Slides from useR 2016
Unless you count DSC 2003 in Vienna, last week's useR conference at Stanford was my very first time at useR. It was a great event. It was awesome to meet our lovely and vibrant R community in real life, which we otherwise only get to know from online interactions, and of course it was very nice to meet old friends and make new ones. The future is p...
2129 sym R (261 sym/1 pcs) 2 img 1 tbl
Remote Processing Using Futures
A new version of the future package has been released and is available on CRAN. With futures, it is easy to write R code once, which the user can later choose to parallelize using whatever resources s/he has available, e.g. a local machine, a set of local notebooks, a set of remote machines, or a high-end compute cluster. The future provides com...
4639 sym R (673 sym/2 pcs) 4 img 2 tbl
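To make the "write once, choose the backend later" idea in the entry above concrete, here is a minimal sketch, assuming the future package is installed. The helper `sum_of_squares()` is hypothetical and exists only for this example; `future()`, `value()`, and `plan()` are the package's own functions. Only the local `sequential` and `multisession` backends are shown, not a remote setup.

```r
library(future)

# Written once: nothing here says how or where the code is evaluated.
# sum_of_squares() is a hypothetical helper for illustration.
sum_of_squares <- function(n) {
  f <- future({ sum((1:n)^2) })  # create the future
  value(f)                       # wait for it to resolve, return the result
}

# The user picks the backend separately from the code above
plan(sequential)                 # evaluate in the current R session
v_seq <- sum_of_squares(10)

plan(multisession, workers = 2)  # evaluate in background R sessions
v_par <- sum_of_squares(10)

plan(sequential)                 # shut down the background workers
```

The function body is identical under both plans; only the `plan()` call changes.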
Start me up
The startup package makes it easy to control your R startup processes and to share part of your startup settings with others (e.g. as a public Git repository) while keeping secret parts to yourself. Instead of having long and windy .Renviron and .Rprofile startup files, you can split them up into short specific files under corresponding .Renviro...
1760 sym R (770 sym/2 pcs)
future: Reproducible RNGs, future_lapply() and more
future 1.3.0 is available on CRAN. With futures, it is easy to write R code once, which the user can choose to evaluate in parallel using whatever resources s/he has available, e.g. a local machine, a set of local machines, a set of remote machines, a high-end compute cluster (via future.BatchJobs and soon also future.batchtools), or in the clo...
5545 sym R (811 sym/1 pcs) 2 img 1 tbl
doFuture: A universal foreach adaptor ready to be used by 1,000+ packages
doFuture 0.4.0 is available on CRAN. The doFuture package provides a universal foreach adaptor enabling any future backend to be used with the foreach() %dopar% { ... } construct. As shown below, this will allow foreach() to parallelize not only on multiple cores, multiple background R sessions, and ad-hoc clusters, but also on cloud-based clusters ...
4312 sym R (1782 sym/1 pcs) 2 img 1 tbl
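A small sketch of how the adaptor described in the doFuture entry above is typically wired up, assuming the doFuture, foreach, and future packages are installed. The loop body is a toy computation chosen for the example, and `multisession` with two workers is just one possible backend.

```r
library(doFuture)   # also attaches foreach and future

registerDoFuture()               # make %dopar% dispatch to the future framework
plan(multisession, workers = 2)  # backend choice; any future backend could go here

# A regular foreach loop; the iterations are resolved via futures
y <- foreach(i = 1:3) %dopar% {
  i^2
}

plan(sequential)                 # shut down the background workers
```

Switching to another backend should only require changing the `plan()` line, not the foreach() code.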