Publications by matloff

Snowdoop, Part II

07.12.2014

In my last post, I questioned whether the fancy Big Data processing tools such as Hadoop and Spark are really necessary for us R users.  My argument was that (a) these tools tend to be difficult to install and configure, especially for non-geeks; (b) the tools require learning new computation paradigms and function calls; and (c) one should be a...

3416 sym R (1246 sym/2 pcs) 4 img

New Package: partools

15.12.2014

I mentioned last week that I would be putting together a package, based in part on my posts on Snowdoop.  I’ve now done so, in a package named partools, with the name alluding to the fact that its tools are intended for use with the cluster-based part of R’s parallel package.  The main ingredients are: Various code snippets to facilitate parallel co...

945 sym 4 img

More Snowdoop Coming

16.12.2014

In spite of the banter between Yihui and me, I’m glad to hear that he may be interested in Snowdoop, as are some others.  I’m quite busy this week (finishing writing my Parallel Computation for Data Science book, and still have a lot of Fall Quarter grading to do 🙂 ), but you’ll definitely be hearing more from me on Snowdoop and partoo...

1145 sym 4 img

Snowdoop/partools Update

27.12.2014

I’ve put together an updated version of my partools package, including Snowdoop, an alternative to MapReduce algorithms.  You can download it here, version 1.0.1. To review:  The idea of Snowdoop is to create your own file chunking, rather than having something like Hadoop do it for you, and then to use ordinary R coding to perform parallel op...

1615 sym 4 img

Snowdoop/partools Package Now on CRAN

03.01.2015

I’ve now placed the partools package, including Snowdoop, on CRAN.  No major new functions since my last posting, but the existing functions have been made more versatile and convenient, and the documentation is now more detailed, with more examples and so on.  I do have more functions planned. It is all platform independent, except for the d...

846 sym 4 img

Debugging Parallel Code with dbs()

04.01.2015

I mentioned yesterday that my partools package is now on CRAN.  A number of people have expressed interest in the Snowdoop section, but in this post I want to call attention to the dbs() debugging tool in the package, useful for debugging code written for the portion of R’s parallel library that came from the old snow package. I like to contin...

4348 sym R (42 sym/2 pcs) 6 img

OpenMP Tutorial, with R Interface

17.01.2015

Almost any PC today is multicore.  Dual-core is standard, quad-core is easily attainable for the home, and larger systems, say 16-core, are easily within reach of even smaller research projects. In addition, large multicore systems can be “rented” on Amazon EC2 and so on. The most popular way to program on multicore machines is to use OpenMP...

7225 sym Python (4171 sym/5 pcs) 4 img

GPU Tutorial, with R Interfacing

24.01.2015

You’ve heard that graphics processing units (GPUs) can bring big increases in computational speed.  While GPUs cannot speed up work in every application, the fact is that in many cases they can indeed provide very rapid computation.  In this tutorial, we’ll see how this is done, both in passive ways (you write only R), and in more direc...

11891 sym R (3790 sym/8 pcs) 4 img

Tutorial on High-Performance Computing in R

03.02.2015

I wanted to call your attention to what promises to be an outstanding tutorial on High-Performance Computing (HPC) in R, presented in Web streaming format. My Rth package coauthor Drew Schmidt, who is also one of the authors of the pbdR package, will be one of the presenters.  Should be very interesting and useful.

718 sym 4 img

My New Book and Other Matters

22.05.2015

I haven’t posted for a while, so here are some news items: My new book, Parallel Computation for Data Science, will be out in June or July. I believe it will be useful to anyone doing computationally intensive work. After a few months of being busy with the book and other things, I have returned to my Snowdoop project and my associated package, p...

1095 sym 4 img