Publications by hrbrmstr
Use quick formula functions in purrr::map (+ base vs tidyverse idiom comparisons/examples)
I’ve converted the vast majority of my *apply usage over to purrr functions. In an attempt to make this a quick post, I’ll refrain from going into all the benefits of the purrr package. Instead, I’ll show just one thing that’s super helpful: formula functions. After seeing this Quartz article using a visualization to compare the frequency...
U.S. Drought Animations with the “Witch’s Brew” (purrr + broom + magick)
This is another purrr-focused post but it’s also an homage to the nascent magick package (R interface to ImageMagick) by @opencpu. We’re starting to see/feel the impact of the increasing drought up here in southern Maine. I’ve used the data from the U.S. Drought Monitor before on the blog, but they also provide shapefiles and this seemed l...
QuickLookR – A macOS QuickLook plugin for R Data files
I had tried to convert my data-saving workflows to feather but there have been issues with it supporting large files (that seem to be near resolution), so I’ve been continuing to use R Data files for local saving of processed/cleaned data. I make many of these files and sometimes I do it as a one-off effort, thinking that I’ll come back to it...
Counting [U.S.] Expatriation with R (a.k.a. a Decade of Desertion)
If you’re even remotely following the super insane U.S. 2016 POTUS circus election you’ve no doubt seen a resurgence of “if X gets elected, I’m moving to Y” claims by folks who are “anti” one candidate or another. The Washington Examiner did a story on last quarter’s U.S. expatriation numbers. I didn’t realize we had a departmen...
Survey on Data Science In Two Year Colleges
The ASA (American Statistical Association) has been working in collaboration with the ACM (Association for Computing Machinery) on developing a data science curriculum for Two Year Colleges. Part of this development is the need to understand the private-sector demand for two-year college data science graduates and the prevalence of the need to in...
Interacting With Amazon Athena from R
This is a short post for those looking to test out Amazon Athena with R. Amazon makes Athena available via JDBC, so you can use RJDBC to query data. All you need is their JAR file and some setup information. Here’s how to get the JAR file to the current working directory: URL <- 'https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1....
Minding the zoo[keeper] with R
I’ve been drafting a new R package — sergeant — to work with Apache Drill and have found it much easier to manage having Drill operating in a single node cluster vs drill-embedded mode (esp when I need to add a couple more nodes for additional capacity). That means running Apache Zookeeper and I’ve had the occasional need to ping the Zook...
Package update: longurl 0.3.0 is hitting CRAN mirrors
The longurl package has been updated to version 0.3.0 as a result of a bug report noting that the URL expansion API it was using went pay-for-use. Since this was the second time a short URL expansion service either went belly-up or had breaking changes the package is now completely client-side-based and a very thin, highly-focused wrapper around ...
sergeant: An R Boot Camp for Apache Drill
I recently mentioned that I’ve been working on a development version of an Apache Drill R package called sergeant. Here’s a lifted “TLDR” on Drill: Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files...
Pipes (%>%) Everywhere
An R user asked a question regarding whether it’s possible to have the RStudio pipe (%>%) shortcut (Cmd-Shift-M) available in other macOS applications. If you’re using Alfred then you can use this workflow for said task (IIRC this requires an Alfred license which is reasonably cheap). When you add it to Alfred you must edit it to make Cmd-Shi...