Publications by hrbrmstr

The Fix Is In: Finding infix functions inside contributed R package “utilities” files

29.05.2018

Regular readers will recall the “utility belt” post from back in April of this year. This is a follow-up to a request made asking for a list of all the % infix functions in those files. We’re going to: collect up all of the sources; parse them; find all the definitions of % infix functions; and write them to a file. We’ll start by grabbing the ...

1932 sym R (5274 sym/6 pcs)

OS Secrets Exposed: Extracting Extended File Attributes and Exploring Hidden Download URLs With The xattrs Package

30.05.2018

Most modern operating systems keep secrets from you in many ways. One of these ways is by associating extended file attributes with files. These attributes can serve useful purposes. For instance, macOS uses them to identify when files have passed through the Gatekeeper or to store the URLs of files that were downloaded via Safari (though most ot...

4789 sym R (5593 sym/7 pcs) 2 img

Hello, Dorling! (Creating Dorling Cartograms from R Spatial Objects + Introducing Prism Skeleton)

03.06.2018

NOTE: There is some iframed content in this post and you can bust out of it if you want to see the document in a full browser window. Also, apologies for some lingering GitHub links. I’m waiting for all the repos to import into other services and haven’t had time to set up my own self-hosted public instance of any community-usable git-ish e...

4243 sym Python (531 sym/1 pcs) 4 img

Making World Tile Grid-Grids

07.06.2018

A colleague asked if I would blog about how I crafted the grid of world tile grids in this post and I accepted the challenge. The technique isn’t too hard as it just builds on the initial work by Jon Schwabish and a handy file made by Maarten Lambrechts. The Premise: For this particular use-case, I sifted through our internet scan data and class...

2979 sym R (6096 sym/5 pcs) 4 img

Running RStudio (1.2) Background Jobs

09.06.2018

The forthcoming RStudio 1.2 release has a new “Jobs” feature for running and managing background R tasks. I did a series of threaded screencaps on Twitter but that doesn’t do the feature justice. So I threw together a quick ‘splainer on how to run R and Python (despite RStudio not natively supporting Python) code in the background while you...

784 sym

Build httr Functions Automagically from Manual Browser Requests with the middlechild Package

15.06.2018

You can catch a bit of the @rOpenSci 2018 Unconference experience at home with this short-ish ‘splainer video on how to use the new middlechild package (https://github.com/ropenscilabs/middlechild) & mitmproxy to automagically create reusable httr verb functions from manual browser form interactions.

697 sym

Freeing PDF Data to Account for the Unaccounted

02.07.2018

I’ve mentioned @stiles before on the blog but for those new to my blatherings, Matt is a top-notch data journalist with the @latimes and currently stationed in South Korea. I can only imagine how much busier his life has gotten since that fateful, awful November 2016 Tuesday, but I’m truly glad his eyes, pen and R console are covering the imp...

2849 sym R (6119 sym/4 pcs) 2 img

Visualizing macOS App Usage with a Little Help from osqueryr & mactheknife

06.07.2018

Both my osqueryr and mactheknife packages have had a few updates and I wanted to put together a fun example (it being Friday, and all) for what you can do with them. All my packages are now on GitHub and GitLab and I’ll be maintaining them on both so I can accommodate the comfort-level of any and all contributors but will be prioritizing issues...

4912 sym R (6079 sym/5 pcs) 4 img

Alleviating AWS Athena Aggravation with Asynchronous Assistance

14.07.2018

I’ve blogged about how to use Amazon Athena with R before and if you are a regular Athena user, you’ve likely run into a situation where you prepare a dplyr chain, fire off a collect() and then wait. And, wait. And, wait. And, wait. Queries that take significant processing time or have large result sets do not play nicely with the provided OD...

2234 sym R (6840 sym/2 pcs)

A new ‘boto3’ Amazon Athena client wrapper with dplyr async query support

20.07.2018

A previous post explored how to deal with Amazon Athena queries asynchronously. The function presented is a beast, though it is on purpose (to provide options for folks). In reality, nobody really wants to use rJava wrappers much anymore and dealing with icky Python library calls directly just feels wrong, plus Python functions often return truly...

4671 sym R (2956 sym/8 pcs)