Publications by hrbrmstr
Rpad Domain Repurposed To Deliver Creepy (and potentially malicious) Content
I was about to embark on setting up a background task to sift through R package PDFs for traces of functions that “omit NA values” as a surprise present for Colin Fay and Sir Tierney: [Please RT] #RStats folks, @nj_tierney & I need your help for {naniar}! When does R silently drop/omit NA? https://t.co/V5elyGcG8Z pic.twitter.com/VScLXFCl2n — C...
4584 sym R (13460 sym/10 pcs) 10 img
New CRAN Package Announcement: splashr
I’m pleased to announce that splashr is now on CRAN. (That image was generated with splashr::render_png(url = "https://cran.r-project.org/web/packages/splashr/")). The package is an R interface to the Splash JavaScript rendering service. It works in a similar fashion to Selenium but is far more geared to web scraping and has quite a bit of po...
1471 sym 2 img
Readability Redux
I recently posted about using a Python module to convert HTML to usable text. Since then, a new package has hit CRAN dubbed htm2txt that is 100% R and uses regular expressions to strip tags from text. I gave it a spin so folks could compare some basic output, but you should definitely give htm2txt a try on your own conversion needs since each met...
2330 sym R (2021 sym/2 pcs)
Teasing Out Top Daily Topics with GDELT’s Television Explorer
Earlier this year, the GDELT Project released their Television Explorer that enabled API access to closed-caption text from television news broadcasts. They’ve done an incredible job expanding and stabilizing the API and just recently released “top trending tables” which summarise what the “top” topics and phrases are across news statio...
2298 sym R (2267 sym/4 pcs) 2 img
Revisiting Readability With RStudio
I’ve blogged about my in-development R package hgr before and it’s slowly getting to a CRAN release. There are two new features to it that are more useful in an interactive session than in a programmatic context. Since they build on each other, we’ll take them in order. New S3 print() Method: Objects created with hgr::just_the_facts() used...
2925 sym 10 img
It’s a FAKE (?)! Revisiting Trust In FOSS Ecosystems
I’ve blathered about trust before 1 2, but said blatherings were in a “what if” context. Unfortunately, the if has turned into a when, which begged for further blathering on a recent FOSS ecosystem cybersecurity incident. The gg_spiffy @thomasp85 linked to a post by the SK-CSIRT detailing the discovery and take-down of a series of malicious...
3702 sym R (351 sym/1 pcs) 4 img
Mapping Fall Foliage with sf
I was socially engineered by @yoniceedee into creating today’s post due to being prodded with this tweet: Where to see the best fall foliage, based on your location: https://t.co/12pQU29ksB pic.twitter.com/JiywYVpmno — Vox (@voxdotcom) September 18, 2017 Since there aren’t nearly enough sf and geom_sf examples out on the wild, wild #rstats ...
1224 sym R (3252 sym/1 pcs) 2 img
Pirating Web Content Responsibly With R
International Code Talk Like A Pirate Day almost slipped by without me noticing (September has been a crazy busy month), but it popped up in the calendar notifications today and I was glad that I had prepped the meat of a post a few weeks back. There will be no ‘rrrrrr’ abuse in this post, I’m afraid, but there will be plenty of R code. We...
5516 sym R (6275 sym/6 pcs) 8 img
Speeding Up Digital Arachnids
spiderbar, spiderbar Reads robots rules from afar. Crawls the web, any size; Fetches with respect, never lies. Look Out! Here comes the spiderbar. Is it fast? Listen bud, It's got C++ under the hood. Can you scrape, from a site? Test with can_fetch(), TRUE == alright Hey, there There goes the spiderbar. (Check the end of the post if you ...
2325 sym R (2098 sym/4 pcs) 6 img
SODD — StackOverflow Driven-Development
I occasionally hang out on StackOverflow and often use an answer as an opportunity to fill a package void for a particular need. docxtractr and qrencoder are two (of many) packages that were birthed from SO answers. I usually try to answer with inline code first then expand the functionality into a package (if warranted). Some make it to CRAN (li...
2542 sym