Publications by David Smith

Choosing Priors: Double-Yolk Bayesian Egg

06.03.2018

by Subhadeep (Deep) Mukhopadhyay and Douglas Fletcher, Department of Statistical Science, Temple University. Bayesians and Frequentists have long been ambivalent toward each other. The concept of “Prior” remains the center of this 250-year-old tug-of-war: frequentists view the prior as a weakness that can cloud the final inference, whereas Baye...

7019 sym R (877 sym/7 pcs) 10 img

R data concepts, for Excel users

07.03.2018

Excel users starting to use R likely have some established concepts about data: where it's stored, how functions apply to data, etc. In general, R does things differently to Excel (or any spreadsheet, in fact). In a useful guide, Steph de Silva from Rex Analytics explains the concepts of data management in R and how they differ from Excel, which ...

1533 sym 2 img

R rises to #12 in Redmonk language rankings

13.03.2018

In the latest Redmonk language rankings, R has risen to the #12 position, up from #14 in the June 2017 rankings. (Python remains steady in the #3 position.) The Redmonk rankings are based on activity in StackOverflow (as a proxy for user engagement) and Github (as a proxy for developer engagement). Here's the chart from January 2018 of Github pop...

2104 sym 2 img

In case you missed it: February 2018 roundup

14.03.2018

In case you missed them, here are some articles from February of particular interest to R users. The R Consortium opens a new round of grant applications for R-related user groups and projects, and has issued US$0.5M in grants to date for R-related projects and events. Microsoft R Client 3.4.3 and Microsoft ML Server 9.3, both built with R 3.4.3...

2004 sym

R 3.4.4 released

15.03.2018

R 3.4.4 has been released, and binaries for Windows, Mac and Linux are now available for download on CRAN. This update (codenamed “Someone to Lean On”, likely a Peanuts reference, though I couldn't find which one with a quick search) is a minor bugfix release, and shouldn't cause any compatibility issues with scripts or packages written for p...

1055 sym

R and Docker

20.03.2018

If you regularly have to deal with specific versions of R, or different package combinations, or getting R set up to work with other databases or applications then, well, it can be a pain. You could dedicate a special machine for each configuration you need, I guess, but that's expensive and impractical. You could set up virtual machines in the c...

3409 sym 4 img

The most prolific package maintainers on CRAN

22.03.2018

During a discussion with some other members of the R Consortium, the question came up: who maintains the most packages on CRAN? DataCamp maintains a list of the most active maintainers by downloads, but in this case we were interested in the total number of packages by maintainer. Fortunately, this is pretty easy to figure out thanks to the CRAN reposito...

1596 sym R (609 sym/1 pcs) 2 img

Generate image captions with the Computer Vision API

27.03.2018

The Azure Computer Vision API can extract all sorts of interesting information from images — tags describing objects found in the images, locations of detected faces, and more — but today I want to play around with just one: caption generation. I was inspired by @picdescbot on Twitter, which selects random images from Wikimedia Commons and g...

4594 sym R (398 sym/3 pcs) 8 img

BotRNot: An R app to detect Twitter bots

29.03.2018

Twitter's bot problem is well documented, influencing discourse on divisive topics like politics and civil rights. But it's getting harder and harder to spot such nefarious bots, who often borrow biographies and tweets from real (and often stolen) profiles to evade detection. (The New York Times recently published an outstanding feature on bots...

2029 sym 2 img

Use Python functions and modules in R with the "reticulate" package

30.03.2018

Since its inception over 40 years ago, when S (R's predecessor) was just a sketch on John Chambers' wall at Bell Labs, R has always been a language for providing interfaces. I was reminded of this during Dirk Eddelbuettel's presentation at the Chicago R User Group meetup last night, where he enumerated Chambers' three principles behind its desig...

2567 sym 2 img