Publications by Jeroen Ooms
Using xml schema and xslt in R
This week an update for xml2 and a new xslt package have appeared on CRAN. A full announcement for xml2 version 1.1 will appear on the RStudio blog. This post explains xml validation (via xsd schema) and xml transformation (via xslt stylesheets), which have been added in this release. XML schemas and stylesheets are not exactly new; both xslt 1.1 ...
4048 sym R (2340 sym/8 pcs)
Release mongolite 1.0
After 2.5 years of development, version 1.0 of the mongolite package has been released to CRAN. The package is now stable, well documented, and will soon be submitted for peer review to be onboarded in the rOpenSci suite. MongoDB in R and mongolite: I started working on mongolite in September 2014, and it was first announced at the r...
4435 sym R (586 sym/3 pcs) 2 img
New rOpenSci Packages for Text Processing in R
Textual data and natural language processing are still a niche domain within the R ecosystem. The NLP task view gives an overview of existing work; however, a lot of basic infrastructure is still missing. At the rOpenSci text workshop in April we discussed many ideas for improving text processing in R, which revealed several core areas that need im...
3589 sym R (2537 sym/7 pcs)
Announcing OpenCPU 2.0: Building and Deploying Scalable R Apps and Services
OpenCPU 2.0 provides the most robust system available today for building and deploying R-based apps and services. The server exposes a simple HTTP API for calling R functions and scripts and for managing data, which provides a very solid basis for integrating R into any environment. The OpenCPU 2.0 cloud server naturally scales up to many concurre...
7378 sym R (1016 sym/7 pcs) 8 img
Magick 1.0: ✨ Advanced Graphics and Image Processing in R
Last week, version 1.0 of the magick package appeared on CRAN: an ambitious effort to modernize and simplify high quality image processing in R. This R package builds upon the Magick++ STL which exposes a powerful C++ API to the famous ImageMagick library. The best place to start learning about magick is the vignette which gives a brief overview...
5735 sym R (1414 sym/7 pcs) 18 img
Tesseract and Magick: High Quality OCR in R
Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google's OCR library Tesseract. install.packages("tesseract") The new version ships with the latest libtesseract 3.05.01 on Windows and macOS. Furthermore, it includes enhancements for managing language data and using tesseract together with the ...
1722 sym R (2674 sym/5 pcs) 2 img
Spelling 1.0: quick and effective spell checking in R
The new rOpenSci spelling package provides utilities for spell checking common document formats including latex, markdown, manual pages, and DESCRIPTION files. It also includes tools especially for package authors to automate spell checking of R documentation and vignettes. Spell Checking Packages: The main purpose of this package is to quickly fi...
2289 sym R (1277 sym/5 pcs)
The writexl package: zero dependency xlsx writer for R
We have started working on a new rOpenSci package called writexl. This package wraps the very powerful libxlsxwriter library which allows for exporting data to Microsoft Excel format. The major benefit of writexl over other packages is that it is completely written in C and has absolutely zero dependencies. No Java, Perl or Rtools are required. G...
1685 sym R (624 sym/3 pcs)
Changes to Internet Connectivity in R on Windows
This week we released version 3.0 of the curl R package to CRAN. You may have never used this package directly, but curl provides the foundation for most HTTP infrastructure in R, including httr, rvest, and all packages that build on it. If R packages need to go online, chances are traffic is going via curl. This release introduces an important c...
4146 sym 2 img
Why Use Docker with R? A DevOps Perspective
There have been several blog posts going around about why one would use Docker with R. In this post I’ll try to add a DevOps point of view and explain how containerizing R is used in the context of the OpenCPU system for building and deploying R servers. Has anyone in the #rstats world written really well about the *why* of their use of Docker,...
3711 sym R (248 sym/4 pcs) 2 img