Publications by Hong Ooi
Introducing the dplyrXdf package
The dplyr package is a popular toolkit for data transformation and manipulation. Over the last year and a half, dplyr has become a hot topic in the R community, for the way in which it streamlines and simplifies many common data manipulation tasks. Out of the box, dplyr supports data frames, data tables (from the data.table package), and the foll...
9280 sym R (4087 sym/6 pcs)
Updated dplyrXdf package brings data munging with pipes to Xdf files
by Hong Ooi, Sr. Data Scientist, Microsoft I’m pleased to announce the release of version 0.62 of the dplyrXdf package, a backend to dplyr that allows the use of pipeline syntax with Microsoft R Server’s Xdf files. This update adds a new verb (persist), fills some holes in support for dplyr verbs, and fixes various bugs. The persist verb A ...
3030 sym R (1646 sym/5 pcs)
glmnetUtils: quality of life enhancements for elastic net regression with glmnet
The glmnetUtils package provides a collection of tools to streamline the process of fitting elastic net models with glmnet. I wrote the package after a couple of projects where I found myself writing the same boilerplate code to convert a data frame into a predictor matrix and a response vector. In addition to providing a formula interface, it al...
5521 sym R (1069 sym/4 pcs)
dplyrXdf 0.90 now available
by Hong Ooi, Sr. Data Scientist, Microsoft Version 0.90 of the dplyrXdf package has just been released. dplyrXdf is a package that brings dplyr pipelines and data transformation verbs to Microsoft R Server’s xdf files. This version includes several changes, mostly to address performance and efficiency concerns, which I’ll detail below. ...
5194 sym R (1309 sym/8 pcs)
dplyrXdf 0.10.0 beta prerelease
I’m happy to announce that version 0.10.0 beta of the dplyrXdf package is now available. You can get it from GitHub: install_github("RevolutionAnalytics/dplyrXdf", build_vignettes=FALSE) This is a major update to dplyrXdf that adds the following features: Support for the tidyeval framework that powers the latest version of dplyr Works with Spa...
5295 sym R (628 sym/5 pcs)
Announcing dplyrXdf 1.0
I’m delighted to announce the release of version 1.0.0 of the dplyrXdf package. dplyrXdf began as a simple (relatively speaking) backend to dplyr for Microsoft Machine Learning Server/Microsoft R Server’s Xdf file format, but has now become a broader suite of tools to ease working with Xdf files. This update to dplyrXdf brings the following n...
5305 sym R (1683 sym/4 pcs)
AzureR: R packages to control Azure services
by Hong Ooi, senior data scientist, Microsoft Azure This post is to announce a new family of packages we have developed as part of the CloudyR project for talking to Azure from R: AzureR. As background, some of you may remember the AzureSMR package, which was written a few years back as an R interface to Azure. AzureSMR was very successful and ga...
3729 sym R (1072 sym/2 pcs)
AzureRMR: an R interface to Azure Resource Manager
In a previous article I announced AzureR, a new family of packages for working with Azure from R. This article goes into more detail on how you can use AzureRMR, the base package of the AzureR family, to manage resources with Azure Resource Manager. Before you begin The first thing you have to do is create a service principal. This is a securit...
7056 sym R (1568 sym/10 pcs) 4 img
AzureVM: managing virtual machines in Azure
This is the next article in my series on AzureR, a family of packages for working with Azure in R. I’ll give a short introduction on how to use AzureVM to manage Azure virtual machines, and in particular Data Science Virtual Machines (DSVMs). Creating a VM Creating a VM is as simple as using the create_vm method, which is available as part of ...
6196 sym R (1869 sym/8 pcs)
How to deploy a predictive service to Kubernetes with R and the AzureContainers package
It's easy to create a function in R, but what if you want to call that function from a different application, with the scale to support a large number of simultaneous requests? This article shows how you can deploy an R fitted model as a Plumber web service in Kubernetes, using Azure Container Registry (ACR) and Azure Kubernetes Service (AKS). W...
5578 sym R (3309 sym/11 pcs)