Publications by Randy Zwitch
Real-time Reporting with the Adobe Analytics API
Starting with version 1.3.1 of RSiteCatalyst, you can access the real-time reporting capabilities of the Adobe Analytics API through a familiar R interface. Here’s how to get started. GetRealTimeConfiguration: Before using the real-time reporting capabilities of Adobe Analytics, you first need to indicate which metrics and elements you ar...
Building JSON in R: Three Methods
When I set out to build RSiteCatalyst, I had a few major goals: learn R, build a CRAN-worthy package, and learn the Adobe Analytics API. As I reflect on how the package has evolved over the past two years, I think the most valuable thing I learned was how to deal with JSON (and strings in general). JSON is ubiquitous as a...
Five Hard-Won Lessons Using Hive
I’ve been spending a ton of time lately on the data engineering side of ‘data science’, so I’ve been writing a lot of Hive queries. Hive is a great tool for querying large amounts of data, without having to know very much about the underpinnings of Hadoop. Unfortunately, there are a lot of things about Hive (version 0.12 and before) that ...
Using Julia As A ‘Glue’ Language
While much of the focus in the Julia community has been on the performance aspects of Julia relative to other scientific computing languages, Julia is also perfectly suited to ‘glue’ together multiple data sources/languages. In this blog post, I will cover how to create an interactive plot using Gadfly.jl, by first preparing the data using Ha...
Maybe I Don’t Really Know R After All
Lately, I’ve been feeling that I’m spreading myself too thin in terms of programming languages. At work, I spend most of my time in Hive/SQL, with the occasional Python for my smaller data. I really prefer Julia, but I’m alone at work on that one. And since I maintain a package on CRAN (RSiteCatalyst), I frequently spend my evenings bug ...
RSiteCatalyst Version 1.4 Release Notes
It felt like it would never happen, but RSiteCatalyst v1.4 is now available on CRAN! There are numerous changes in this version of the package, so unlike previous posts, there won’t be any code examples. THIS VERSION IS ONE BIG BREAKING CHANGE: While not the most important improvement, it can’t be stressed enough that migrating to v1.4 of RSit...
Visualizing Website Pathing With Network Graphs
Version 1.4 of RSiteCatalyst was released last week, making it possible to get site pathing information directly within R. It’s now easy to create impressive-looking network graphs from your Adobe Analytics data using RSiteCatalyst and d3Network. In this blog post, I will cover simple and force-directed network graphs, which show the p...
Visualizing Website Pathing With Sankey Charts
In my prior post on visualizing website structure using network graphs, I noted that network graphs show the pairwise relationships between two pages (in a bi-directional manner). However, if you want to analyze how your visitors are pathing through your site, you can visualize your data using a Sankey chart. Visualizing Single Page-to-Nex...
Evaluating BreakoutDetection
A couple of weeks ago, Twitter open-sourced their BreakoutDetection package for R, a package designed to determine shifts in time-series data. The Twitter announcement does a great job of explaining the main technique for detection (E-Divisive with Medians), so I won’t rehash that material here. Rather, I wanted to see how this package works re...
RSiteCatalyst Version 1.4.1 Release Notes
Changes: Version 1.4.1 of RSiteCatalyst is now available on CRAN. There were a handful of bug fixes and new features added, including: Fixed bug in QueueRanked function where only 10 results were returned when requesting multiple element reports; the function now returns up to 50,000 per breakdown (API limit). Created better error message to inform us...