Publications by Brian Lee Yung Rowe
Comparing #rstats and #pdf15 intraday hashtag streams
This post is a lecture for IS624 Predictive Analytics, which is part of the CUNY Master’s program in Data Analytics. In class, we discussed the characteristics of the #rstats hashtag, and its apparent randomness at a minute frequency. We surmised that numerous factors contribute to this, such as multiple discussions, time zone differences, and ...
4769 sym 16 img
From functional programming to MapReduce in R
The MapReduce paradigm has long been a staple of big data computational strategies. However, properly leveraging MapReduce can be a challenge, even for experienced R users. To get the most out of MapReduce, it is helpful to understand its relationship to functional programming. In this post I discuss how MapReduce relates to the underlying higher...
10209 sym R (2689 sym/8 pcs) 6 img
Webinar: Fostering a Data-Driven Culture with Interactive Data Analysis in Slack
I’ll be giving a presentation on Thursday, 8 October on Panoptez, the interactive data analysis platform I’ve been developing. Panoptez integrates with Slack and other chat platforms, meaning you have the power of a programming language from within Slack! This makes it easy to create and collaborate on an analysis or visualization, since the ...
1148 sym 4 img
SparkR quick start that works
If you’re following along with the SparkR Quick Start, you’ll notice that the instructions are not consistent with a more recent build of Spark. Here are instructions that work for SparkR version 1.4.1 on Linux. YMMV on Spark 1.5. Now that SparkR has been promoted to the core Spark library, it lives in Spark’s bin directory with the other execut...
3718 sym R (655 sym/9 pcs) 4 img
Data-Driven Weekly (2015-11-09)
This is the initial installment of a weekly post on notable articles on the data-driven trend. Topics covered include self-service analytics, data literacy, democratization of data and analytics, and how these subjects foster insights, innovation, and operational excellence. SaaS Metrics: Software as a service (SaaS) has changed the business game ...
3408 sym 4 img
The Data-Driven Weekly #1.2
Last week saw a number of exciting announcements from the big data and machine learning space. They show that there are still lots of problems to solve in 1) working with and deriving insights from big data, and 2) integrating insights into business processes. TensorFlow: Probably the biggest (data) headline was that Google open sourced Tenso...
5382 sym 4 img
The Data-Driven Weekly #1.3
This week we explore two different themes related to data. The first is how to value big data. The second looks at approaches for quantifying individual experience within the context of gender discrimination. Treating Big Data as an Asset: It’s generally taken for granted that data is a strategic asset. Every salesperson knows this, which is w...
4109 sym 8 img
The Data-Driven Weekly #1.4
With the holiday shopping season officially kicked off, sales data is on everyone’s mind. But that’s not the only data people are talking about. Talk of data brokers and consumer tracking is becoming more commonplace, as is a certain backlash against big data. This week, we highlight some articles touching on these subjects and round out the ...
6542 sym 8 img
Modeling data with functional programming – lists
The latest update of my book includes a chapter on lists. This chapter explores the list data type and how to effectively use it as a general purpose data structure. We also look at the mechanics of do.call and its relationship with lists. The latter half of the chapter transitions to practical applications. Not only do we continue the Ebola situ...
1185 sym 4 img
The Data-Driven Weekly #1.5
This week, we continue the parallel themes of deep learning and natural language processing. Last week I mentioned some papers that use deep learning for NLP. In deep learning, these tasks are modeled as a prediction problem, which is why such an extensive training set is required. I think it’s important to remember this amongst the flurry of s...
6338 sym 6 img