Publications by emaasit

Installing and Starting SparkR Locally on Windows OS and RStudio

26.07.2015

Introduction: With the recent release of Apache Spark 1.4.1 on July 15, 2015, I wanted to write a step-by-step guide to help new users get up and running with SparkR locally on a Windows machine using the command shell and RStudio. SparkR provides an R frontend to Apache Spark and, using Spark’s distributed computation engine, allows R users to run...


Interview with a Data Scientist (Hadley Wickham)

02.08.2015

Originally posted on “Models are illuminating and wrong”: I recently interviewed Hadley Wickham, the creator of ggplot2 and a famous R Stats person. He works for RStudio, and his job is to work on open source software aimed at data geeks. Hadley is famous for his contributions to data science tooling and inspires a lot of other languages! I include ...


Interactive Data Science with R in Apache Zeppelin Notebook

16.11.2015

Introduction: The objective of this blog post is to help you get started with the Apache Zeppelin notebook for your R data science requirements. Zeppelin is a web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive, and collaborative documents with Scala (with Apache Spark), Python (with Apache Spark), ...


Using Apache SparkR to Power Shiny Applications: Part I

08.12.2015

This post was first published on SparkIQ Labs’ blog and re-posted on my personal blog. Introduction: The objective of this blog post is to demonstrate how to use Apache SparkR to power Shiny applications. I have been curious about what the use cases for a “Shiny-SparkR” application would be and how to develop and deploy such an app. SparkR i...


Tracking ggplot2 Extensions

01.02.2016

Introduction: The purpose of this blog post is to inform R users of a website that I created to track and list ggplot2 extensions, available at http://ggplot2-exts.github.io. The site helps other R users easily find ggplot2 extensions that are coming in “fast and furious” from the R community. If you have d...


Webinar: Getting Started with Spatial Data Analysis with R

14.02.2016

I am glad to announce that I shall be presenting a live webinar with Domino Data Labs on February 24, 2016, from 11:00 to 11:30 AM PST: Getting Started with Spatial Data Analysis with R. If you are interested, or know someone interested, in learning how to manipulate spatial and spatial-temporal data with R, please send them along. Here is the abs...
