Publications by Michele Usuelli

A Big Data introduction

05.06.2013

Since R holds its data in the computer's RAM, it can handle only rather small data sets. Nevertheless, some packages allow it to handle larger volumes, and the best solution is to connect R with a Big Data environment. This post introduces some Big Data concepts that are fundamental to understanding how R can work in this environment. Afterwards, some...

3552 sym

A possibility for use R and Hadoop together

08.07.2013

As mentioned in the previous article, one way to deal with some Big Data problems is to integrate R within the Hadoop ecosystem. Therefore, it’s necessary to have a bridge between the two environments. This means that R should be capable of handling data that are stored in the Hadoop Distributed File System (HDFS). In order to proce...

3393 sym

An example of MapReduce with rmr2

02.09.2013

R can be connected with Hadoop through the rmr2 package. The core of this package is the mapreduce() function, which allows you to write custom MapReduce algorithms. The aim of this article is to show how it works and to provide an example. As mentioned in the previous article, the R mapreduce() function requires some arguments, but now we will deal w...

2990 sym R (385 sym/2 pcs) 2 tbl

R framework with Object-Oriented Programming

13.02.2014

Data analysis deals with different kinds of data. For instance, supermarket sales data can include: a transactional table with customer ID, item ID, and date of purchase; an item table with the item ID and its price; and a customer table with the customer ID and demographic details (age, gender). In this example data are tables with different s...

2689 sym

R AND OOP – defining new classes

12.03.2014

My previous article shows an example in which data analysis requires a structured framework with R and OOP. This article describes in more detail how to build that framework. Using OOP means creating new data structures and defining their methods, which are functions performing specific tasks on the object. Defin...

2317 sym R (322 sym/4 pcs) 4 tbl

Announcing the Publication of R Machine Learning Essentials

10.11.2014

R Machine Learning Essentials will be published soon. The target audience is readers who want to get familiar with machine learning quickly. The only requirement is knowing a bit about data analysis and/or coding concepts. This book is not just a tutorial. Its aim is not to teach how to build very sophisticated machine learning solutions. It do...

2249 sym 2 img