Publications by Michele Usuelli
A Big Data introduction
Because R holds data in the computer's RAM, it can handle only relatively small data sets. Some packages allow it to process larger volumes, but the best solution is to connect R with a Big Data environment. This post introduces some Big Data concepts that are fundamental to understanding how R can work in this environment. Afterwards, some...
A possibility for using R and Hadoop together
As mentioned in the previous article, one way to deal with some Big Data problems is to integrate R within the Hadoop ecosystem. This requires a bridge between the two environments: R must be able to handle data that are stored in the Hadoop Distributed File System (HDFS). In order to proce...
An example of MapReduce with rmr2
R can be connected with Hadoop through the rmr2 package. The core of this package is the mapreduce() function, which lets you write custom MapReduce algorithms. The aim of this article is to show how it works and to provide an example. As mentioned in the previous article, the R mapreduce() function requires some arguments, but now we will deal w...
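The map, group and reduce idea behind mapreduce() can be tried without a Hadoop cluster. What follows is a minimal sketch in base R, not rmr2 code: the helpers `map_fn`, `reduce_fn` and `local_mapreduce` are hypothetical names introduced here for illustration, and the word-count input is made up.

```r
# Toy input: each element plays the role of one input record.
lines <- c("big data with r", "r and hadoop", "big data")

# Map step (hypothetical helper): emit one (key, value) pair per word.
map_fn <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  data.frame(key = words, value = 1L, stringsAsFactors = FALSE)
}

# Reduce step (hypothetical helper): combine all values sharing a key.
reduce_fn <- function(key, values) {
  data.frame(key = key, value = sum(values), stringsAsFactors = FALSE)
}

# Local stand-in for a MapReduce job: map, group by key, then reduce.
# This only imitates the programming model; nothing is distributed.
local_mapreduce <- function(input, map, reduce) {
  mapped <- do.call(rbind, lapply(input, map))
  grouped <- split(mapped$value, mapped$key)  # plays the role of the shuffle
  reduced <- Map(reduce, names(grouped), grouped)
  do.call(rbind, unname(reduced))
}

word_counts <- local_mapreduce(lines, map_fn, reduce_fn)
print(word_counts)
```

In a real job the map and reduce functions would be passed to rmr2 and run on data stored in HDFS. The sketch keeps everything in memory so that the roles of the two functions are easy to see.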
R framework with Object-Oriented Programming
Data analysis deals with different kinds of data. For instance, supermarket sales data can include:
- a transactional table, with customer ID, item ID, and date of purchase
- an item table, with the item ID and its price
- a customer table, with the customer ID and demographic details (age, gender)
In this example data are tables with different s...
R AND OOP – defining new classes
My previous article shows an example in which data analysis requires a structured framework with R and OOP. This article describes in more detail how to build that framework. Using OOP means creating new data structures and defining their methods, which are functions that perform specific tasks on the object. Defin...
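As an illustration of defining a new data structure together with a method, here is a minimal sketch that assumes R's S4 system. The teaser does not say which OOP system the article uses, and the class `ItemTable` and the generic `meanPrice` are hypothetical names, loosely based on the item table from the supermarket example.

```r
library(methods)

# Hypothetical class: an item table holding item IDs and their prices.
setClass(
  "ItemTable",
  slots = c(item_id = "character", price = "numeric"),
  validity = function(object) {
    # Every item needs exactly one price.
    if (length(object@item_id) != length(object@price)) {
      return("item_id and price must have the same length")
    }
    TRUE
  }
)

# Hypothetical generic plus a method that performs one task on the object.
setGeneric("meanPrice", function(object) standardGeneric("meanPrice"))
setMethod("meanPrice", "ItemTable", function(object) mean(object@price))

# Made-up data, for illustration only.
items <- new("ItemTable", item_id = c("a", "b", "c"), price = c(1.5, 2.5, 5))
meanPrice(items)
```

The validity function is one way to make the structure self-checking: creating an `ItemTable` whose slots have different lengths fails at construction time instead of causing errors later in the analysis.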
Announcing the Publication of R Machine Learning Essentials
R Machine Learning Essentials will be published soon. The target audience is readers who want to get familiar with machine learning quickly. The only requirement is some knowledge of data analysis and/or coding concepts. This book is not just a tutorial. Its goal is not to teach how to build very sophisticated machine learning solutions. It do...