Publications by BNOSAC - Belgium Network of Open Source Analytical Consultants

R package ETLUtils @ CRAN – easy loading into ffdf

09.04.2012

The R package ETLUtils is now available for download from its CRAN repository. It is a package that facilitates ETL in situations where you need to interact with SQL databases in a corporate environment. It currently focuses on the E (Extract) part of ETL. In the library you’ll find a function called read.dbi.ffdf which all...

1190 sym

Get your large SQL data in ff swiftly

17.04.2012

The ff package is great when you are working with large data in R. Data in corporate environments are usually not so large that a Hadoop system is needed to handle them, but they are mostly large enough to make R choke on its RAM. The ff package is great for this type of data. It can handle 2.14 billion elements per atomic (so 2.14 bi...

1440 sym 2 img

read.odbc.ffdf & read.dbi.ffdf for fetching large corporate SQL data

22.05.2012

If you are into large data, but not the enormously big data everyone is talking about, and you are tired of searching for a way to get data with several tens of millions of records into R without running into RAM issues, the packages ff, ffbase and ETLUtils might be the solution to your problem. Following up on our post about the ETLUt...

2388 sym

R courses in Belgium

26.09.2012

Every year, the Leuven Statistics Research Center (Belgium) offers short courses for professionals and researchers in statistics and statistical tools. The following link shows an overview of the courses: http://lstat.kuleuven.be/consulting/shortcourses/ENcourse%20overview.htm or get it here as a PDF: http://lstat.kuleuven.be/consulting/...

2105 sym 2 img

RBelgium meeting on November 16

07.11.2012

Next week on Friday, November 16, the RBelgium R user group is holding its next Regular meeting in Brussels. This is the schedule of the upcoming RBelgium Regular meeting: * Graphical User Interface developments around R, including tcltk2 and SciViews – Philippe Grosjean (UMons) * Using R via the Amazon Cloud – Jean-Baptiste Poullet...

989 sym 2 img

bigglm on your big data set in open source R, it just works – similar to SAS

29.11.2012

In a recent post (link & link), Revolution Analytics benchmarked their closed source generalized linear model approach against SAS, Hadoop and open source R, and seemed to point out that no 'easy' open source R solution exists for building a Poisson regression model on large datasets. This post i...

2430 sym R (5953 sym/6 pcs) 4 img

Massive online data stream mining with R

25.03.2013

A few weeks ago, the stream package was released on CRAN. It allows you to do real-time analytics on data streams. This can be very useful if you are working with large datasets which are already hard to fit in RAM completely, let alone to build a statistical model on them without running into RAM problems. Most of the standard statistical al...

2656 sym R (665 sym/3 pcs) 4 img

DataMind & The R Service Bus @ RBelgium

07.05.2013

In 2 weeks, on Friday, May 24, the RBelgium R user group is holding its next Regular meeting in Leuven. This is the schedule: ** Jonathan Cornelissen – DataMind  Discover DataMind, a new online learning platform for data analysis and R, developed by a group of Belgian R enthusiasts!  http://www.datamind.org/  ** Tobias Verbek...

1496 sym 2 img

Popularity bigdata / large data packages in R and ffbase useR presentation

12.07.2013

A few weeks ago, RStudio released its download logs, showing who downloaded R packages through their CRAN mirror. More info: http://blog.rstudio.org/2013/06/10/rstudio-cran-mirror/ This is very nice information and it can be used to show the popularity of R packages, which has been done before and also criticized, as the RStudio logs mi...

2477 sym R (2663 sym/1 pcs) 4 img

Connect R with Myrrix – Mahout & Cloudera’s real-time, scalable recommender system

18.07.2013

Myrrix is probably better known among Java developers and Mahout users than among R users. This is because, most of the time, Java and R developers live in different communities. If you go to the website of Myrrix (http://myrrix.com), you'll find out that it is a large-scale recommender system which is able to build a recommendation model based on Al...

3554 sym R (561 sym/1 pcs) 4 img