Publications by BNOSAC - Belgium Network of Open Source Analytical Consultants
R package ETLUtils @ CRAN – easy loading into ffdf
The R package ETLUtils is now available for download from its CRAN repository. It is a package that facilitates ETL in situations where you need to interact with SQL databases in a corporate environment. It currently focuses on the E (Extract) part of ETL. In the library you’ll find a function called read.dbi.ffdf which all...
Get your large SQL data in ff swiftly
The ff package is great when you are working with large data in R. Data in corporate environments are usually not so large that a Hadoop system is needed to handle them, but they are mostly large enough to make R choke on its RAM. The ff package is great for this type of data. It can handle 2.14 billion elements per atomic (so 2.14 bi...
read.odbc.ffdf & read.dbi.ffdf for fetching large corporate SQL data
If you are into large data, but not the enormously big data everyone is talking about, and you are tired of searching for a way to get data with several tens of millions of records into R without RAM issues, the packages ff, ffbase and ETLUtils might be the solution to your problem. Following up on our post about the ETLUt...
R courses in Belgium
Every year, the Leuven Statistics Research Center (Belgium) offers short courses in statistics and statistical tools for professionals and researchers. The following link shows the overview of the courses: http://lstat.kuleuven.be/consulting/shortcourses/ENcourse%20overview.htm or get it here in pdf: http://lstat.kuleuven.be/consulting/...
RBelgium meeting on November, 16
Next week, on Friday, November 16, the RBelgium R user group is holding its next regular meeting in Brussels. This is the schedule: * Graphical User Interface developments around R, including tcltk2 and SciViews – Philippe Grosjean (UMons) * Using R via the Amazon Cloud – Jean-Baptiste Poullet...
bigglm on your big data set in open source R, it just works – similar as in SAS
In a recent post (link & link), Revolution Analytics benchmarked their closed source generalized linear model approach against SAS, Hadoop and open source R, and seemed to suggest that there is no 'easy' open source R solution for building a Poisson regression model on large datasets. This post i...
Massive online data stream mining with R
A few weeks ago, the stream package was released on CRAN. It allows you to do real-time analytics on data streams. This can be very useful if you are working with large datasets which are already hard to fit in RAM completely, let alone to build a statistical model on without running into RAM problems. Most of the standard statistical al...
DataMind & The R Service Bus @ RBelgium
In two weeks, on Friday, May 24, the RBelgium R user group is holding its next regular meeting in Leuven. This is the schedule: ** Jonathan Cornelissen – DataMind Discover DataMind, a new online learning platform for data analysis and R, developed by a group of Belgian R enthusiasts! http://www.datamind.org/ ** Tobias Verbek...
Popularity bigdata / large data packages in R and ffbase useR presentation
A few weeks ago, RStudio released its download logs, showing who downloaded R packages through their CRAN mirror. More info: http://blog.rstudio.org/2013/06/10/rstudio-cran-mirror/ This is very nice information and it can be used to show the popularity of R packages, which has been done before and has also been criticized, as the RStudio logs mi...
Connect R with Myrrix – Mahout & Cloudera’s real-time, scalable recommender system
Myrrix is probably better known among Java developers and users of Mahout than among R users. This is because most of the time Java and R developers live in different communities. If you go to the Myrrix website (http://myrrix.com), you'll find that it is a large-scale recommender system which is able to build a recommendation model based on Al...