Publications by David Smith

Open source software has changed the way we do business

20.05.2015

Earlier this month TechCrunch published an article of mine, “The Business Economics And Opportunity Of Open-Source Data Science”. With this article I wanted to share how open-source software has disrupted the economics of doing business, now that data is a fundamental component of every business's operations. Open source projects like Hadoop...

Revolution R Open 3.2.0 now available for download

22.05.2015

The latest update to Revolution R Open, RRO 3.2.0, is now available for download from MRAN. In addition to new features, this release tracks the version number of the underlying R engine (so this is the release following RRO 8.0.3). Revolution R Open 3.2.0 includes: The latest R engine, R 3.2.0. This includes many improvements, includin...

Open data sets you can use with R

25.05.2015

R is an environment for programming with data, so unless you're doing a simulation study you'll need some data to work with. If you don't have data of your own, we've made a list of open data sets you can use with R to accompany the latest release of Revolution R Open. At the Data Sources on the Web page on MRAN, you can find links to dozens of ...

R tops 2015 KDnuggets Software Poll

27.05.2015

R is the leading choice for Predictive Analytics / Data Mining / Data Science software according to the results of the 2015 KDnuggets Software Poll, now in its 16th year. Each of the 28,000 participants selected one or more tools they had used in the last year from a list of 93 options, and R was selected by 46.9% of participants (up from 38.5% ...

RStudio 0.99 released

29.05.2015

If you download R or Revolution R Open, the R interface is pretty stark: you'll get a command prompt, and not much else. That's fine for quick, interactive calculations, but if you need to do any serious scripting or programming in R, you'll need an integrated development environment (IDE) to be productive. For subscribers to Revolution R Ent...

A comparison of high-performance computing techniques in R

01.06.2015

When it comes to speeding up “embarrassingly parallel” computations (like for loops with many iterations), the R language offers a number of options: an R looping operator, like mapply (which runs in a single thread); a parallelized version of a looping operator, like mcmapply (which can use multiple cores); explicit parallelization, via the ...

Computing with GPUs in R

03.06.2015

On Monday, we compared the performance of several different ways of calculating a distance matrix in R. Now there's another method to add to the list: using GPU acceleration in R. A GPU is a dedicated, high-performance chip available on many computers today. Unlike the CPU, it's not used for general computations, but rather for specialized tasks...

Any R code as a cloud service: R demonstration at BUILD

05.06.2015

At last month's BUILD conference for Microsoft developers in San Francisco, R was front-and-center on the keynote stage. In the keynote, Microsoft CVP Joseph Sirosh introduced the “language of data”: open source R. Sirosh encouraged the audience to learn R, saying “if there is a single language that you choose to learn today .. let it be R...

SparkR: Distributed data frames with Spark and R

12.06.2015

R is now integrated with Apache Spark, the open-source cluster computing framework. The Databricks blog announced this week that yesterday's release of Spark 1.4 would include SparkR, “an R package that allows data scientists to analyze large datasets and interactively run jobs on them from the R shell”. The SparkR 1.4 announcement led with ...

Connect R to Bloomberg with the Rblpapi package

15.06.2015

For anyone who works with financial data and has access to a Bloomberg terminal, there is a new R package to interface with Bloomberg data services: Rblpapi. (If you had searched for an R connection to Bloomberg you wouldn’t have found this one: Bloomberg is happy to have software that connects to its public API, but not to use its name, appa...
