Publications by tomaztsql

Sharing thoughts on satRdays R Conference, Budapest 2016 #satRdays

06.09.2016

The first satRdays event, held in Budapest on September 03, 2016, is completed. This one-day, community-driven regional event, with very affordable prices and good for networking and getting the latest from the R community, is over. And it was a blast! Great time, nice atmosphere, lots of interesting people, and where there is good energy, there is a will to learn ...

2961 sym R (492 sym/1 pcs) 34 img

Size of XDF files using RevoScaleR package

22.09.2016

It came to my attention that the size of an XDF (external data frame) file can change drastically based on the compute context and environment. While testing the output of a dataset I was working on in SQL Server Management Studio, I was simultaneously testing R code in RTVS or RStudio, and I noticed file growth. The following stored procedure will d...

2051 sym R (2926 sym/5 pcs) 10 img

FileTable and storing graphs from Microsoft R Server

25.09.2016

FileTable has been around for quite some time now, and it is useful for storing files, documents, pictures and binary files in a designated SQL Server table, the FileTable. The best part of FileTable is the fact that one can access it from Windows or other applications as if the files were stored on the file system (because they are) and not making any ot...

2807 sym R (4076 sym/6 pcs) 16 img

Comparing performance on dplyr package, RevoScaleR package and T-SQL on simple data manipulation tasks

10.10.2016

For a long time I have wanted to test simple data manipulation tasks and compare the execution time, ease of writing the code and simplicity between T-SQL and R packages for data manipulation. A couple of packages I will mention for data manipulation are plyr, dplyr and data.table, and I will compare their execution time, simplicity and ease of writing with general T-SQL ...

4516 sym R (15136 sym/7 pcs) 14 img

Performance comparison between kmeans and RevoScaleR rxKmeans

12.10.2016

In my previous blog post, I focused on data manipulation tasks with the RevoScaleR package in comparison to other data manipulation packages, and the conclusions at the end were obvious: RevoScaleR cannot (without the help of dplyrXdf) do piping (or chaining), storing temporary results takes time and, on top of that, data manipulation can be done e...

3066 sym R (1786 sym/8 pcs) 10 img 1 tbl

Association Rules on WideWorldImporters and SQL Server R Services

14.10.2016

Association rules are very handy for analyzing retail data, and the WWI database has a really neat set of invoices that can be used to make a primer. Starting with the following T-SQL query: USE WideWorldImportersDW; GO ;WITH PRODUCT AS ( SELECT [Stock Item Key] ,[WWI Stock Item ID] ,[Stock Item] ,LEFT([Stock Item], 8) AS L8DESC ,ROW_NUMBER(...

3306 sym R (5887 sym/7 pcs) 8 img

Detecting outliers and fraud with R and SQL Server on my bank account data – Part 1

31.10.2016

Detecting outliers and fraudulent behaviour (transactions, purchases, events, actions, triggers, etc.) takes a large amount of experience and statistical/mathematical background. One of the samples Microsoft provided with the release of the new SQL Server 2016 used the simple logic of Benford’s law. This law works great with naturally occurring numb...

2836 sym R (1137 sym/3 pcs) 8 img

R graphs and tables in Power BI Desktop

18.12.2016

Power BI Desktop enables users to use the R script visual for adding custom visualizations generated with the R language, regardless of the R package used. Before using the R script visual, you will need to enable it by setting the path to the R engine on your client in the global options. Once this is done, you will be able to enhance your Power BI reports using R vis...

3774 sym R (467 sym/3 pcs) 20 img

Using R sp_execute_external_script with JSON

08.01.2017

JSON became part of SQL Server in the same version as R. Both were highly anticipated by the community. SQL Server has very powerful statements for converting data to and from JSON when storing into or reading from the SQL Server engine (FOR JSON, JSON_VALUE, etc.). And since it is gaining popularity for data exchange, I was curious to give...

1928 sym R (1359 sym/6 pcs) 8 img

Clustering executed SQL Server queries using R as tool for

08.01.2017

When query execution performance analysis is to be done, there are many ways to find which queries might cause unwanted load or stalls on the server. To encourage the DBA community to start taking advantage of the R language and the world of data science, I have created a demo to show how statistics on numerous queries can be stored for l...

2655 sym R (6189 sym/5 pcs) 10 img