Publications by tomaztsql
Some of the more useful Tidyverse functions
R functions for every data engineer using Tidyverse. Tidyverse has long been an amazing collection of R packages, primarily for data engineering and data science. Common among these packages is the same language grammar, great design and structure, making data science easier. Motivation: Data engineering is an important step that helps improve data usab...
5686 sym R (3485 sym/6 pcs) 12 img
Data science with Microsoft Fabric – Plotting ROC curve and distribution of scores
The ROC (Receiver Operating Characteristic) curve is a graph that shows how a classifier performs by plotting the true positive rate against the false positive rate. It is used to evaluate the performance of binary classification models by illustrating the trade-off between the true positive rate (TPR) and the false positive rate (FPR) at various threshold settings. ...
4548 sym Python (2448 sym/5 pcs) 20 img
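The threshold trade-off described in the excerpt above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the post: the function names `roc_points` and `auc`, and the four-sample toy data, are assumptions made for this example.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, highest threshold first.

    A sample is predicted positive when its score is >= the threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: two negatives and two positives.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)
print(auc(points))
```

Lowering the threshold moves along the curve from (0, 0) towards (1, 1); each step either gains a true positive or admits a false positive.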
Schedule generator with R
This sample R code implements one possible solution for generating a timetable schedule. The proposed solution is based on methods of evolutionary computing; it uses a (1+1) evolutionary strategy and simulated annealing. The success of the solution is estimated by the fulfillment of given constraints and criteria. Results of testing the algo...
3374 sym R (1852 sym/1 pcs) 2 img
Little useless-useful R functions – Dragon curve
Let’s play with some dragons. Dragons from Jurassic Park or the board game Dungeons & Dragons. The algorithm draws a fractal curve of Hausdorff dimension 2. One starts with one segment. In each iteration the number of segments is doubled by taking each segment as the diagonal of a square and replacing it by half the square (90 degrees). Alter...
1824 sym R (1324 sym/3 pcs) 2 img
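The construction described in the excerpt above (each segment becomes two sides of a square built on it as the diagonal) can be sketched as follows. The post itself is written in R; this Python version is only an illustration. The function name `dragon_curve` is hypothetical, and the alternation of the fold side from segment to segment is the standard construction, assumed here because the excerpt is cut off at that point.

```python
def dragon_curve(iterations):
    """Return the curve's vertices as complex numbers after the given number of iterations."""
    points = [0 + 0j, 1 + 0j]  # start with a single segment
    for _ in range(iterations):
        new_points = [points[0]]
        for i, (a, b) in enumerate(zip(points, points[1:])):
            # Assumed rule: alternate the side on which the corner is placed.
            sign = 1 if i % 2 == 0 else -1
            # Corner of the square whose diagonal is a-b:
            # the midpoint, shifted by half the diagonal rotated 90 degrees.
            corner = (a + b) / 2 + sign * (b - a) * 0.5j
            new_points.extend([corner, b])
        points = new_points
    return points


pts = dragon_curve(2)
print(pts)
print(len(dragon_curve(10)) - 1)  # number of segments after 10 iterations
```

Each iteration doubles the number of segments and shortens every segment by a factor of √2. Plotting the real parts against the imaginary parts of the returned points draws the curve.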
Simple custom colour palettes with R ggplot graphs
A simple yet effective way to set your colour palette in R using the ggplot2 library. library(ggplot2) set.seed(2908) my_palette <- c("red", "limegreen", "#3357FF", "goldenrod1", "#33FFFF", "brown") data <- data.frame( x = 1:25, y = rnorm(25), group = rep(c("A", "B", "C", "D", "E"), each = 5) ) After that, we can start “chaining” ggplot g...
743 sym R (990 sym/5 pcs) 8 img
Calculating data for visualization on stacked 100% bar
Calculating cumulative percentage or percentage per group for each time can sometimes be a task with a slight twist. Let’s check this with ggplot2 and tidyverse. library(ggplot2) library(tidyverse) data <- data.frame( sector = rep(1:20, each = 5), item = rep(1:5, times = 20), value = rpois(100, 10) ) Three (out of many more) ...
855 sym R (1136 sym/4 pcs) 4 img
Little useless-useful R functions – Reverse Hello World
You know the feeling of sitting in front of your favourite UI after a long vacation, having forgotten how to write even the simplest “hello world” or “foo bar” function? Well, we got you covered! The reverse Hello world function is for all the people returning to the office after a rather long vacation. Create the function: # reverse Hello Wor...
1055 sym R (144 sym/1 pcs) 4 img
Advent of 2023, Day 11 – Starting data science with Microsoft Fabric
In this Microsoft Fabric series: Dec 01: What is Microsoft Fabric? Dec 02: Getting started with Microsoft Fabric Dec 03: What is lakehouse in Fabric? Dec 04: Delta lake and delta tables in Microsoft Fabric Dec 05: Getting data into lakehouse Dec 06: SQL Analytics endpoint Dec 07: SQL commands in SQL Analytics endpoint Dec 08: Using Lake...
2799 sym Python (1812 sym/10 pcs) 10 img
Advent of 2023, Day 10 – Creating Job Spark definition
In this Microsoft Fabric series: Dec 01: What is Microsoft Fabric? Dec 02: Getting started with Microsoft Fabric Dec 03: What is lakehouse in Fabric? Dec 04: Delta lake and delta tables in Microsoft Fabric Dec 05: Getting data into lakehouse Dec 06: SQL Analytics endpoint Dec 07: SQL commands in SQL Analytics endpoint Dec 08: Using Lakeh...
2425 sym R (410 sym/1 pcs) 12 img
Advent of 2023, Day 4 – Delta lake and delta tables in Microsoft Fabric
In this Microsoft Fabric series: Dec 01: What is Microsoft Fabric? Dec 02: Getting started with Microsoft Fabric Dec 03: What is lakehouse in Fabric? Yesterday we looked into lakehouse and learned that Delta tables are the storage format. So, let’s explore how we can understand and work with delta tables. But first w...
2924 sym R (717 sym/4 pcs) 14 img