Publications by George Pipis

How to Generate Correlated Data in R

03.05.2021

Sometimes we need to generate correlated data for demonstration purposes, technical assessments, testing, etc. We have provided a walk-through example of how to generate correlated data in Python using the scikit-learn library. In R, as far as I know, there is no library that allows us to generate correlated data. For that reason, we will work w...

2542 sym R (1363 sym/5 pcs) 6 img
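
The excerpt above is cut off before it names the method, so the following is only a minimal sketch of one common way to produce correlated data, not necessarily the article's approach. It is written in Python with the standard library only. The function names `correlated_pairs` and `sample_correlation` are hypothetical, and the construction assumes two standard normal variables with a target correlation `rho`.

```python
import math
import random


def correlated_pairs(n, rho, seed=0):
    """Return n (x, y) pairs of standard normals with population correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)  # noise that is independent of x
        # Mixing x with independent noise gives corr(x, y) = rho and var(y) = 1.
        y = rho * x + math.sqrt(1.0 - rho ** 2) * z
        pairs.append((x, y))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


pairs = correlated_pairs(n=10_000, rho=0.7, seed=42)
print(round(sample_correlation(pairs), 2))
```

With a large sample the printed correlation should land close to the requested `rho`; for more than two variables a Cholesky factor of the target correlation matrix plays the role of the mixing step.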

10 Tips And Tricks For Data Scientists Vol.7

08.05.2021

We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them: Vol.1, Vol.2, Vol.3, Vol.4, Vol.5, Vol.6. Python 1. Differences Between NumPy Arrays and Python Lists: There are some differences between NumPy arrays and Python lists. We will provide some examples of algebraic operators. ‘+’ ...

5884 sym R (5826 sym/37 pcs) 4 img

How to Compare Nested Models in R

09.05.2021

Using R and the anova function, we can easily compare nested models. For linear regression models we apply the F-test, and for logistic regression models we apply the Chi-Square test. By nested, we mean that the independent variables of the simpler model are a subset of those of the more complex model. In ...

2549 sym R (335 sym/5 pcs) 14 img
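
For reference, the two tests mentioned in the excerpt can be written as follows. The notation is an assumption introduced here, not taken from the article: model 0 is the simpler model and model 1 the more complex one, RSS is the residual sum of squares, D is the deviance, p is the number of estimated coefficients, and n is the number of observations.

```latex
% F-test for nested linear regression models
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/(p_1 - p_0)}{\mathrm{RSS}_1/(n - p_1)}
\;\sim\; F_{\,p_1 - p_0,\; n - p_1} \quad \text{under } H_0

% Chi-Square (likelihood-ratio) test for nested logistic regression models
D_0 - D_1 \;\sim\; \chi^2_{\,p_1 - p_0} \quad \text{(asymptotically, under } H_0\text{)}
```

In both cases the null hypothesis is that the extra coefficients of the more complex model are all zero, so a small p-value favours the more complex model.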

Tips And Tricks For Data Scientists Vol.8

29.05.2021

We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them: Vol.1, Vol.2, Vol.3, Vol.4, Vol.5, Vol.6, Vol.7. R 1. How To Remove The Correlated Variables From A Data Frame: When we build predictive models, we usually remove highly correlated variables (multicollinearity). The point is to kee...

3329 sym R (2875 sym/16 pcs) 12 img

10 Tips and Tricks for Data Scientists Vol.9

22.06.2021

We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them: Vol.1, Vol.2, Vol.3, Vol.4, Vol.5, Vol.6, Vol.7, Vol.8. R 1. How To Write File Paths: If we want to write file paths that work on every operating system, such as Linux, macOS, and Windows, we can work with the file.path() function. Let’s sa...

4660 sym R (2740 sym/33 pcs) 10 img
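
The excerpt describes R's file.path(). Since the series also covers Python, here is a small sketch of the same idea there, using the standard `pathlib` module. The path components are hypothetical placeholders, and the two "pure" path classes are used only so that the output does not depend on the machine running the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path components, used only for illustration.
parts = ("data", "raw", "sales.csv")

# Build the path from components instead of hard-coding a separator.
posix_path = PurePosixPath(*parts)
windows_path = PureWindowsPath(*parts)

print(posix_path)    # data/raw/sales.csv
print(windows_path)  # data\raw\sales.csv
```

In everyday code, `pathlib.Path(*parts)` picks the flavour that matches the operating system the script runs on.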

Who is going to Win the Euro 2020

24.06.2021

We have reached the knock-out phase of Euro 2020 (or 2021), where the final 16 teams and their games are shown below: The question is who is going to be the Euro 2020 Winner. Although we cannot predict the Winner, we can estimate the probability of each team winning the Euro. The Methodology: This is a very simple model that is based on UEFA R...

1830 sym R (4447 sym/1 pcs) 8 img 1 tbl

Euro 2020 Predictive Model based on FIFA Ranking System

29.06.2021

In a previous post, we built a Predictive Model based on the FIFA Ranking, assuming that the points follow a normal distribution. If we look closer at FIFA’s Ranking Model, we will see that it is based on the Elo system, where the expected result of the game can be extracted from the following formula: Simulate the Final-16 Phase Bas...

973 sym R (4295 sym/2 pcs) 4 img
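
A knock-out simulation of the kind the excerpt describes could be sketched as below. This is an illustration under stated assumptions, not the article's code: the function names are hypothetical, the four teams and their ratings are placeholders rather than real FIFA points, and the Elo expected result (with FIFA's 600-point scale) is treated directly as the probability of advancing, which ignores how draws and penalties are resolved.

```python
import random


def expected_result(rating_a, rating_b):
    """Elo-style expected result for team A on a 600-point scale."""
    dr = rating_a - rating_b
    return 1.0 / (10 ** (-dr / 600.0) + 1.0)


def simulate_knockout(ratings, bracket, rng):
    """Play one single-elimination tournament and return the winner.

    bracket is a list of team names whose length is a power of two;
    neighbours in the list meet in each round.
    """
    teams = list(bracket)
    while len(teams) > 1:
        next_round = []
        for a, b in zip(teams[0::2], teams[1::2]):
            p_a = expected_result(ratings[a], ratings[b])
            next_round.append(a if rng.random() < p_a else b)
        teams = next_round
    return teams[0]


def title_probabilities(ratings, bracket, n_sims=10_000, seed=0):
    """Estimate each team's chance of winning the bracket by Monte Carlo."""
    rng = random.Random(seed)
    wins = {team: 0 for team in bracket}
    for _ in range(n_sims):
        wins[simulate_knockout(ratings, bracket, rng)] += 1
    return {team: count / n_sims for team, count in wins.items()}


# Hypothetical ratings for four placeholder teams (not real FIFA points).
ratings = {"A": 1750.0, "B": 1650.0, "C": 1600.0, "D": 1500.0}
probs = title_probabilities(ratings, bracket=["A", "B", "C", "D"])
for team, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(team, round(p, 3))
```

The same functions work for a 16-team bracket once the real ratings and the actual draw order are supplied.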

Get the Odds of Euro 2020 Games based on FIFA World Ranking

29.06.2021

We will provide an example of how you can estimate the outcome of a Euro 2020 Game based on the FIFA World Ranking. The current calculation method was applied on 10 June 2018 and is based on the Elo rating system: after each game, points are added to or subtracted from a team’s rating according to the formula: The Expected Result of a Game The ...

2402 sym R (557 sym/8 pcs) 10 img
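
The formulas referred to in the excerpt did not survive in this listing. As a reference, the rating update and expected result of the 2018 FIFA ranking procedure are commonly stated as follows; the symbol names other than dr are assumptions introduced here.

```latex
% Rating update after a game
P = P_{\text{before}} + I \,(W - W_e)

% Expected result of the game
W_e = \frac{1}{10^{-dr/600} + 1}
```

Here P_before is the team's rating before the game, I is the importance coefficient of the match, W is the actual result (1 for a win, 0.5 for a draw, 0 for a loss), and dr is the difference between the two teams' ratings before the game.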

10 Tips and Tricks for Data Scientists Vol.10

04.07.2021

We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them: Vol.1, Vol.2, Vol.3, Vol.4, Vol.5, Vol.6, Vol.7, Vol.8, Vol.9. Python 1. How to Get The Key of the Maximum Value in a Dictionary: d={"a":3,"b":5,"c":2}; max(d, key=d.get) returns b. 2. How to Sort a Dictionary by Values: Assume that we have the ...

4643 sym R (1500 sym/16 pcs) 16 img
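
The first dictionary tip in the excerpt can be run as written. The sketch below repeats it and adds one plausible way to sort a dictionary by values; the excerpt is cut off before showing its own version of the second tip, so that part is an assumption.

```python
d = {"a": 3, "b": 5, "c": 2}

# max iterates over the keys and ranks each key by d.get(key), i.e. by its value.
best_key = max(d, key=d.get)
print(best_key)  # b

# Sort the (key, value) pairs by value, ascending; dicts keep insertion order
# in Python 3.7+, so the rebuilt dict stays sorted.
by_value = dict(sorted(d.items(), key=lambda item: item[1]))
print(by_value)  # {'c': 2, 'a': 3, 'b': 5}
```

Passing `reverse=True` to `sorted` gives descending order instead.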

Euro Semi-Finals: England is the Favorite!

06.07.2021

Using the FIFA World Ranking and the Elo rating system, we will try to estimate the probability of England winning its first Euro in history! The expected result of a game is given by the formula: where dr is the difference between two teams’ ratings before the game. Let’s see the function of the Winning Probability versus the Ranking Dif...

2189 sym R (383 sym/4 pcs) 4 img
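
To see how the expected result changes with the rating difference dr, a few values can be tabulated. This is a minimal sketch that assumes the 600-point scale of the 2018 FIFA ranking procedure; the function name `win_probability` is hypothetical, and the expected result is read loosely as a winning probability, as in the excerpt.

```python
def win_probability(dr):
    """Expected result for a team whose rating exceeds its opponent's by dr."""
    return 1.0 / (10 ** (-dr / 600.0) + 1.0)


# Tabulate the curve at a few rating differences.
for dr in (-300, -150, 0, 150, 300):
    print(f"{dr:>5}  {win_probability(dr):.3f}")
```

The curve is symmetric around dr = 0, where both teams get 0.5, and the values for dr and -dr always add up to 1.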