Publications by George Pipis
Predict Basketball Games with Log5 formula
We have provided an example of how to get started with predictive models for NBA Games. In this post, we will show how you can get a rough estimate of the final outcome of the game by using the Log5 formula and the Beta Distribution. For this example, we will consider the game, Dallas Mavericks (away) vs Portland Trail Blazers (home). Prediction...
1966 sym R (310 sym/2 pcs) 2 img
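For readers unfamiliar with it, the Log5 formula mentioned in the excerpt above can be sketched as follows. This is a minimal illustration and not the post's own code: the function name `log5` and the winning percentages are hypothetical, and the Beta Distribution step from the post is not covered here.

```python
def log5(p_a: float, p_b: float) -> float:
    """Log5 estimate of the probability that team A beats team B.

    p_a and p_b are the teams' overall winning percentages (between 0 and 1).
    """
    denominator = p_a + p_b - 2 * p_a * p_b
    if denominator == 0:
        # Happens only when both inputs are 0 or both are 1.
        raise ValueError("Log5 is undefined for these winning percentages")
    return (p_a - p_a * p_b) / denominator


# Hypothetical winning percentages, chosen only for illustration.
team_a_win_pct = 0.60
team_b_win_pct = 0.50
prob_a_wins = log5(team_a_win_pct, team_b_win_pct)
prob_b_wins = log5(team_b_win_pct, team_a_win_pct)
```

Against an exactly average opponent (0.50), the estimate reduces to the team's own winning percentage, and the two probabilities always sum to 1.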
Detect the Changes in Timeseries Data
In this post, we will provide an example of how you can detect changes in the distribution across time. For example, let’s say that we monitor the heart rate of a person with the following states: Sleep: Normal(60, 5); Awake: Normal(75, 8); Exercise: Normal(135, 12). Let’s generate this data: set.seed(5) sleep<-rnorm(100, 60, 5) awake<-rnorm(200...
1408 sym R (1255 sym/4 pcs) 6 img
10 Tips and Tricks for Data Scientists Vol.1
Introduction As data scientists, we love to do our job efficiently without reinventing the wheel. Tips-and-tricks articles provide snippets of code for common tasks in the data science world. In this article, we’ll cover mainly Python and R, as well as other tips in Unix, Excel, Git, Docker, Google Spreadsheets, etc. Here, we will gather 10 tip...
2879 sym R (2731 sym/14 pcs)
10 Tips and Tricks for Data Scientists Vol.2
We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed Vol. 1, you can have a look here. Python 1. How to create files from Jupyter While working with the Jupyter Notebook, sometimes you need to create a file (e.g. a .py file). Let’s see how we can do it via Jupyter Notebook. T...
4308 sym R (3494 sym/13 pcs) 14 img
10 Tips and Tricks for Data Scientists Vol.3
I have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them, see vol 1 and vol 2. Python 1. How to work with JSON cells in pandas Assume that you are dealing with a pandas DataFrame where one of your columns is in JSON format and you want to extract specific information. For this example, we...
3835 sym R (4610 sym/16 pcs) 10 img
10 Tips and Tricks for Data Scientists Vol.4
We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them, see vol 1, vol 2 and vol 3. Python 1. How To Get Data From Google Drive Into Colab Google Colab is becoming more and more popular in the data science community. Working with Colab Jupyter notebooks, you are able to mount your Goog...
4259 sym R (4616 sym/23 pcs) 24 img
10 Tips And Tricks For Data Scientists Vol.5
We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you missed them, see vol 1, vol 2, vol 3 and vol 4. Python 1. How To COALESCE In Pandas This function returns the first non-null value between 2 columns. import pandas as pd import numpy as np df=pd.DataFrame({"A":[1,2,np.nan,4,np.nan],"B":['A',...
5518 sym R (2345 sym/23 pcs) 24 img
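A COALESCE-style operation in pandas can be sketched as below. This is one possible approach, using `Series.combine_first`, and is not necessarily the one the post uses; the DataFrame and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two columns with gaps in different rows.
df = pd.DataFrame(
    {
        "first": [1.0, np.nan, 3.0, np.nan],
        "second": [10.0, 20.0, np.nan, np.nan],
    }
)

# Take the value from "first" where present, otherwise fall back to "second".
# Rows where both columns are missing stay missing.
df["coalesced"] = df["first"].combine_first(df["second"])
```

`df["first"].fillna(df["second"])` gives the same result for two columns with a shared index.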
10 Tips And Tricks For Data Scientists Vol.6
We have started a series of articles on tips and tricks for data scientists (mainly in Python and R). In case you have missed: Vol.1, Vol.2, Vol.3, Vol.4, Vol.5. Python 1. How To Get The Mode From A List In Python Assume that we have the following list: mylist = [1,1,1,2,2,3,3] and we want to get the mode, i.e. the most frequent element. We can use the fo...
6069 sym R (2492 sym/21 pcs) 24 img
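The excerpt above is cut off before its solution, so the following is only one common way to get the mode of a list, using the standard library's `collections.Counter`. It is an assumption that this differs from or matches the post's approach; only the list itself comes from the excerpt.

```python
from collections import Counter

mylist = [1, 1, 1, 2, 2, 3, 3]

# most_common(1) returns a list with the single (element, count) pair
# that has the highest count.
mode_value, mode_count = Counter(mylist).most_common(1)[0]
```

Here `mode_value` is 1 and `mode_count` is 3, since 1 appears three times in the list.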
How to Check if a File or a Directory exists in R, Python and Bash
When we build Data Workflows and Machine Learning Pipelines, it is common to check for the existence of specific files and directories (folders). We will provide some hands-on examples of how to check for files or directories in R, Python and Bash. Check for the existence of a File or a Directory in R For this example, we have created a file calle...
4020 sym Python (2027 sym/23 pcs) 4 img
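For the Python case, a minimal sketch with the standard library's `pathlib` could look like this. The file and folder names are hypothetical, and a temporary directory is used so the example is self-contained; it is not the post's own code.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Create one file so that there is something to find.
    existing_file = folder / "example.txt"
    existing_file.write_text("hello")
    missing_file = folder / "missing.txt"

    file_found = existing_file.is_file()    # True: a regular file exists here
    dir_found = folder.is_dir()             # True: the path is a directory
    missing_found = missing_file.exists()   # False: nothing at this path
    file_is_dir = existing_file.is_dir()    # False: it is a file, not a directory
```

`Path.exists()` answers whether anything is at the path, while `is_file()` and `is_dir()` also check what kind of entry it is.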
How to Get the Power of Test in Hypothesis Testing with Binomial Distribution
In this tutorial we will show how you can get the Power of Test when you apply Hypothesis Testing with Binomial Distribution. Before we provide the example, let’s recall what the Type I and Type II errors are. Type I error This is the probability of rejecting the null hypothesis, given that the null hypothesis is true. This is the level of signific...
4915 sym R (1054 sym/14 pcs) 2 img
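The idea of the power of a test with a Binomial Distribution can be sketched as below. This is an illustration under stated assumptions, not the post's example: it assumes a one-sided test of H0: p = p0 against H1: p > p0, and the sample size, probabilities, significance level and function names are all hypothetical.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n: int, p0: float, alpha: float) -> int:
    """Smallest k such that P(X >= k | p0) <= alpha (reject H0 when X >= k)."""
    for k in range(n + 2):
        if binom_upper_tail(k, n, p0) <= alpha:
            return k
    return n + 1  # Not reached: the tail beyond n is empty, so its probability is 0.


# Hypothetical setup, chosen only for illustration.
n = 100        # number of trials
p0 = 0.5       # success probability under the null hypothesis
p1 = 0.6       # success probability under the alternative
alpha = 0.05   # significance level (upper bound on the Type I error)

c = critical_value(n, p0, alpha)
type_1_error = binom_upper_tail(c, n, p0)  # actual Type I error, at most alpha
power = binom_upper_tail(c, n, p1)         # P(reject H0 | p = p1)
type_2_error = 1 - power                   # P(fail to reject H0 | p = p1)
```

Because the Binomial Distribution is discrete, the actual Type I error is usually a little below the nominal level rather than exactly equal to it.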