Publications by msuzen

Demystify Dirac delta function for data representation on discrete space

20.11.2013

The Dirac delta function is an important tool in Fourier analysis. It is used routinely, especially in electrodynamics and signal processing. A function over a set of data points is often shown with a delta function representation. A novice reader relying on the integral properties of the delta function may find this notation quite confusing. Probably, ...

2464 sym R (148 sym/1 pcs) 2 img 1 tbl

Particle approximation to probability density functions: Dirac delta function representation

17.01.2014

In the previous post, I briefly showed the idea of using the Dirac delta function for discrete data representation. In the second example there, the histogram locations for a given set of points are presented as spike trains, whereas the heights are somehow given in a second sum. This is hard to follow and visualise, of course if you are n...

2309 sym Python (365 sym/2 pcs) 2 img 2 tbl

Euclid Algorithm for Set of Integers: ‘Reduce’ vs. trees in R

07.05.2014

The Euclid algorithm finds the greatest common divisor (GCD) of two natural numbers $x_{1}$ and $x_{2}$, denoted by $GCD(x_{1}, x_{2})$. This is the largest integer that divides both $x_{1}$ and $x_{2}$. The solution was proposed by Euclid of ancient Greece. It can be formulated as a recurrence relation: $$x_{k} = x_{k-1} mo...

2187 sym R (489 sym/4 pcs) 1 tbl

Scale back or transform back multiple linear regression coefficients: Arbitrary case with ridge regression

10.04.2015

Summary: In the common case in data science or machine learning applications, different features or predictors manifest themselves on different scales. This could make it difficult to interpret the resulting coefficients of linear regression, such as one feature having very large or small values compared to other predictors and being in different units fir...

2894 sym R (1147 sym/1 pcs)

S-shaped data: Smoothing with quasibinomial distribution

16.01.2016

Figure 1: Synthetic data and fitted curves. S-shaped data can be found in many applications. Such data can be approximated with the logistic distribution function [1]. The cumulative distribution function of the logistic distribution is the logistic function, i.e., the inverse of the logit. To demonstrate this, in this short example, after generating a synth...

2059 sym R (2560 sym/2 pcs) 2 img 1 tbl

S-shaped data: Smoothing with quasibinomial distribution

16.01.2016

Figure 1: Synthetic data and fitted curves. S-shaped data can be found in many applications. Such data can be approximated with the logistic distribution function [1]. The cumulative distribution function of the logistic distribution is the logistic function, i.e., the inverse of the logit. To demonstrate this, in this short example, after generating a synth...

1952 sym R (2560 sym/2 pcs) 2 img 1 tbl

Economy and dynamic modelling: Haavelmo’s approach

25.07.2016

Updated on 25 August 2017. Preamble: Predictions using dynamic modelling. Machine learning and neural networks are not the only way to do data science or AI. There are other techniques to explore, for example, from quantitative economics. Apart from game theory, dynamic modelling could be suitable for many prediction problems, especially the one...

6749 sym

Economy and dynamic modelling: Haavelmo’s approach

25.07.2016

Econometrics aims at estimating observables in the economy and their inter-dependencies, and at testing the estimates against economic reality. A quantitative approach to expressing these inter-dependencies appears as simultaneous equations, i.e., a system of linear equations. This is a mathematical structure of economic relationships that were mad...

6016 sym

Understanding the empirical law of large numbers and the gambler’s fallacy

01.08.2016

One of the misconceptions in our understanding of statistics, a counter-intuitive guess or fallacy, appears in the assumption that a law of averages exists. Imagine we toss a fair coin many times: most people would think that the number of heads and tails would balance out as the number of trials increases, which is wron...

2579 sym R (4169 sym/2 pcs) 4 img 2 tbl

Practical Kullback-Leibler (KL) Divergence: Discrete Case

07.01.2017

KL divergence (Kullback-Leibler57) or KL distance is a non-symmetric measure of the difference between two probability distributions. It is related to mutual information and can be used to measure the association between two random variables. Figure: Distance between two distributions. (Wikipedia) In this short tutorial, I show how to compute...

3932 sym R (468 sym/2 pcs) 2 img 2 tbl