Publications by msuzen
Demystify Dirac delta function for data representation on discrete space
The Dirac delta function is an important tool in Fourier analysis. It is used routinely, especially in electrodynamics and signal processing. A function over a set of data points is often shown with a delta function representation. A novice reader relying on the integral properties of the delta function may find this notation quite confusing. Probably, ...
2464 sym R (148 sym/1 pcs) 2 img 1 tbl
Particle approximation to probability density functions: Dirac delta function representation
In the previous post, I briefly showed the idea of using the Dirac delta function for discrete data representation. In the second example there, the histogram locations for a given set of points are presented as spike trains, whereas the heights are somehow given in a second sum. This is hard to follow and visualise, of course if you are n...
2309 sym Python (365 sym/2 pcs) 2 img 2 tbl
Euclid Algorithm for Set of Integers: ‘Reduce’ vs. trees in R
The Euclid Algorithm provides a solution to the greatest common divisor (GCD) of two natural numbers $x_{1}$ and $x_{2}$, denoted by $GCD(x_{1}, x_{2})$. This will produce the largest integer that divides both $x_{1}$ and $x_{2}$. The solution was proposed by Euclid of Ancient Greece. This can be formulated as a recurrence relation: $$x_{k} = x_{k-1} mo...
2187 sym R (489 sym/4 pcs) 1 tbl
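A minimal sketch of the idea in the title above, written in Python rather than the post's R: the two-number Euclid step is folded over a collection with `functools.reduce`. The function names `gcd_pair` and `gcd_set` are hypothetical and are not taken from the post, and the loop uses the standard remainder form of the algorithm, not necessarily the exact recurrence the post states.

```python
from functools import reduce


def gcd_pair(a, b):
    """GCD of two natural numbers using the remainder form of Euclid's algorithm."""
    while b != 0:
        # Replace (a, b) with (b, a mod b) until the remainder vanishes.
        a, b = b, a % b
    return a


def gcd_set(values):
    """GCD of a non-empty collection of integers, folding gcd_pair left to right."""
    return reduce(gcd_pair, values)


if __name__ == "__main__":
    print(gcd_set([12, 18, 30]))
```

The fold works because the GCD is associative: the GCD of three numbers equals the GCD of the first two combined with the third.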
Scale back or transform back multiple linear regression coefficients: Arbitrary case with ridge regression
Summary: In data science or machine learning applications, different features or predictors commonly manifest themselves on different scales. This can make the resulting coefficients of a linear regression difficult to interpret, such as one feature having very large or small values compared to other predictors and being in different units fir...
2894 sym R (1147 sym/1 pcs)
S-shaped data: Smoothing with quasibinomial distribution
Figure 1: Synthetic data and fitted curves. S-shaped distributed data can be found in many applications. Such data can be approximated with the logistic distribution function [1]. The cumulative distribution function of the logistic distribution is the logistic function, i.e., the inverse of the logit. To demonstrate this, in this short example, after generating a synth...
2059 sym R (2560 sym/2 pcs) 2 img 1 tbl
S-shaped data: Smoothing with quasibinomial distribution
Figure 1: Synthetic data and fitted curves. S-shaped distributed data can be found in many applications. Such data can be approximated with the logistic distribution function [1]. The cumulative distribution function of the logistic distribution is the logistic function, i.e., the inverse of the logit. To demonstrate this, in this short example, after generating a synth...
1952 sym R (2560 sym/2 pcs) 2 img 1 tbl
Economy and dynamic modelling: Haavelmo’s approach
Updated on 25 August 2017. Preamble: Predictions using dynamic modelling. Machine learning and neural networks are not the only way to do data science or AI. There are other techniques to explore, for example, from quantitative economics. Apart from game theory, dynamic modelling could be suitable for many prediction problems, especially the one...
6749 sym
Economy and dynamic modelling: Haavelmo’s approach
Econometrics aims at estimating observables in the economy and their inter-dependencies, and at testing the estimates against economic reality. A quantitative approach to expressing these inter-dependencies appears as simultaneous equations, i.e., a system of linear equations. This is a mathematical structure of economic relationships that were mad...
6016 sym
Understanding the empirical law of large numbers and the gambler’s fallacy
One of the misconceptions in our understanding of statistics, or a counter-intuitive guess or fallacy, appears in the assumption that a law of averages exists. Imagine we toss a fair coin many times. Most people would think that the number of heads and tails would balance out as the number of trials increases, which is wron...
2579 sym R (4169 sym/2 pcs) 4 img 2 tbl
Practical Kullback-Leibler (KL) Divergence: Discrete Case
KL divergence (Kullback-Leibler) or KL distance is a non-symmetric measure of the difference between two probability distributions. It is related to mutual information and can be used to measure the association between two random variables. Figure: Distance between two distributions. (Wikipedia) In this short tutorial, I show how to compute...
3932 sym R (468 sym/2 pcs) 2 img 2 tbl
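A minimal sketch of the discrete case named in the title above, in Python rather than the tutorial's R. It uses the standard definition $D_{KL}(P \| Q) = \sum_i p_i \log(p_i / q_i)$ with the natural logarithm. The function name `kl_divergence` is hypothetical, and the code assumes both inputs are probability vectors of equal length with $q_i > 0$ wherever $p_i > 0$; it is not the tutorial's own implementation.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q) in nats.

    Assumes p and q are equal-length probability vectors and that
    q[i] > 0 wherever p[i] > 0. Terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    # The two directions differ, which illustrates the non-symmetry.
    print(kl_divergence(p, q), kl_divergence(q, p))
```

Swapping the arguments gives a different value, which is why the measure is called non-symmetric and is not a distance in the metric sense.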