Publications by Arthur Charpentier
Classification from scratch, penalized Ridge logistic 4/8
Fourth post of our series on classification from scratch, following the previous post which was some sort of detour on kernels. But today, we’ll get back to the logistic model. Formal approach to the problem: we’ve seen before that the classical estimation technique used to estimate the parameters of a parametric model was to use the maximum l...
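To give a flavour of the idea, here is a minimal R sketch of ridge-penalized logistic regression, obtained by directly optimizing the \(\ell_2\)-penalized negative log-likelihood with optim (the simulated data, the variable names and the value of lambda are illustrative assumptions, not taken from the post):

set.seed(1)
n <- 200
X <- cbind(1, matrix(rnorm(n * 2), n, 2))           # design matrix with an intercept
beta_true <- c(-1, 2, -1.5)
y <- rbinom(n, 1, 1 / (1 + exp(-X %*% beta_true)))   # Bernoulli response

neg_loglik_ridge <- function(beta, X, y, lambda) {
  eta <- X %*% beta
  # penalize the slopes only, not the intercept
  -sum(y * eta - log(1 + exp(eta))) + lambda * sum(beta[-1]^2)
}

fit <- optim(rep(0, ncol(X)), neg_loglik_ridge, X = X, y = y,
             lambda = 1, method = "BFGS")
fit$par   # penalized coefficient estimates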
Classification from scratch, penalized Lasso logistic 5/8
Fifth post of our series on classification from scratch, following the previous post on penalization using the \(\ell_2\) norm (the so-called Ridge regression). This time, we will discuss penalization based on the \(\ell_1\) norm (the so-called Lasso regression). First of all, one should admit that if the name stands for least absolute shrinkage and ...
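As a companion sketch, \(\ell_1\)-penalized logistic regression can be fitted with the glmnet package; the simulated data below are an assumption, and alpha = 1 selects the lasso penalty (alpha = 0 would give ridge):

library(glmnet)
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 5), n, 5)
y <- rbinom(n, 1, 1 / (1 + exp(-(X[, 1] - 2 * X[, 2]))))

fit <- glmnet(X, y, family = "binomial", alpha = 1)     # lasso path
cv  <- cv.glmnet(X, y, family = "binomial", alpha = 1)  # cross-validated lambda
coef(fit, s = cv$lambda.min)   # sparse coefficient vector at the selected lambda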
Classification from scratch, neural nets 6/8
Sixth post of our series on classification from scratch. The latest one was on the lasso regression, which was still based on a logistic regression model, assuming that the variable of interest \(Y\) has a Bernoulli distribution. From now on, we will discuss techniques that did not originate from those probabilistic models, even if they might stil...
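A possible illustration, not the post’s own code: a single-hidden-layer network for a binary outcome fitted with the nnet package (the simulated data, the number of hidden units and the weight decay are assumptions):

library(nnet)
set.seed(1)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- factor(rbinom(n, 1, 1 / (1 + exp(-(df$x1 - df$x2)))))

fit <- nnet(y ~ x1 + x2, data = df, size = 3,    # 3 hidden units
            decay = 0.01, maxit = 500, trace = FALSE)
head(predict(fit, newdata = df, type = "raw"))   # predicted probabilities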
Classification from scratch, SVM 7/8
Seventh post of our series on classification from scratch. The latest one was on neural nets, and today we will discuss SVM, support vector machines. A formal introduction: here \(y\) takes values in \(\{-1,+1\}\), and our model will be \(m(\mathbf{x})=\text{sign}[\mathbf{\omega}^T\mathbf{x}+b]\). Thus, the space is divided by a (linear) border \(\...
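For illustration only, a linear SVM fitted with e1071::svm, from which the hyperplane \(\text{sign}[\mathbf{\omega}^T\mathbf{x}+b]\) can be recovered (the simulated data and the cost value are assumptions, and the sign convention depends on the ordering of the factor levels):

library(e1071)
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 2), n, 2)
y <- factor(ifelse(X[, 1] + X[, 2] > 0, +1, -1))

fit <- svm(X, y, kernel = "linear", cost = 1, scale = FALSE)
w <- t(fit$coefs) %*% fit$SV     # omega, recovered from the support vectors
b <- -fit$rho                    # intercept
table(pred = sign(X %*% t(w) + b), truth = y)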
Classification from scratch, bagging and forests 10/8
Tenth post of our series on classification from scratch. Today, we’ll see the heuristics of the algorithm behind bagging techniques. Often, bagging is associated with trees, to generate forests, but it is actually possible to use bagging with any kind of model. Recall that bagging means “bootstrap aggregation”. So, consider a model \(m:\mathc...
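Since bagging can indeed be used with any model, here is a small sketch where the base learner is a plain logistic regression, refitted on B bootstrap samples and averaged (the data, the value of B and the choice of learner are illustrative assumptions):

set.seed(1)
n <- 200
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- rbinom(n, 1, 1 / (1 + exp(-(df$x1 - df$x2))))

B <- 100
pred <- replicate(B, {
  idx <- sample(n, replace = TRUE)                 # bootstrap sample
  m   <- glm(y ~ x1 + x2, data = df[idx, ], family = binomial)
  predict(m, newdata = df, type = "response")      # predict on the full data
})
bagged <- rowMeans(pred)                           # aggregated prediction
head(bagged)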
Classification from scratch, boosting 11/8
Eleventh post of our series on classification from scratch. Today, that should be the last one… unless I forgot something important. So today, we discuss boosting. An econometrician’s perspective: I might start with a non-conventional introduction. But that’s actually how I understood what boosting was about. And I am quite sure it has to do wit...
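A minimal sketch of the boosting idea, sequentially fitting a weak learner to the current residuals and adding a shrunken copy of it to the fit (this regression toy example, the rpart stump and the shrinkage value nu are assumptions, not the post’s code):

library(rpart)
set.seed(1)
n <- 200
df <- data.frame(x = runif(n))
df$y <- sin(2 * pi * df$x) + rnorm(n, sd = 0.3)

nu  <- 0.1                        # shrinkage
fit <- rep(mean(df$y), n)         # start from the constant model
for (b in 1:100) {
  df$r <- df$y - fit                               # current residuals
  m <- rpart(r ~ x, data = df, maxdepth = 1)       # weak learner (a stump)
  fit <- fit + nu * predict(m, newdata = df)       # update the ensemble
}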
Discrete or continuous modeling ?
Tuesday, we held our conference “Insurance, Actuarial Science, Data & Models”, and Dylan Possamaï gave a very interesting concluding talk. In the introduction, he briefly came back to a nice discussion we usually have in economics about the kind of model we should consider. It was about optimal control. In many applications, we start with a one p...
Quantile Regression (home made)
After my series of posts on classification algorithms, it’s time to get back to R code, this time for quantile regression. Yes, I still want to get a better understanding of optimization routines in R. Before looking at quantile regression, let us compute the median, or more generally a quantile, from a sample. Median: consider a sample \(\{y_1,\cdots,y...
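As a quick sketch of that first step, the median, and more generally a quantile, can be obtained numerically as the minimizer of a loss function, here with optimize on a simulated sample (the sample and the choice of tau are assumptions):

set.seed(1)
y <- rnorm(100)

# median: minimize the sum of absolute deviations
med <- optimize(function(m) sum(abs(y - m)), interval = range(y))$minimum

# tau-quantile: minimize the "check" (pinball) loss
tau <- 0.25
rho <- function(u, tau) u * (tau - (u < 0))
q25 <- optimize(function(m) sum(rho(y - m, tau)), interval = range(y))$minimum

c(med, median(y), q25, quantile(y, tau))   # compare with the built-in functions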
Linear Regression, with Map-Reduce
Sometimes, with big data, matrices are too big to handle, but it is possible to use tricks to still carry out the computation numerically. Map-Reduce is one of those. With several cores, it is possible to split the problem, to map it on each machine, and then to aggregate everything back at the end. Consider the case of linear regression, \(\mathbf{y}=\mathbf{X}\mathbf...
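The idea can be sketched in a few lines of R: each block of rows (each “machine”) only returns its own \(\mathbf{X}^T\mathbf{X}\) and \(\mathbf{X}^T\mathbf{y}\), and the pieces are summed before solving the normal equations (the simulated data and the number of blocks are assumptions):

set.seed(1)
n <- 1000; k <- 3
X <- cbind(1, matrix(rnorm(n * (k - 1)), n, k - 1))
y <- X %*% c(1, 2, -1) + rnorm(n)

blocks <- split(seq_len(n), rep(1:4, length.out = n))   # 4 "machines"

# map: each block returns its own sufficient statistics
pieces <- lapply(blocks, function(idx)
  list(XtX = crossprod(X[idx, , drop = FALSE]),
       Xty = crossprod(X[idx, , drop = FALSE], y[idx])))

# reduce: sum the pieces and solve the normal equations
XtX <- Reduce(`+`, lapply(pieces, `[[`, "XtX"))
Xty <- Reduce(`+`, lapply(pieces, `[[`, "Xty"))
solve(XtX, Xty)   # matches lm.fit(X, y)$coefficients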
Parallelizing Linear Regression or Using Multiple Sources
My previous post explained how, mathematically, it is possible to parallelize the computation needed to estimate the parameters of a linear regression. More specifically, we have a matrix \(\mathbf{X}\), which is an \(n\times k\) matrix, and \(\mathbf{y}\), an \(n\)-dimensional vector, and we want to compute \(\widehat{\mathbf{\beta}}=[\mathbf{X}^T\mathbf{X}]^{...
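A hedged sketch of that computation actually run in parallel, using the base parallel package (two forked workers and simulated data are assumptions; mclapply forks, so on Windows one would use parLapply instead):

library(parallel)
set.seed(1)
n <- 1000; k <- 3
X <- cbind(1, matrix(rnorm(n * (k - 1)), n, k - 1))
y <- X %*% c(1, 2, -1) + rnorm(n)

blocks <- split(seq_len(n), rep(1:2, length.out = n))   # two sources / workers
pieces <- mclapply(blocks, function(idx)
  list(XtX = crossprod(X[idx, , drop = FALSE]),
       Xty = crossprod(X[idx, , drop = FALSE], y[idx])),
  mc.cores = 2)

solve(Reduce(`+`, lapply(pieces, `[[`, "XtX")),
      Reduce(`+`, lapply(pieces, `[[`, "Xty")))   # the OLS estimator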