Publications by jmount

abs and relu are not Mercer Kernels

25.12.2020

I am sharing some rough notes (in R and Python) here on how, while dot(a, b) fulfills “Mercer’s condition” (by definition! I’ll just informally call these beasts “Mercer Kernels”), the seemingly harmless variations abs(dot(a, b)) and relu(dot(a, b)) are not Mercer Kernels (recall relu(x) = max(0, x) = (abs(x) + x)/2). It turns out they fail...

2063 sym
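
A minimal numerical sketch of the claim in the excerpt above, not taken from the linked notes: a Mercer Kernel must give a positive semidefinite Gram matrix on every finite set of points, so a single point set with a negative eigenvalue is enough to rule a candidate out. The four unit vectors at 0, 45, 90, and 135 degrees, and the helper name `min_gram_eigenvalue`, are illustrative assumptions chosen for this example.

```python
import numpy as np


def min_gram_eigenvalue(points, f):
    """Smallest eigenvalue of the Gram matrix G[i, j] = f(dot(points[i], points[j]))."""
    gram = f(points @ points.T)
    # The Gram matrix is symmetric, so eigvalsh applies and returns real eigenvalues.
    return float(np.linalg.eigvalsh(gram).min())


# Assumed example points: four unit vectors in the plane, 45 degrees apart.
angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
points = np.column_stack([np.cos(angles), np.sin(angles)])

min_eig = {
    "dot": min_gram_eigenvalue(points, lambda x: x),
    "abs": min_gram_eigenvalue(points, np.abs),
    "relu": min_gram_eigenvalue(points, lambda x: np.maximum(0.0, x)),
}

for name, value in min_eig.items():
    # A clearly negative value means this Gram matrix is not positive semidefinite.
    print(f"{name:>4}: smallest Gram eigenvalue = {value:.4f}")
```

On this point set the plain dot product stays positive semidefinite up to rounding, while both abs and relu produce a strictly negative eigenvalue. The vector (1, -1, 1, -1) is a witness for abs: its quadratic form against the abs Gram matrix is 4 - 4*sqrt(2), which is below zero.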

The Nature of Overfitting

04.01.2021

Introduction: I would like to talk about the nature of supervised machine learning and overfitting. One of the cornerstones of our data science intensives is giving the participants the experiences of a data scientist in a safe, controlled environment. We hope by working examples they can quickly get to the point where they have experience that can...

16016 sym R (10432 sym/44 pcs) 10 img 1 tbl

Smoothing isn’t Always Safe

07.01.2021

Introduction: Here is a quick data-scientist / data-analyst question: what is the overall trend or shape in the following noisy data? For our specific example: how do we relate value as a noisy function (or relation) of m? This example arose in producing our tutorial “The Nature of Overfitting”. One would think this would be safe and easy to a...

6370 sym R (3827 sym/25 pcs) 26 img

Variable Utility is not Intrinsic

12.01.2021

There is much ado about variable selection or variable utility valuation in supervised machine learning. In this note we will try to disarm some possibly common fallacies, and to set reasonable expectations about how variable valuation can work. Introduction: In general, variable valuation is estimating the utility that a column of explanatory valu...

9015 sym R (2372 sym/23 pcs) 4 tbl

Code for the “Variable Utility is not Intrinsic” Article

12.01.2021

I’ve now shared the code for my “Variable Utility is not Intrinsic” article here: https://github.com/WinVector/Examples/tree/main/Variable_Utility_is_not_Intrinsic. And I have also ported the entire article to Python. It is actually kind of neat to be able to compare the two and see how close doing data science in R and in Python can be ma...

752 sym

Bilingual Data Science

13.01.2021

I’d like to share a new talk on bilingual data science. It is limited to R and Python, so it is a bit of a “we play all kinds of music, both Country and Western.” It has what I feel is a really neat example of how I used JetBrains IntelliJ PyCharm to quickly translate an R .Rmd file to a Python JupyterLab .ipynb notebook. The translation is ...

869 sym