Publications by T. Moudiki
Chat with your tabular data in www.techtonique.net
You can now obtain insights from your tabular data by chatting with it in techtonique.net. No plotting yet (coming soon), but you can already ask questions like: What is the average of column A? Show me the first 5 rows of data. Show me 5 random rows of data. What is the sum of column B? What is the average of column A grouped by column B? … As a reminder...
3854 sym 1 img
Chat with your tabular data in www.techtonique.net
You can now obtain insights from your tabular data by chatting with it in techtonique.net. No plotting yet (coming soon), but you can already ask questions like: What is the average of column A? Show me the first 5 rows of data. Show me 5 random rows of data. What is the sum of column B? What is the average of column A grouped by column B? … As a...
4119 sym 2 img
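For readers who want to see what these example questions compute, here is a minimal sketch in plain Python (standard library only). The toy rows and the helper names (`column_mean`, `column_sum`, `head`, `sample_rows`, `grouped_mean`) are hypothetical and chosen for illustration; nothing here reflects how techtonique.net answers the questions internally.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical toy table: column A is numeric, column B holds integer group codes.
rows = [
    {"A": 1.0, "B": 1},
    {"A": 3.0, "B": 2},
    {"A": 5.0, "B": 1},
    {"A": 7.0, "B": 2},
    {"A": 9.0, "B": 1},
    {"A": 11.0, "B": 2},
]

def column_mean(rows, col):
    """What is the average of column `col`?"""
    return mean(r[col] for r in rows)

def column_sum(rows, col):
    """What is the sum of column `col`?"""
    return sum(r[col] for r in rows)

def head(rows, n=5):
    """Show me the first n rows of data."""
    return rows[:n]

def sample_rows(rows, n=5, seed=None):
    """Show me n random rows of data (without replacement)."""
    return random.Random(seed).sample(rows, min(n, len(rows)))

def grouped_mean(rows, value_col, group_col):
    """What is the average of `value_col` grouped by `group_col`?"""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_col]].append(r[value_col])
    return {key: mean(values) for key, values in groups.items()}

print(column_mean(rows, "A"))
print(column_sum(rows, "B"))
print(grouped_mean(rows, "A", "B"))
```

Each helper maps to one of the questions listed in the teaser above.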
Gradient-Boosting anything (alert: high performance): Part3, Histogram-based boosting
A few weeks ago, I introduced a model-agnostic gradient boosting procedure that can use any base learner (available in R and Python package mlsauce): https://thierrymoudiki.github.io/blog/2024/10/06/python/r/genericboosting and https://thierrymoudiki.github.io/blog/2024/10/14/r/genericboosting-r. The rationale is different from other histogram-based gradie...
2616 sym Python (1683 sym/6 pcs) 1 img 8 tbl
R editor and SQL console (in addition to Python editors) in www.techtonique.net
It’s now possible to run R code in an editor and SQL queries in a console on www.techtonique.net. In the R editor, you can write and execute R code, including plotting, and in the SQL console, you can display SQL queries’ results and download these results as csv files. As a reminder from last week, you can run R or Python code interactively i...
4053 sym 4 img
R editor and SQL console (in addition to Python editors) in www.techtonique.net
It’s now possible to run R code in an editor and SQL queries in a console on www.techtonique.net. In the R editor, you can write and execute R code, including plotting, and in the SQL console, you can display SQL queries’ results and download these results as csv files. As a reminder from last week, you can run R or Python code interactively in ...
3798 sym 2 img
R and Python consoles + JupyterLite in www.techtonique.net
You can now run R or Python code interactively in your browser, on www.techtonique.net. As a reminder, a few weeks ago, I released the Techtonique web app, a tool designed to help you make informed, data-driven decisions using Mathematics, Statistics, Machine Learning, and Data Visualization. As of September 2024, the tool is in its beta phase (subjec...
3796 sym 2 img
Gradient-Boosting anything (alert: high performance): Part2, R version
Last week, I presented a feature of the Python package mlsauce that allows gradient boosting of any regression algorithm. This post is about the R version. I think (?) I finally wrapped my head around the process of creating an R package from a Python package systematically, using reticulate. By default when onloading, reticulate create...
1221 sym R (13188 sym/1 pcs) 2 img
Gradient-Boosting anything (alert: high performance)
We’ve always been told that decision trees are best for Gradient Boosting Machine Learning. I’ve always wanted to see for myself. AdaBoostClassifier is working well, but is relatively slow (by my own standards). A few days ago, I noticed that my Cython implementation of LSBoost in Python package mlsauce was already quite generic (never noticed ...
1732 sym Python (1899 sym/5 pcs) 4 img 3 tbl
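The idea of boosting an arbitrary base learner can be sketched in a few lines of plain Python. This is a generic least-squares boosting loop under stated assumptions, not mlsauce's LSBoost implementation: the `boost` function, its `n_rounds` and `learning_rate` parameters, and the `MeanStump` learner are all hypothetical names introduced for illustration. Any object with `fit(x, residuals)` and `predict(x)` could be swapped in for the stump.

```python
from statistics import mean

class MeanStump:
    """Hypothetical base learner: a one-split regression stump on a 1-D feature."""

    def fit(self, x, r):
        best = None
        # Try every threshold that leaves at least one point on each side.
        for t in sorted(set(x))[:-1]:
            left = [ri for xi, ri in zip(x, r) if xi <= t]
            right = [ri for xi, ri in zip(x, r) if xi > t]
            lm, rm = mean(left), mean(right)
            sse = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
            if best is None or sse < best[0]:
                best = (sse, t, lm, rm)
        if best is None:  # constant feature: fall back to the overall mean
            m = mean(r)
            self.t, self.left, self.right = x[0], m, m
        else:
            _, self.t, self.left, self.right = best
        return self

    def predict(self, x):
        return [self.left if xi <= self.t else self.right for xi in x]

def boost(make_learner, x, y, n_rounds=50, learning_rate=0.1):
    """Fit each new learner to the current residuals and add a shrunken update."""
    base = mean(y)
    pred = [base] * len(y)
    learners = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        learner = make_learner().fit(x, residuals)
        pred = [pi + learning_rate * ui for pi, ui in zip(pred, learner.predict(x))]
        learners.append(learner)

    def predict(x_new):
        out = [base] * len(x_new)
        for learner in learners:
            out = [o + learning_rate * u for o, u in zip(out, learner.predict(x_new))]
        return out

    return predict

def mse(y_true, y_pred):
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))

# Toy usage: boosting stumps on y = x^2 should beat the constant-mean baseline.
x = [float(i) for i in range(10)]
y = [xi ** 2 for xi in x]
predict = boost(MeanStump, x, y)
mse_base = mse(y, [mean(y)] * len(y))
mse_boost = mse(y, predict(x))
print(mse_base, mse_boost)
```

The loop never inspects the learner's internals, which is what makes the procedure model-agnostic in this sketch.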
Benchmarking 30 statistical/Machine Learning models on the VN1 Forecasting — Accuracy challenge
This post is about the VN1 Forecasting – Accuracy challenge. The aim is to accurately forecast future sales for various products across different clients and warehouses, using historical sales and pricing data. Phase 1 was a warmup to get an idea of what works and what wouldn’t (and… for overfitting the validation set, so that the leaderboard ...
1917 sym Python (5613 sym/19 pcs) 8 tbl
Automated random variable distribution inference using Kullback-Leibler divergence and simulating best-fitting distribution
Another post from R package misc! This time, we’ll see how to fit multiple continuous parametric distributions to a vector of data and simulate from the best-fitting distribution. Under the hood, misc::fit_param_dist uses a loop of MASS::fitdistr calls and Kullback-Leibler divergence for checking distribution adequacy. remotes::install_github("thierrymoud...
975 sym Python (2022 sym/5 pcs) 8 img
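As a rough Python analogue of the approach described in this teaser (loop over candidate distributions, then rank them by Kullback-Leibler divergence), here is a minimal standard-library sketch. It is not the code of misc::fit_param_dist: the two candidate families, the closed-form parameter estimates, the histogram-based divergence, and the names `fit_candidates`, `kl_to_histogram` and `best_fit` are all assumptions made for illustration. The exponential candidate assumes data with a positive mean.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def fit_candidates(data):
    """Fit each candidate with closed-form estimates; return {name: cdf}."""
    mu, sigma = mean(data), stdev(data)
    rate = 1.0 / mu  # exponential rate estimate; assumes mu > 0
    return {
        "normal": NormalDist(mu, sigma).cdf,
        "exponential": lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0,
    }

def kl_to_histogram(data, cdf, n_bins=20):
    """Discrete KL divergence between histogram frequencies and fitted bin masses."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in data:
        i = min(int((value - lo) / width), n_bins - 1)
        counts[i] += 1
    n = len(data)
    kl = 0.0
    for i, c in enumerate(counts):
        if c == 0:
            continue  # empty bins contribute nothing
        p = c / n
        # Mass the fitted distribution puts on this bin, floored to avoid log(0).
        q = max(cdf(lo + (i + 1) * width) - cdf(lo + i * width), 1e-12)
        kl += p * math.log(p / q)
    return kl

def best_fit(data):
    """Return the candidate with the smallest divergence, plus all scores."""
    scores = {name: kl_to_histogram(data, cdf) for name, cdf in fit_candidates(data).items()}
    return min(scores, key=scores.get), scores

# Toy usage on simulated Gaussian data.
rng = random.Random(42)
data = [rng.gauss(5.0, 1.0) for _ in range(2000)]
best, scores = best_fit(data)
print(best, scores)
```

Simulating from the selected distribution would then amount to drawing from the winning fit with its estimated parameters.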