Publications by matloff
Package Updates
Several updates. All packages are on CRAN, but please use GitHub for the latest. dsld (Data Science Looks at Discrimination): New package. Tools for (a) careful investigation of possible discrimination (race, sex, age etc.) and (b) avoiding or ameliorating bias in machine learning algorithms. Comes with a free Quarto textbook on the methodology. Us...
1386 sym
New qeML Plotting Function
I’ve added a new function to qeML 1.2, qeMittalGraph, based on an idea by my student Aditya Mittal. Below is an example that I think is rather compelling. The basic idea is quite simple (and not necessarily new, just something I had not seen before): Instead of comparing several curves directly, plot their growth from their initial baseline value....
1701 sym R (568 sym/3 pcs) 4 img
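To make the baseline idea in the excerpt above concrete, here is a minimal base-R sketch. It does not use qeMittalGraph or reproduce its interface. The data are made up, and measuring "growth" as the difference from each series' first value is an assumption for illustration (a ratio to the baseline would be another reasonable choice).

```r
# Hypothetical data: three series with very different starting levels
years <- 2000:2004
series <- data.frame(
  a = c(100, 110, 125, 130, 150),
  b = c(10, 12, 13, 15, 18),
  c = c(1000, 1020, 1050, 1100, 1120)
)

# Assumption: "growth" here means change from the first observed value
growth <- as.data.frame(lapply(series, function(x) x - x[1]))

# All curves now start at 0, so their trajectories can be compared directly
matplot(years, growth, type = "l", lty = 1, col = 1:3,
        xlab = "year", ylab = "change from baseline")
legend("topleft", legend = names(growth), lty = 1, col = 1:3)
```

Plotted on the raw scale, series `c` would dominate the vertical axis and flatten the other two. After subtracting the baselines, the relative pace of change in each series is visible on a shared axis.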
New R Package: Data Science Looks at Discrimination (dsld)
I’m very pleased to announce a new package, dsld, available on CRAN. This is the work of eight talented undergrad students. I provided the concept and some general guidance, but this is their work. The package is aimed at dealing with discrimination — race, gender, age — in the workplace, education, health care and so on. It consists of analy...
1441 sym
New Paper on Data Privacy
Readers who are interested in the Data Privacy field may find our new paper (Perry, Matloff, Tendick) of interest, https://tdp.cat/issues21/tdp.a478a22.pdf…. There we introduce a new method that we call RWN, Randomization within Neighborhoods. We present a bit of supporting theory and do some empirical evaluation. We also present a qualitative c...
900 sym
Knowing Something vs. Knowing the Name of Something: Some Points about Causal Analysis
The famed physicist Richard Feynman once said, “I learned very early the difference between knowing the name of something and knowing something,” a lesson from his father. I think too often we in the statistics/machine learning field are guilty of “only knowing the name of something.” Well, in most cases, we may know a bit more than the nam...
15653 sym R (1526 sym/4 pcs) 2 img
Torch for R Now in the qeML Package
I’ve added a new function, qeNeuralTorch, to the qeML package, as an alternative to the package’s qeNeural. It is experimental at this point, but usable, and I urge everyone to try it out. In this post, I will (a) state why I felt it desirable to add such a function, (b) show a couple of examples, (c) explain how the function works, thereby givi...
5794 sym R (710 sym/6 pcs)
Quantile Regression with Random Forests
In my December 22 blog, I first introduced the classic parametric quantile regression (QR) concept. I then showed how one could use the qeML package to perform quantile regression nonparametrically, using the package’s qeKNN function for a k-Nearest Neighbors approach. A reader then asked if this could be applied to random forests (RFs). The ...
3552 sym R (426 sym/1 pcs) 2 img
qeML Example: Nonparametric Quantile Regression
In this post, I will first introduce the concept of quantile regression (QR), a powerful technique that is rarely taught in stat courses. I’ll give an example from the quantreg package, and then will show how qeML can be used to do model-free QR estimation. Along the way, I will also illustrate the use of closures in R. Notation: We are predictin...
4293 sym R (791 sym/6 pcs)
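As a rough illustration of the two ideas named in the excerpt above (model-free quantile estimation and closures in R), here is a minimal base-R sketch. It does not use qeML or quantreg. The helper names `makeQuantileFtn` and `knnQuantile`, the single-predictor setup, and the simulated data are all hypothetical, chosen only to keep the example self-contained.

```r
# Closure: makeQuantileFtn(q) returns a function that "remembers" q
makeQuantileFtn <- function(q) {
  function(y) quantile(y, probs = q, names = FALSE)
}

# Hypothetical helper: estimate a conditional quantile of y at x0 by
# applying smoothFtn to the y values of the k nearest neighbors of x0
knnQuantile <- function(x, y, x0, k, smoothFtn) {
  nearest <- order(abs(x - x0))[seq_len(k)]
  smoothFtn(y[nearest])
}

# Simulated data whose spread grows with x
set.seed(1)
x <- runif(500)
y <- x + rnorm(500, sd = 0.1 + x)

q90 <- makeQuantileFtn(0.9)   # function computing the 90th percentile
est <- knnQuantile(x, y, x0 = 0.5, k = 50, smoothFtn = q90)
```

The closure lets a general-purpose neighborhood smoother accept any one-argument summary function. Swapping `q90` for `mean` would turn the same helper into an ordinary k-NN regression estimate.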
A Comparison of Several qeML Predictive Methods
Is machine learning overrated, with traditional methods being underrated these days? Yes, ML has had some celebrated successes, but these have come after huge amounts of effort, and it’s possible that similar effort with traditional methods may have produced similar results. A related issue concerns the type of data. Hard core MLers tend to divid...
2663 sym 2 img
data.table User Survey
The data.table 2023 user community survey is here, open until December 1st.
477 sym