Publications by matloff
Women in R
Last week I gave one of the keynote addresses at R/Finance 2018 in Chicago. I considered it an honor and a pleasure to be there, both because of the stimulating intellectual exchange and the fine level of camaraderie and hospitality that prevailed. I mentioned at the start of my talk that the success of this conference, now in its tenth year, epi...
4112 sym
Neural Networks Are Essentially Polynomial Regression
You may be interested in my new arXiv paper, joint work with Xi Cheng, an undergraduate at UC Davis (now heading to Cornell for grad school); Bohdan Khomtchouk, a postdoc in biology at Stanford; and Pete Mohanty, a Science, Engineering & Education Fellow in statistics at Stanford. The paper is of a provocative nature, and we welcome feedback. ...
2023 sym
Update on Polynomial Regression in Lieu of Neural Nets
There was quite a reaction to our paper, “Polynomial Regression as an Alternative to Neural Nets” (by Cheng, Khomtchouk, Matloff and Mohanty), leading to discussions/debates on Twitter, Reddit, Hacker News and so on. Accordingly, we have posted a revised version of the paper. Some of the new features: Though originally we had made the discla...
1704 sym 2 img
What, No Parentheses?
I’m about to show you an R trick. Various readers may find it cool, useful and interesting, or stupid, useless and an evil deed undermining the sanctity of R’s functional programming nature (“All bow”). But I hope many of you will find the material here rather intriguing if not useful. All this involves a trick one can employ while workin...
2610 sym R (166 sym/6 pcs)
Manifold Visualization: Polynomials to the Rescue
Our arXiv paper and the associated R package polyreg caused a bit of a stir, both pro and con, when we first announced them here in June. The discussion even spread as far as Twitter, Reddit and Hacker News. We’ll be announcing a revised paper, and various new features to the package, very soon. But the purpose of this blog post is to focus on ...
3787 sym 10 img
Manifold Visualization: Second Example
In last night’s post, I introduced prVis(), a new visualization tool which we have invented, available in our polyreg package. Recall that prVis() is intended as a simpler alternative to recent visualization tools like t-SNE and UMAP. Here I will post another example. The dataset is prgeng, included in the package. It consists of wage income, a...
2641 sym R (266 sym/2 pcs) 6 img
Example of Overfitting
I occasionally see queries on various social media about overfitting: what is it, and so on. I’ll post an example here. (I mentioned it at my talk the other night on our novel approach to missing values, but had a bug in the code. Here is the correct account.) The dataset is prgeng, on wages of programmers and engineers in Silicon Valley as of t...
2946 sym R (403 sym/3 pcs) 1 tbl
R > Python: a Concrete Example
I like both Python and R, and teach them both, but for data science R is the clear choice. When asked why, I always note (a) that it was written by statisticians for statisticians, (b) its built-in matrix type and matrix manipulations, (c) its great graphics, both base and CRAN, (d) its excellent parallelization facilities, etc. I also like to say that R is “more CS-i...
1670 sym 2 img
Nice student project
In all of my undergraduate classes, I require a term project, done in groups of 3-4 students. Though the topic is specified, it is largely open-ended, a level of “freedom” that many students are unaccustomed to. However, some adapt quite well. The topic this quarter was to choose a CRAN package that does not use any C/C++, and try to increase...
1012 sym
Free online R course
Recently a young relative mentioned that the campus R course she hoped to attend was full. What online alternatives did she have? So, I decided to start one of my own! https://github.com/matloff/fasteR Designed for complete beginners. I now have six lessons up on the site. I hope to add one new lesson per week.
718 sym