Publications by John Mount
Thank you “Why R?” for Being Awesome Hosts
Thank you very much, Why R?, for being awesome hosts. We are really pleased with how your virtual MeetUp went. For those who missed it, here is a link.
551 sym
Deal of the Day May 10: Half off Practical Data Science with R, Second Edition
Deal of the Day May 10: Half off Practical Data Science with R, Second Edition. Use code dotd051020au at https://bit.ly/2xLRPCk
530 sym 2 img
General Data Science Means Cross-Language Tools, Training, and Documentation
Data science is often a case of bringing the tools to the problems and data, instead of insisting on bringing the problems and data to the tools. To support cross-language data science we have been working on cross-language tools, documentation, and training. For example: vtreat data preparation package for supervised machine learning available b...
1635 sym
Data engineering and data shaping in Practical Data Science with R 2nd Edition
A kind reader recently shared the following comment on the Practical Data Science with R 2nd Edition live-site. Thanks for the chapter on data frames and data.tables. It has helped me overcome an obstacle freeing me from a lot of warnings telling me my data table was not a real . It reduced the calculation time for a scenario in modelStudio from...
1330 sym
Don’t Feel Guilty About Selecting Variables
We have an exciting new article to share: Don’t Feel Guilty About Selecting Variables. If you are at all interested in the probabilistic justification of important data science techniques, such as variable selection or pruning, this should be an informative and fun read. “Data Science” is often criticized with the common slur “if it has s...
1709 sym
How to Read Sourav Chatterjee’s Basic XICOR Definition
Introduction Professor Sourav Chatterjee recently published a new coefficient of correlation called XICOR (refs: JASA, R package, Arxiv, Hacker News, and a Python package (different author)). The basic formula (in the tie-free case) is: Take X and Y as n-vectors of observations of random variables. Compute the ranks r(i) of the Y observations. So...
4214 sym
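As a rough illustration of the XICOR entry above, the tie-free coefficient can be sketched in Python. The excerpt is cut off before the formula appears, so the expression used here, 1 - 3 * sum|r(i+1) - r(i)| / (n^2 - 1) with the Y ranks taken after sorting the pairs by X, comes from Chatterjee's published definition rather than from this listing. The function name xicor_tie_free is hypothetical, and the code assumes the data contain no ties.

```python
def xicor_tie_free(x, y):
    """Sketch of Chatterjee's XICOR coefficient, assuming no ties in x or y.

    xi = 1 - 3 * sum_i |r(i+1) - r(i)| / (n^2 - 1),
    where r(i) is the rank of the y value paired with the i-th smallest x.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")

    # Rank the y observations from 1 to n (valid only when y has no ties).
    order_by_y = sorted(range(n), key=lambda i: y[i])
    rank_of_y = [0] * n
    for rank, index in enumerate(order_by_y, start=1):
        rank_of_y[index] = rank

    # Re-order those ranks so they follow increasing x.
    order_by_x = sorted(range(n), key=lambda i: x[i])
    ranks = [rank_of_y[i] for i in order_by_x]

    # Sum the absolute jumps between consecutive ranks.
    total_jump = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total_jump / (n * n - 1)


if __name__ == "__main__":
    xs = list(range(1, 11))
    ys = [v * v for v in xs]  # strictly increasing in x
    print(xicor_tie_free(xs, ys))
```

For a strictly monotone relationship the consecutive ranks differ by exactly one, so the sum is n - 1 and the coefficient reduces to (n - 2) / (n + 1); with the ten points above that is 8/11, and it approaches 1 only as n grows.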
Working in CRAN’s World
Part of the deal of having a package up on CRAN is: at any time one may be sent an automated email like the following. Dear maintainer, Please see the problems shown on URL. Please correct before TODAY+14DAYS to safely retain your package on CRAN. The CRAN Team If this automated email from a bulk sender bounces, goes to SPAM, or isn’t respond...
5419 sym