Publications by John Mount
Keep Calm and Use vtreat (in R and in Python)
A big thank you to Dmytro Perepolkin for sharing a “Keep Calm and Use vtreat” poster! Also, we have translated the Python vtreat steps from our recent “Cross-Methods are a Leak/Variance Trade-Off” article into R vtreat steps here. This R port demonstrates the new-to-R fit/prepare notation! We want vtreat to be a platform agnostic (works ...
A Little Something From Practical Data Science with R Chapter 1
Here is a small quote from Practical Data Science with R, Chapter 1: “It is often too much to ask for the data scientist to become a domain expert. However, in all cases the data scientist must develop strong domain empathy to help define and solve the right problems.” Interested? Please check it out.
Free Coupon for our R Video Course: Introduction to Data Science
For all our remote learners, we are sharing a free coupon code for our R video course Introduction to Data Science. The code is ITDS2020, and it can be used at this URL: https://www.udemy.com/course/introduction-to-data-science/?couponCode=ITDS2020
Please check it out and share it!
Version Control is a Time Machine That Translates Common Hindsight Into Valuable Foresight
For data science projects, I recommend using source control (version control) and committing changes at a very fine level of granularity. This means checking in possibly broken code, possibly with weak commit messages (so when working in a shared project, you may want a private branch or a second source control repository). Please read on for ...
Re-Share: vtreat Data Preparation Documentation and Video
I would like to re-share the documentation for vtreat (R version, Python version), a data preparation system for machine learning tasks. vtreat is a system for preparing messy real-world data for predictive modeling tasks (classification, regression, and so on). In particular, it is very good at re-coding high-cardinality string-valued (or categorical) variables...
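To make "re-coding high-cardinality categorical variables" concrete, here is a minimal pure-Python sketch of one common approach, an impact code: each level is replaced by how far the mean outcome for that level sits from the overall mean. This is only an illustration of the general idea and is not vtreat's implementation or API; the function names `fit_impact_code` and `apply_impact_code` and the toy data are hypothetical.

```python
from collections import defaultdict


def fit_impact_code(levels, outcomes):
    """Learn a map: level -> (mean outcome for that level) - (grand mean)."""
    grand_mean = sum(outcomes) / len(outcomes)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, outcomes):
        sums[level] += y
        counts[level] += 1
    return {level: sums[level] / counts[level] - grand_mean for level in sums}


def apply_impact_code(code, levels):
    """Replace each level with its learned numeric code."""
    # Levels never seen during fitting get 0.0 (no shift from the grand mean).
    return [code.get(level, 0.0) for level in levels]


# Hypothetical toy data: a string-valued variable and a numeric outcome.
zip_codes = ["94110", "94110", "10001", "10001", "10001", "60614"]
y = [1.0, 3.0, 4.0, 6.0, 8.0, 2.0]

code = fit_impact_code(zip_codes, y)
encoded = apply_impact_code(code, ["94110", "10001", "99999"])
print(encoded)
```

Note that fitting the code and then applying it to the same rows reuses the outcome, which can leak information into a downstream model. A production tool has to deal with that (for example through cross-validated encoding); this sketch does not.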
wrapr 2.0.0 up on CRAN
wrapr 2.0.0 is now up on CRAN. This means the := variant of unpack[] is now easy to install. Please give it a try!
R Tip: How To Look Up Matrix Values Quickly
R is a powerful data science language because, like Matlab, numpy, and Pandas, it exposes vectorized operations. That is, a user can perform operations on hundreds (or even billions) of cells by merely specifying the operation on the column or vector of values. Of course, sometimes it takes a while to figure out how to do this. Please read for a...
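The excerpt stops before showing the article's own R solution, so the following is only a sketch of the same idea in numpy, one of the libraries the excerpt names as comparable. It assumes the task is to pull one value per (row, column) pair out of a matrix without writing a loop; the arrays `m`, `rows`, and `cols` are hypothetical. The analogous R idiom is indexing a matrix by a two-column matrix of row and column indices, as in `m[cbind(rows, cols)]`.

```python
import numpy as np

# A small matrix to look values up in.
m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

# One (row, column) pair per desired lookup, zero-based as usual in numpy.
rows = np.array([0, 1, 2, 2])
cols = np.array([2, 0, 1, 2])

# Integer array indexing pairs rows[k] with cols[k] and returns
# m[rows[k], cols[k]] for every k in a single vectorized operation.
values = m[rows, cols]
print(values)
```

The lookup cost is paid once in compiled code rather than once per element in an interpreted loop, which is the benefit of vectorized operations that the excerpt describes.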
Discount on Manning Books, Including our own Practical Data Science with R 2nd Edition
We have a discount on Manning Books, including our own Practical Data Science with R 2nd Edition! Manning.com is offering FREE shipping with code SHIP35 for US residents only. Use this link to purchase: http://www.manning.com/?a_aid=zm
Manning.com is also offering 50% off all eBooks and 35% off all print books. Take advantage of this gre...
Nina and John Speaking at Why R? Webinar Thursday, May 7, 2020
Nina Zumel and John Mount will be speaking on advanced data preparation for supervised machine learning at the Why R? Webinar on Thursday, May 7, 2020. This is at 8 PM in the GMT+2 time zone, which for us is 11 AM Pacific Time. Hope to see you there!
Thank you “Why R?” for Being Awesome Hosts
Thank you very much, Why R?, for being awesome hosts. We are really pleased with how your virtual MeetUp went. For those who missed it, here is a link.