Publications by John Mount
New improved cdata instructional video
We have a new, improved version of the “how to design a cdata/data_algebra data transform” video up! The original article, the Python example, and the R example have all been updated to use the new video. Please check it out!
What is New For vtreat 1.5.2?
vtreat version 1.5.2 just became available from CRAN. We have logged a few improvements in the NEWS. The changes are small and incremental, as the package is already in a great stable state for production use. One of the biggest improvements is documentation clean up, and adapting the examples to use wrapr unpack/to multiple assignment notatio...
Nifty Upcoming Enhancements to unpack/to
We have some really nifty upcoming enhancements to wrapr unpack/to. One of the new notations is the use of := as an alternate assignment operator for unpack/to. This lets us write code like the following. First let’s attach our package and set up some example data. library(wrapr) # attach package packageVersion("wrapr") # confirm we have at...
Cross-Methods are a Leak/Variance Trade-Off
We have a new Win Vector data science article to share: Cross-Methods are a Leak/Variance Trade-Off John Mount (Win Vector LLC), Nina Zumel (Win Vector LLC) March 10, 2020 We work some exciting examples of when cross-methods (cross validation, and also cross-frames) work, and when they do not work. Abstract Cross-methods such as cross-valida...
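As a rough illustration of the cross-frame idea named in this excerpt, here is a minimal Python sketch using only the standard library. It is an assumption-laden toy, not the article's code and not vtreat's implementation: the helper names `impact_code` and `cross_frame`, the per-level mean coding, the interleaved fold assignment, and the data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def impact_code(train_x, train_y, apply_x):
    """Map each level in apply_x to the mean of train_y for that level.

    Levels unseen in train_x fall back to the overall mean of train_y.
    (Hypothetical helper: a simplification of impact/target coding.)
    """
    by_level = defaultdict(list)
    for level, y in zip(train_x, train_y):
        by_level[level].append(y)
    grand_mean = mean(train_y)
    return [mean(by_level[v]) if v in by_level else grand_mean for v in apply_x]


def cross_frame(x, y, n_folds=3):
    """Code each row using only rows from the other folds (out-of-fold)."""
    n = len(x)
    coded = [None] * n
    for fold in range(n_folds):
        # Interleaved fold assignment, chosen only to keep the toy deterministic.
        held_out = [i for i in range(n) if i % n_folds == fold]
        rest = [i for i in range(n) if i % n_folds != fold]
        codes = impact_code(
            [x[i] for i in rest], [y[i] for i in rest], [x[i] for i in held_out]
        )
        for i, code in zip(held_out, codes):
            coded[i] = code
    return coded


# Hypothetical toy data: every level is unique, so x carries no real signal.
x = ["id0", "id1", "id2", "id3", "id4", "id5"]
y = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]

naive = impact_code(x, y, x)  # fit and applied on the same rows
crossed = cross_frame(x, y)   # each row coded without its own outcome

print(naive)
print(crossed)
```

In this toy, the naive coding simply reproduces `y` (a complete leak of the outcome into the coded variable), while the cross-frame coding never sees a row's own outcome. The price is that the out-of-fold codes vary from fold to fold, which is one way to picture the trade-off in the article's title.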
Keep Calm and Use vtreat (in R and in Python)
A big thank you to Dmytro Perepolkin for sharing a “Keep Calm and Use vtreat” poster! Also, we have translated the Python vtreat steps from our recent “Cross-Methods are a Leak/Variance Trade-Off” article into R vtreat steps here. This R port demonstrates the new-to-R fit/prepare notation! We want vtreat to be a platform agnostic (works ...
A Little Something From Practical Data Science with R Chapter 1
Here is a small quote from Practical Data Science with R Chapter 1: “It is often too much to ask for the data scientist to become a domain expert. However, in all cases the data scientist must develop strong domain empathy to help define and solve the right problems.” Interested? Please check it out.
Free Coupon for our R Video Course: Introduction to Data Science
For all our remote learners, we are sharing a free coupon code for our R video course Introduction to Data Science. The code is ITDS2020, and can be used at this URL https://www.udemy.com/course/introduction-to-data-science/?couponCode=ITDS2020 . Please check it out and share it!
Version Control is a Time Machine That Translates Common Hindsight Into Valuable Foresight
For data science projects I recommend using source control or version control, and committing changes at a very fine level of granularity. This means checking in possibly broken code, with possibly weak commit messages (so when working in a shared project, you may want a private branch or second source control repository). Please read on for ...
Re-Share: vtreat Data Preparation Documentation and Video
I would like to re-share the vtreat (R version, Python version) data preparation documentation for machine learning tasks. vtreat is a system for preparing messy real world data for predictive modeling tasks (classification, regression, and so on). In particular it is very good at re-coding high-cardinality string-valued (or categorical) variables...
wrapr 2.0.0 up on CRAN
wrapr 2.0.0 is now up on CRAN. This means the := variant of unpack[] is now easy to install. Please give it a try!