Publications by John Mount

You Can Override Just About Anything in R

02.10.2019

To understand computations in R, two slogans are helpful: “Everything that exists is an object. Everything that happens is a function call.” (John Chambers) In R, the “[” array access operator is a function call, and it is one a user can re-bind to a new effect of their own choosing. Let’s see what sort of mischief we can get into using t...

3571 sym
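
As a rough illustration of the excerpt's point that “[” is a function call a user can re-bind, here is a minimal base R sketch. It is not taken from the original post; the helper name `reversed_index` and the "count from the end" behavior are hypothetical choices made only for this example, and the re-binding is confined to a local scope so the rest of the session is unaffected.

```r
# Indexing is an ordinary function call: these two expressions are equivalent.
x <- c(10, 20, 30)
a <- x[2]
b <- `[`(x, 2)
stopifnot(identical(a, b))

# Re-bind `[` inside a local scope only (hypothetical example behavior:
# index from the end of the vector instead of from the start).
reversed_index <- local({
  `[` <- function(v, i) {
    # Delegate to the real operator with a mirrored position.
    base::`[`(v, length(v) - i + 1)
  }
  # This function's v[i] resolves to the local `[` defined above.
  function(v, i) v[i]
})

first_from_end <- reversed_index(x, 1)
```

Because R looks up `[` like any other function name, the `v[i]` inside `reversed_index` finds the local definition first, while `x[2]` at top level still uses the base operator.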

vtreat Cross Validation

05.10.2019

Nina Zumel finished new documentation on how vtreat’s cross validation works, which I want to share here. vtreat is a system that makes data preparation for machine learning a “one-liner” (available in R or available in Python). We have a set of starting off points here. These documents describe what vtreat does for you; you just find the...

1189 sym

Free R/datascience Extract: Evaluating a Classification Model with a Spam Filter

15.10.2019

We are excited to share a free extract of Zumel, Mount, Practical Data Science with R, 2nd Edition, Manning 2019: Evaluating a Classification Model with a Spam Filter. This section reflects an important design decision in the book: teach model evaluation first, and as a step separate from model construction. It is funny, but it takes some effort...

1909 sym 2 img

Practical Data Science with R 2nd Edition update

17.10.2019

We are in the last stages of proofing the galleys/typesetting of Zumel, Mount, Practical Data Science with R, 2nd Edition, Manning 2019. So this edition will definitely be out soon! If you ever wanted to see what Nina Zumel and John Mount are like when we have the help of editors, this book is your chance! One thing I noticed in working through ...

855 sym

New Introduction to rquery

27.10.2019

Introduction: rquery is a data wrangling system designed to express complex data manipulation as a series of simple data transforms. This is in the spirit of R’s base::transform(), or dplyr’s dplyr::mutate() and uses a pipe in the style popularized in R with magrittr. The operators themselves follow the selections in Codd’s relational algebr...

8572 sym R (2625 sym/18 pcs) 16 tbl
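
The idea of expressing a manipulation as a series of simple transforms can be sketched with base R alone. The example below deliberately uses `base::transform()` and `subset()` rather than rquery's own operators, so it shows only the style the excerpt alludes to, not the rquery API; the data frame and column names are made up for illustration.

```r
# Hypothetical example data.
d <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Step 1: add a derived column (the "extend" idea).
step1 <- transform(d, z = x + y)

# Step 2: keep only the rows that satisfy a condition (the "select rows" idea).
step2 <- subset(step1, z > 11)

# Step 3: keep only the columns of interest (the "project" idea).
result <- step2[, c("x", "z"), drop = FALSE]
```

Each step takes a data frame and returns a data frame, which is what lets such steps be chained in a pipeline.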

Practical Data Science with R, 2nd Edition, IS OUT!!!!!!!

15.11.2019

Practical Data Science with R, 2nd Edition author Dr. Nina Zumel, with a fresh author’s copy of her book!

508 sym 2 img

Practical Data Science with R, 2nd Edition: Introduction Video

28.11.2019

Nina and I have prepared a quick introduction video for Practical Data Science with R, 2nd Edition. We are really proud of both editions of the book. This book can help an R user directly experience the data science style of working with data and machine learning techniques. The book is available now at: Directly from the publisher Manning, no...

1017 sym 6 img

Practical Data Science with R 2nd Edition now in-stock at Amazon.com!

05.12.2019

Practical Data Science with R 2nd Edition is now in-stock at Amazon.com! Buy it for your favorite data scientist in time for the holidays!

542 sym 2 img

What is new for rquery December 2019

07.12.2019

Our goal has been to make rquery the best query generation system for R (and to make data_algebra the best query generator for Python). Let’s see what rquery is good at, and what new features are making rquery better. The idea is: the query is a first-class citizen that we can use to design and optimize queries prior to translating them into a da...

4958 sym R (3144 sym/26 pcs) 1 tbl

New rquery Vignette: Working with Many Columns

15.12.2019

We have a new rquery vignette here: Working with Many Columns. This is an attempt to get back to writing about how to use the package to work with data (versus the other day’s discussion of package design/implementation). Please check it out.

644 sym