Publications by John Mount
Take Care If Trying the RPostgres Package
Take care if trying the new RPostgres database connection package. By default it returns some non-standard types that code developed against other database drivers may not expect, and may not be ready to defend against. Danger, Will Robinson! Trying the new package: one can try the newer RPostgres as a drop-in replacement for the usual RPostgreS...
1652 sym R (1206 sym/29 pcs) 2 img
R Tip: Use stringsAsFactors = FALSE
R tip: use stringsAsFactors = FALSE. R often uses a concept of factors to re-encode strings. This can be too early and too aggressive. Sometimes a string is just a string. Sigmund Freud, it is often claimed, said: “Sometimes a cigar is just a cigar.” To avoid problems delay re-encoding of strings by using stringsAsFactors = FALSE when cre...
1291 sym R (463 sym/2 pcs) 2 img
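A minimal sketch of the tip above, using only base R. The data values are illustrative and not taken from the post. Note that since R 4.0.0 the default for `stringsAsFactors` in `data.frame()` is already `FALSE`, so the explicit argument matters most on older versions or when code has to behave the same on both.

```r
# Build the same small data frame both ways (values are illustrative only).
d_factor <- data.frame(x = c("a", "b", "c"), stringsAsFactors = TRUE)
d_string <- data.frame(x = c("a", "b", "c"), stringsAsFactors = FALSE)

# With stringsAsFactors = TRUE the column is re-encoded as a factor.
class(d_factor$x)

# With stringsAsFactors = FALSE the column stays a character vector.
class(d_string$x)

# Delay the re-encoding: convert to a factor only when it is needed.
d_string$x_factor <- factor(d_string$x)
```

Keeping the original character column alongside the factor makes it easy to check later that the re-encoding did what was intended.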
R Tip: Break up Function Nesting for Legibility
There are a number of easy ways to avoid illegible code nesting problems in R. In this R tip we will expand upon the above statement with a simple example. At some point it becomes illegible and undesirable to compose operations by nesting them, such as in the following code. head(mtcars[with(mtcars, cyl == 8), c("mpg", "cyl", "wt")]) # ...
2258 sym R (1580 sym/5 pcs) 2 img
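One possible way to un-nest the expression quoted in the excerpt is to name each intermediate value. This is a sketch in base R only; the variable names are hypothetical, and the post itself may recommend a different technique (such as pipes).

```r
# Nested form, as quoted in the excerpt above.
nested <- head(mtcars[with(mtcars, cyl == 8), c("mpg", "cyl", "wt")])

# The same computation as a sequence of named steps.
is_eight_cyl <- mtcars$cyl == 8          # which rows to keep
columns <- c("mpg", "cyl", "wt")         # which columns to keep
eight_cyl <- mtcars[is_eight_cyl, columns]
stepwise <- head(eight_cyl)              # first six matching rows

# Both forms produce the same result.
identical(nested, stepwise)
```

Each intermediate name documents one decision, and each step can be inspected on its own while debugging.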
R Tip: Use let() to Re-Map Names
Another R tip. Need to replace a name in some R code or make R code re-usable? Use wrapr::let(). Here is an example involving dplyr. Let’s look at some example data: library("dplyr") library("wrapr") starwars %>% select(., name, homeworld, species) %>% head(.) # # A tibble: 6 x 3 # name homeworld species # <chr> <chr...
3554 sym R (1699 sym/7 pcs) 2 img
R Tip: Use Named Vectors to Re-Map Values
Here is an R tip. Want to re-map a column of values? Use a named vector as the mapping. Example: library("dplyr") library("wrapr") head(starwars[, qc(name, gender)]) # # A tibble: 6 x 2 # name gender # <chr> <chr> # 1 Luke Skywalker male # 2 C-3PO NA # 3 R2-D2 NA # 4 Darth Vader male # 5...
1170 sym R (1060 sym/4 pcs)
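The named-vector idea can be sketched in base R without `dplyr` or `wrapr`. The `gender` vector below mirrors the first rows shown in the excerpt; the fifth value and the target codes in `map` are assumptions made up for illustration.

```r
# A column of values to re-map (illustrative; NA marks missing entries).
gender <- c("male", NA, NA, "male", "female")

# The mapping: names are the old values, elements are the new values.
# These target codes are hypothetical.
map <- c(male = "M", female = "F")

# Indexing the named vector by the column performs the re-mapping.
# Values with no entry in the map (including NA) come back as NA.
remapped <- unname(map[gender])
remapped
```

Because unmatched values become `NA` rather than raising an error, it is worth checking for unexpected `NA`s after the re-map.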
Hangul/Korean edition of Practical Data Science with R!
Excited to see our new Hangul/Korean edition of “Practical Data Science with R” by Nina Zumel, John Mount, translated by Daekyoung Lim. Thank you for producing a handsome edition, Manning and JPub.kr!
605 sym 4 img
R Tip: Think in Terms of Values
R tip: first organize your tasks in terms of data, values, and desired transformation of values, not initially in terms of concrete functions or code. I know I write a lot about coding in R. But it is in the service of supporting statistics, analysis, predictive analytics, and data science. R without data is like going to the theater to watch th...
6913 sym R (1526 sym/10 pcs)
Four Years of Practical Data Science with R
Four years ago today authors Nina Zumel and John Mount received our authors’ copies of Practical Data Science with R! It has its imitators, but it remains the best “I have R, now what do I do with it?” book (as it works the user through non-trivial projects, analyses, presentations, predictive analytics, data science, and machine learning a...
1131 sym 2 img
magrittr and wrapr Pipes in R, an Examination
Let’s consider piping in R both using the magrittr package and using the wrapr package. magrittr pipelines: The magrittr pipe glyph “%>%” is the most popular piping symbol in R. magrittr documentation describes %>% as follows. Basic piping: x %>% f is equivalent to f(x) x %>% f(y) is equivalent to f(x, y) x %>% f %>% g %>% h is equivalent...
12339 sym R (3540 sym/11 pcs) 3 tbl
R Tip: Use match_order() to Align Data
R tip. Use wrapr::match_order() to align data. Suppose we have data in two data frames, and both of these data frames have common row-identifying columns called “idx”. library("wrapr") d1 <- build_frame( "idx", "x" | 3 , "a" | 1 , "b" | 2 , "c" ) d2 <- build_frame( "idx", "y" | 2 , "D" | 1 , "E" | 3 ...
1560 sym R (400 sym/2 pcs)
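The alignment task can be sketched with base R's `match()`. This is a stand-in to show the idea, not the implementation of `wrapr::match_order()`, and the data frames below are illustrative: the `y` values are made up rather than taken from the truncated excerpt.

```r
# Two data frames sharing a row-identifying column "idx" (illustrative data).
d1 <- data.frame(idx = c(3, 1, 2), x = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
d2 <- data.frame(idx = c(2, 1, 3), y = c("p", "q", "r"),
                 stringsAsFactors = FALSE)

# For each idx in d1, find the row position of the same idx in d2.
p <- match(d1$idx, d2$idx)

# Reorder d2 so its rows line up with d1, then reset the row names.
d2_aligned <- d2[p, , drop = FALSE]
rownames(d2_aligned) <- NULL

# The id columns now agree row by row, so the frames can be bound side by side.
combined <- cbind(d1, y = d2_aligned$y)
combined
```

This sketch assumes every `idx` in `d1` appears exactly once in `d2`; with missing or duplicated keys, `match()` returns `NA` or only the first match, so a join is the safer tool in that case.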