Publications by The Clerk
R’s Tricky == Operator, or "It depends on what the meaning of the word ‘is’ is"
One scenario where R can trip up a programmer is when using the == operator or its relatives. The help page notes that “NA values are regarded as non-comparable”, which introduces some potentially unexpected behavior. As a toy example, look what happens when trying to subset on a column that includes NA values. df df df[df$b==4,] df[df$b<=4,] In e...
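The excerpt's toy data did not survive, so here is a minimal sketch of the behavior it describes. The data frame below is hypothetical (its values are assumptions, not the author's); only the subsetting pattern `df[df$b==4,]` comes from the excerpt.

```r
# Hypothetical toy data frame; the values are illustrative, not from the original post
df <- data.frame(a = 1:5, b = c(1, 4, NA, 7, 4))

# df$b == 4 is NA wherever b is NA, and an NA in a logical row index
# produces a row filled with NA instead of dropping the row
naive <- df[df$b == 4, ]

# which() keeps only the positions that are TRUE, discarding NA and FALSE
safe <- df[which(df$b == 4), ]

# %in% never returns NA, so it can be used directly as a logical index
also_safe <- df[df$b %in% 4, ]

print(naive)
print(safe)
```

Wrapping the comparison in `which()` or switching to `%in%` are two common ways to avoid the phantom NA row; which one reads better depends on the surrounding code.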
Tableau 9.0 Connects Directly to R Data Files
Tableau 9.0 will be released soon. Tableau 8 already integrates with some R functionality, but 9.0 actually allows direct connection to R data files. Tableau continues to remove friction between itself and R, further justifying its superior Gartner position.
Finding Similar European Soccer Clubs (with R & Shiny)
Are you a die-hard supporter of one European soccer (football) team (club)? Having a rough season, or just want to watch more matches with passion? This European Team Finder analyzed 126 attributes of the top-flight teams in the marquee national leagues of Europe. Everything was considered, such as tackles, fouls, pass type, crosses, throw-ins,...
Top 2 Packages for Newly Hired Data Scientists
library(NewCo knowledge) function (X, FUN, …, ) { FUN Read the business wires + Go to lunch with wide range of people + Read the 10-K and maybe 10-Q + ...
cuRve stitching
Remember curve stitching from grade school? It makes for a nice tutorial for working with some common R functionality. Here’s an example of how to create the appearance of a parabola from plotting a series of straight lines: pkg inst if(length(pkg[!inst]) > 0) install.packages(pkg[!inst],repos="http://cran.rstudio.com/") lapply(pkg...
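The author's plotting code is cut off above, so the following is only a sketch of the general curve-stitching idea in base R graphics, not the original implementation. The number of segments `n` and the variable names are assumptions.

```r
# Number of divisions along each axis (an arbitrary choice for illustration)
n <- 20
i <- 0:n

# Segment i starts on the x-axis at (i, 0) and ends on the y-axis at (0, n - i)
x0 <- i
y0 <- rep(0, n + 1)
x1 <- rep(0, n + 1)
y1 <- n - i

# Empty square canvas, then draw every straight segment in one vectorized call
plot(0, 0, type = "n", xlim = c(0, n), ylim = c(0, n),
     xlab = "", ylab = "", asp = 1)
segments(x0, y0, x1, y1)
```

Every segment is straight; the curve the eye picks out is the envelope of the family of lines, and raising `n` makes it look smoother.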
What Does the AVERAGE Brand Logo Look Like?
PNG images are essentially a grid of values that represent colors to display. Since each cell in the grid is made up of numbers, I got curious about what it might mean to aggregate multiple PNGs. What would it look like to average two or more images? Median? Mode? Random? To do so, I pulled the top 100 brands’ logos from Best Global Brands. Then...
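To make the aggregation idea concrete, here is a small sketch that treats each image as a height x width x channel array of values in [0, 1] and combines several of them cell by cell. The three random arrays stand in for decoded logos; they are placeholders, and the post's actual image-loading step is not shown in the excerpt.

```r
set.seed(1)

# Three hypothetical 4 x 4 RGB "images" (stand-ins for decoded PNG logos)
imgs <- replicate(3, array(runif(4 * 4 * 3), dim = c(4, 4, 3)), simplify = FALSE)

# Cell-wise mean: add the arrays element by element, then divide by the count
avg <- Reduce(`+`, imgs) / length(imgs)

# Cell-wise median: stack into a 4-D array, then summarize over the image axis
stack <- simplify2array(imgs)          # dimensions: 4 x 4 x 3 x 3
med <- apply(stack, 1:3, median)

dim(avg)
dim(med)
```

This assumes all images share the same dimensions; real logos would need resizing or padding to a common grid before they can be combined this way.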
What Does the AVERAGE Brand Logo Look Like?
PNG images are essentially a grid of values that represent colors to display. Since each cell in the grid is made up of numbers, I got curious about what it might mean to aggregate multiple PNGs. What would it look like to average two or more images? Median? To do so, I pulled the top 100 brands’ logos from Best Global Brands. Then I used the (l...
Visualizing High Dimensions as DNA Strands
For a community project, I needed to research which U.S. cities were most similar to mine. The U.S. census has some wonderful data that covers 1,579 statistical areas, using the Office of Management & Budget’s definition. With this data, I selected the relevant attributes and then calculated the root mean squared error of the scaled ...
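As a rough illustration of the similarity measure mentioned above, the sketch below scales a few attributes and computes the root mean squared difference between one target area and every other area. The four areas, their attribute names, and all the numbers are invented for the example; none come from the census data or from the author's code.

```r
# Invented data: rows are areas, columns are attributes (not real census figures)
cities <- data.frame(
  population    = c(100, 110, 900, 500),
  median_income = c(50, 52, 80, 40),
  median_age    = c(35, 36, 30, 45),
  row.names     = c("A", "B", "C", "D")
)

# Center and scale each column so no single attribute dominates the distance
scaled <- scale(cities)

# Subtract the target area's scaled values from every row, column by column
target <- "A"
diffs <- sweep(scaled, 2, scaled[target, ])

# Root mean squared difference across attributes, one value per area
rmse <- sqrt(rowMeans(diffs^2))

# Most similar areas first, excluding the target itself
sort(rmse[names(rmse) != target])
```

Smaller values mean more similar areas; scaling first matters because raw population would otherwise swamp the other attributes.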
The Simpsons as a Chart
Inspired by this clever image, I thought I would whip it up in R. Results: Below is the R code: # Prepare ----------------------------------------------------------------- rm(list=ls());gc() pkg <- c("ggplot2") inst <- pkg %in% installed.packages() if(length(pkg[!inst]) > 0) install.packages(pkg[!inst]) lapply(pkg,librar...