Publications by The Clerk

R’s Tricky == Operator, or "It depends on what the meaning of the word ‘is’ is"

11.02.2015

One scenario where R can trip up a programmer is when using the == operator or its relatives. The help page notes that “NA values are regarded as non-comparable”, which introduces some potentially unexpected behavior. As a toy example, look what happens when trying to subset on a column that includes NA values: df[df$b==4,] and df[df$b<=4,]. In e...

988 sym 2 img

Tableau 9.0 Connects Directly to R Data Files

11.03.2015

Tableau 9.0 will be released soon. Tableau 8 already integrates with some R functionality, but 9.0 actually allows direct connection to R data files. Tableau continues to remove friction between itself and R, further justifying its superior Gartner position.

649 sym 2 img

Finding Similar European Soccer Clubs (with R & Shiny)

17.03.2015

Are you a die-hard supporter of one European soccer (football) team (club)? Having a rough season, or just want to watch more matches with passion? This European Team Finder analyzed 126 attributes of the top-flight teams in the marquee national leagues of Europe. Everything was considered, such as tackles, fouls, pass type, crosses, throw-ins,...

1340 sym 2 img

Top 2 Packages for Newly Hired Data Scientists

09.07.2015

library(NewCo knowledge) function (X, FUN, …, ) {FUN Read the business wires + Go to lunch with wide range of people + Read the 10-K and maybe 10-Q + ...

6765 sym 4 img

cuRve stitching

15.09.2015

Remember curve stitching from grade school? It makes for a nice tutorial for working with some common R functionality. Here’s an example of how to create the appearance of a parabola from plotting a series of straight lines: pkg inst if(length(pkg[!inst]) > 0) install.packages(pkg[!inst], repos="http://cran.rstudio.com/") lapply(pkg...

1141 sym 2 img

What Does the AVERAGE Brand Logo Look Like?

23.10.2015

PNG images are essentially a grid of values that represent colors to display. Since each cell in the grid is made up of numbers, I got curious about what it might mean to aggregate multiple PNGs. What would it look like to average two or more images? Median? Mode? Random? To do so, I pulled the top 100 brands’ logos from Best Global Brands. Then...

1562 sym R (3978 sym/1 pcs) 8 img

What Does the AVERAGE Brand Logo Look Like?

23.10.2015

PNG images are essentially a grid of values that represent colors to display. Since each cell in the grid is made up of numbers, I got curious about what it might mean to aggregate multiple PNGs. What would it look like to average two or more images? Median? To do so, I pulled the top 100 brands’ logos from Best Global Brands. Then I used the (l...

1456 sym R (19463 sym/1 pcs) 4 img

Visualizing High Dimensions as DNA Strands

10.01.2016

For a community project, I needed to research which U.S. cities were most similar to mine. The U.S. Census has some wonderful data that covers 1,579 statistical areas, using the Office of Management & Budget’s definition. With this data, I selected the relevant attributes and then calculated the root mean squared error of the scaled ...

1836 sym 12 img 6 tbl

The Simpsons as a Chart

26.03.2016

Inspired by this clever image, I thought I would whip it up in R. Results: Below is the R code: # Prepare ----------------------------------------------------------------- rm(list=ls()); gc() pkg <- c("ggplot2") inst <- pkg %in% installed.packages() if(length(pkg[!inst]) > 0) install.packages(pkg[!inst]) lapply(pkg,librar...

497 sym R (1934 sym/1 pcs) 2 img