Publications by Mollie

Truncate by Delimiter in R

19.09.2013

Sometimes, you only need to analyze part of the data stored as a vector. In this example, there is a list of patents. Each patent has been assigned to one or more patent classes. Let’s say that we want to analyze the dataset based on only the first patent class listed for each patent.
patents <- data.frame( patent = 1:30, class =...

1407 sym R (666 sym/2 pcs)
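
As a rough illustration of the idea in the excerpt above, the sketch below keeps only the text before the first delimiter in each element. The data values and the semicolon delimiter are assumptions, since the original data frame is elided in the excerpt, and `first_class` is a hypothetical column name.

```r
# Hypothetical data: the class values in the original post are not shown here.
patents <- data.frame(
  patent = 1:3,
  class = c("257/288; 438/197", "707/3", "435/6; 536/23; 530/350"),
  stringsAsFactors = FALSE
)

# Drop everything from the first semicolon onward (delimiter is an assumption).
patents$first_class <- sub(";.*$", "", patents$class)

print(patents$first_class)
```

Elements without a delimiter are left unchanged, because `sub` only rewrites strings where the pattern matches.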

Perform a Function on Each File in R

26.09.2013

Sometimes you might have several data files and want to use R to perform the same function across all of them. Or maybe you have multiple files and want to systematically combine them into one file without having to open each file and manually copy the data out. Fortunately, it’s not complicated to use R to systematically iterate acr...

619 sym R (45 sym/1 pcs)
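
One possible way to iterate over files, sketched under stated assumptions: the files are CSVs with identical columns, and they live in a single directory. The directory, file names, and contents below are made up so that the example is self-contained. It is not necessarily the approach the post itself takes.

```r
# Create a scratch directory with two small CSV files (made-up contents).
dir_path <- tempfile("file-loop-demo")
dir.create(dir_path)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(dir_path, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(dir_path, "b.csv"), row.names = FALSE)

# List every CSV in the directory, read each one, and stack the results.
files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read.csv))

print(combined)
```

Replacing `read.csv` inside `lapply` with any function that takes a file path applies that function to each file in turn.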

Custom Legend in R

10.10.2013

This particular custom legend was designed with three purposes:
- To effectively bin values based on a theoretical minimum and maximum value for that variable (e.g. -1 and 1 or 0 and 100)
- To use a different interval notation than the default
- To handle NA values
Even though this particular legend was designed with those needs, it should be s...

3060 sym R (2313 sym/10 pcs) 4 img 2 tbl
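
The three purposes listed in the excerpt above can be sketched with base R's `cut` and `addNA`. This is a minimal illustration, not the post's legend code: the values, the bin width of 0.5, the "a to b" label style, and the "No data" label are all assumptions.

```r
# Made-up values on a scale with a theoretical range of -1 to 1.
values <- c(-0.9, -0.2, 0.4, 1, NA)

# Bin on the theoretical range rather than the observed range.
breaks <- seq(-1, 1, by = 0.5)

# Custom interval notation ("a to b") instead of the default "(a,b]".
bin_labels <- paste(head(breaks, -1), "to", tail(breaks, -1))
bins <- cut(values, breaks = breaks, labels = bin_labels, include.lowest = TRUE)

# Give NA its own level so it can appear as a legend entry.
bins <- addNA(bins)
levels(bins)[is.na(levels(bins))] <- "No data"

print(table(bins))
```

Because the breaks come from the theoretical minimum and maximum, every bin appears in the legend even when no observed value falls into it.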

Line Breaks Between Words in Axis Labels in ggplot in R

17.10.2013

Sometimes when plotting factor variables in R, the graphics can look pretty messy thanks to long factor levels. If the level attributes have multiple words, there is an easy fix to this that often makes the axis labels look much cleaner.
Without Line Breaks
Here’s the messy-looking example:
(Figure: No line breaks in axis labels)
And here’s the...

1413 sym R (535 sym/3 pcs) 8 img 4 tbl
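
A small sketch of one way to get line breaks between words: replace each space in the factor levels with a newline before plotting. The data frame and level names are hypothetical, and the excerpt does not show whether the post uses this exact call.

```r
# Hypothetical factor with multi-word levels.
df <- data.frame(
  group = factor(c("Very Long Label", "Another Long Label", "Short"))
)

# Swap every space for a newline so each word lands on its own line.
levels(df$group) <- gsub(" ", "\n", levels(df$group))

print(levels(df$group))
```

Plotting functions that draw factor levels as axis labels, including ggplot, render the embedded newline as a line break.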

Check if a Variable Exists in R

05.12.2013

If you use attach, it is easy to tell if a variable exists. You can simply use exists to check:
> attach(df)
> exists("varName")
[1] TRUE
However, if you don’t use attach (and I find you generally don’t want to), this simple solution doesn’t work.
> detach(df)
> exists("df$varName")
[1] FALSE
Instead of using exists, you can use ...

929 sym R (242 sym/4 pcs)
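
The excerpt is cut off before it names the alternative, so the sketch below is only one common approach and may not be the one the post recommends: test whether the name appears among the data frame's column names. The data frame and column names are hypothetical.

```r
# Hypothetical data frame with a single column.
df <- data.frame(varName = 1:3)

# Check for the column by name, without attaching the data frame.
has_var <- "varName" %in% names(df)
has_other <- "otherName" %in% names(df)

print(c(has_var, has_other))
```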

Compare Regression Results to a Specific Factor Level in R

06.02.2014

Including a series of dummy variables in a regression in R is very simple. For example,
ols <- lm(weight ~ Time + Diet, data = ChickWeight)
summary(ols)
The above regression automatically includes a dummy variable for all but the first level of the factor of the Diet variable.
Call: lm(formula = weight ~ Time + Diet, data = ChickWeig...

1578 sym R (1763 sym/5 pcs)
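
To compare against a level other than the first, one option is `relevel`, which moves a chosen level to the reference position. This is a sketch under assumptions: the choice of Diet 3 as the reference is arbitrary, `cw` is a hypothetical copy of the data, and the excerpt does not confirm that the post uses `relevel`.

```r
# Work on a plain data frame copy so the built-in dataset is left untouched.
cw <- as.data.frame(ChickWeight)

# Make Diet 3 the reference level (arbitrary choice for illustration).
cw$Diet <- relevel(cw$Diet, ref = "3")

ols <- lm(weight ~ Time + Diet, data = cw)

# Dummy variables now exist for every Diet level except 3.
print(names(coef(ols)))
```

The remaining Diet coefficients are then read as differences from Diet 3 instead of Diet 1.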

ggplot Fit Line and Lattice Fit Line in R

13.02.2014

Let’s add a fit line to a scatterplot!
Fit Line in Base Graphics
Here’s how to do it in base graphics:
ols <- lm(Temp ~ Solar.R, data = airquality)
summary(ols)
plot(Temp ~ Solar.R, data = airquality)
abline(ols)
(Figure: Fit line in base graphics in R)
Fit Line in ggplot
And here’s how to do it in ggplot:
library(ggplot2)
ggplot(data = air...

915 sym R (376 sym/3 pcs) 6 img 3 tbl

Merge by City and State in R

20.02.2014

Often, you’ll need to merge two data frames based on multiple variables. For this example, we’ll use the common case of needing to merge by city and state.
First, you need to read in both your data sets:
# import city coordinate data:
coords <- read.csv("cities-coords.csv", header = TRUE, sep = ",")
# import population data:
da...

1835 sym R (1915 sym/5 pcs)
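
A minimal sketch of merging on two key columns with base R's `merge`. The data frames are built inline with made-up placeholder values rather than read from the post's CSV files, and the column names `city`, `state`, `lat`, and `population` are assumptions.

```r
# Placeholder data: values are invented purely to show the mechanics.
coords <- data.frame(
  city = c("Springfield", "Springfield", "Portland"),
  state = c("IL", "MA", "OR"),
  lat = c(1, 2, 3)
)
pop <- data.frame(
  city = c("Springfield", "Portland", "Springfield"),
  state = c("MA", "OR", "IL"),
  population = c(100, 200, 300)
)

# Merging on both columns keeps the two Springfields distinct.
merged <- merge(coords, pop, by = c("city", "state"))

print(merged)
```

Merging on `city` alone would pair each Springfield row with both population rows, which is why both keys are passed to `by`.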

Deaths

10.12.2021

library('tidyverse') ## -- Attaching packages --------------------------------------- tidyverse 1.3.1 -- ## v ggplot2 3.3.5 v purrr 0.3.4 ## v tibble 3.1.6 v dplyr 1.0.7 ## v tidyr 1.1.4 v stringr 1.4.0 ## v readr 2.1.0 v forcats 0.5.1 ## -- Conflicts ------------------------------------------ tidyverse_conflicts() -- ...

38 sym R (15199 sym/33 pcs)