Publications by Mollie
Storing a Function in a Separate File in R
If you’re going to be using a function across several different R files, you might want to store the function in its own file. If you want to name the function in its own file: This is probably the best option in general, if only because you may want to put more than one function in a single file. Next, let’s make our function in the ...
1283 sym R (130 sym/3 pcs)
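The excerpt above is cut off before the code, so here is a minimal sketch of the general idea rather than the post's own example. It assumes the function is loaded with base R's source(); the helper add_one and the temporary file are hypothetical and used only to keep the sketch self-contained.

```r
# Write a hypothetical helper function to its own file.
# (A temporary file keeps this sketch self-contained; in practice
# this would be a file you keep alongside your scripts.)
fn_file <- tempfile(fileext = ".R")
writeLines(c(
  "add_one <- function(x) {",
  "  x + 1",
  "}"
), fn_file)

# In any script that needs the function, load the file with source().
source(fn_file)

# The function is now available in the current session.
result <- add_one(41)
print(result)
```

Because source() evaluates the whole file, one file can define several functions at once.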
Using Line Segments to Compare Values in R
Sometimes you want to create a graph that will allow the viewer to see at a glance: the original value of a variable, the new value of the variable, and the change between old and new. One method I like to use for this is geom_segment and geom_point in the ggplot2 package. First, let’s load ggplot2 and our data: library(ggplot2) d...
1887 sym R (842 sym/7 pcs) 4 img 2 tbl
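A minimal sketch of the segment-plus-points idea, under stated assumptions: the data frame scores and its columns are hypothetical (the post's own data is not shown in the excerpt), and the plotting step runs only if ggplot2 is installed.

```r
# Hypothetical before/after values for three groups (illustrative data only).
scores <- data.frame(
  group = c("A", "B", "C"),
  old   = c(10, 25, 40),
  new   = c(18, 20, 55)
)
scores$change <- scores$new - scores$old

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores) +
    # One segment per group, running from the old value to the new value.
    geom_segment(aes(x = old, xend = new, y = group, yend = group)) +
    # Points mark the two endpoints; colour distinguishes old from new.
    geom_point(aes(x = old, y = group, colour = "old")) +
    geom_point(aes(x = new, y = group, colour = "new")) +
    labs(x = "Value", y = NULL, colour = NULL)
  print(p)
}
```

The length and direction of each segment show the change, while the two points show the old and new values.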
GPS Basemaps in R Using get_map
There are many different maps you can use as a background map for your GPS or other latitude/longitude data (i.e. any time you’re using geom_path, geom_segment, or geom_point). get_map: Helpfully, there’s just one function that will allow you to query Google Maps, OpenStreetMap, Stamen maps, or CloudMade maps: get_map in the ggmap...
3171 sym R (751 sym/1 pcs) 20 img 10 tbl
Elevation Profiles in R
First, let’s load up our data. The data are available in a gist. You can convert your own GPS data to .csv by following the instructions here, using gpsbabel.
gps <- read.csv("callan.csv", header = TRUE)
Next, we can use the function SMA from the package TTR to calculate a moving average of the altitude or elevation data, if we...
1188 sym R (1046 sym/4 pcs) 2 img 1 tbl
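A minimal sketch of the moving-average step, under stated assumptions: the altitude values and the window width n are hypothetical, and callan.csv is not read here. If TTR is not installed, the sketch falls back to base R's stats::filter, which is a substitute and not the post's method.

```r
# Hypothetical altitude readings in metres (illustrative, not the post's data).
gps <- data.frame(altitude = c(100, 104, 103, 110, 115, 113, 120, 126))
n <- 3  # window width, chosen arbitrarily for this sketch

if (requireNamespace("TTR", quietly = TRUE)) {
  # SMA() leaves the first n - 1 positions as NA.
  gps$altitude_sma <- as.numeric(TTR::SMA(gps$altitude, n = n))
} else {
  # Base R fallback: a trailing moving average over the same window.
  gps$altitude_sma <- as.numeric(
    stats::filter(gps$altitude, rep(1 / n, n), sides = 1)
  )
}
print(gps)
```

A wider window gives a smoother profile at the cost of more missing values at the start of the track.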
Shapefiles in R
Let’s learn how to use shapefiles in R. This will allow us to map data for complicated areas or jurisdictions like ZIP codes or school districts. For the United States, many shapefiles are available from the Census Bureau. Our example will map U.S. national parks. First, download the U.S. Parks and Protected Lands shape files from Na...
1526 sym R (611 sym/4 pcs) 4 img 2 tbl
geom_point Legend with Custom Colors in ggplot
Previously, I showed how to make line segments using ggplot. Working from that example, there are only a few things we need to change to add custom colors to our plot and legend in ggplot. First, we’ll add the colors of our choice. I’ll do this using RColorBrewer, but you can choose whatever method you’d like. library(RColo...
1230 sym R (826 sym/3 pcs) 2 img 1 tbl
Date Formats in R
Importing Dates: Dates can be imported from character, numeric, POSIXlt, and POSIXct formats using the as.Date function from the base package. If your data were exported from Excel, they may be in numeric format. Otherwise, they will most likely be stored in character format. Importing Dates from Character Format: If your dates are stored a...
3070 sym R (1092 sym/6 pcs) 1 tbl
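A minimal sketch of as.Date applied to the input types named above. The sample values are illustrative. The origin "1899-12-30" for Excel serial numbers is an addition not stated in the excerpt: it is the convention for Excel on Windows, so treat it as an assumption about where your data came from.

```r
# Character dates: supply a format string unless the text is already "YYYY-MM-DD".
d1 <- as.Date("2012-08-30")
d2 <- as.Date("08/30/2012", format = "%m/%d/%Y")

# Numeric dates exported from Excel on Windows count days from 1899-12-30
# (assumption: the file came from Windows Excel).
d3 <- as.Date(41275, origin = "1899-12-30")

# POSIXct date-times drop their time-of-day component.
d4 <- as.Date(as.POSIXct("2012-08-30 14:05:00", tz = "UTC"))

print(c(d1, d2, d3, d4))
```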
Plot Weekly or Monthly Totals in R
When plotting time series data, you might want to bin the values so that each data point corresponds to the sum for a given month or week. This post will show an easy way to use cut and ggplot2’s stat_summary to plot month totals in R without needing to reorganize the data into a second data frame. Let’s start with a simple samp...
1375 sym R (1962 sym/7 pcs) 4 img 2 tbl
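A minimal sketch of binning by month with cut and summing with stat_summary, under stated assumptions: the data frame log_df is hypothetical, the tapply call is an added base R check that is not part of the post, and the fun argument assumes ggplot2 3.3.0 or later (older releases used fun.y). The plot is drawn only if ggplot2 is installed.

```r
# Hypothetical daily values spanning January to March 2013 (illustrative only).
log_df <- data.frame(
  date  = as.Date("2013-01-01") + 0:89,
  value = rep(1, 90)
)

# cut() on a Date vector labels each date with the first day of its month.
log_df$month <- as.Date(cut(log_df$date, breaks = "month"))

# Base R check of the monthly sums the plot should display.
monthly <- tapply(log_df$value, log_df$month, sum)
print(monthly)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # stat_summary() sums the values within each month at plot time,
  # so no second, pre-aggregated data frame is needed.
  p <- ggplot(log_df, aes(x = month, y = value)) +
    stat_summary(fun = sum, geom = "bar")
  print(p)
}
```

Swapping breaks = "month" for breaks = "week" bins the same data by week.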
Using colClasses to Load Data More Quickly in R
Specifying a colClasses argument to read.table or read.csv can save time on importing data, while also saving the later step of specifying a class for each variable. For example, loading an 893 MB file took 441 seconds without colClasses, but only 268 seconds with colClasses. The system.time function in base can h...
1772 sym R (705 sym/5 pcs)
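A minimal sketch of passing colClasses to read.csv and timing the load with system.time. The small CSV and its column names are hypothetical and written to a temporary file so the sketch is self-contained; on a file this small, the timing is not meaningful.

```r
# Write a small hypothetical CSV to a temporary file.
csv_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3, name = c("a", "b", "c"), score = c(1.5, 2.5, 3.5)),
  csv_file,
  row.names = FALSE
)

# Declare each column's class up front so read.csv does not have to guess.
classes <- c(id = "integer", name = "character", score = "numeric")

# system.time() reports how long the enclosed expression takes to run.
timing <- system.time(
  dat <- read.csv(csv_file, colClasses = classes)
)
print(timing)
str(dat)
```

Running the same call without colClasses inside system.time gives the comparison point.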
Only Load Data If Not Already Open in R
I often find it beneficial to check to see whether or not a dataset is already loaded into R at the beginning of a file. This is particularly helpful when I’m dealing with a large file that I don’t want to load repeatedly, and when I might be using the same dataset with multiple R scripts or re-running the same script while making...
1280 sym R (274 sym/2 pcs)
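The excerpt is cut off before the code, so this is a minimal sketch of one way to do the check, assuming base R's exists() is used. The object name big_data and the loader load_big_data are hypothetical stand-ins for a slow read.csv call.

```r
# Hypothetical loader standing in for a slow read.csv() call.
load_big_data <- function() {
  message("Loading data...")
  data.frame(x = 1:5)
}

# Load only when no object named "big_data" exists yet in the session.
if (!exists("big_data")) {
  big_data <- load_big_data()
}

# Running the same check again skips the load, because the object now exists.
if (!exists("big_data")) {
  big_data <- load_big_data()
}
```

Note that exists() checks only the name, so a stale or unrelated object with the same name would also skip the load.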