Publications by A.M. Barbosa
Delimiting the modelling background for scattered uneven occurrence data
In species distribution modelling and ecological niche modelling (SDM & ENM), the region from which background or pseudoabsence points are drawn is key to how well a model performs. This region should include sufficient localities for the model to assess the species’ (non-)preferences, but it should also be within the species’ reach AND reaso...
Getting marine polygon maps in R
Another question my students frequently ask is how to obtain a polygon map of the seas and oceans, rather than the land polygons (countries, etc.) that are commonly imported with R spatial data packages. You can mostly just use the land polygons and do the opposite of what you would do for terrestrial features, e.g., to colour the sea in a ma...
Extract raster values to points with bilinear interpolation
A student recently asked me how exactly the R terra::extract() function worked when using method="bilinear" to get raster values for points. The help file rightly says that ‘With “bilinear” the returned values are interpolated from the values of the four nearest raster cells’, but this wasn’t immediately clear without a visual example. So...
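As a rough illustration of what "interpolated from the values of the four nearest raster cells" means, here is a minimal base R sketch of textbook bilinear interpolation. It is not terra's internal code: the helper name `bilinear()`, its arguments and the cell values are all hypothetical, and the four points are assumed to be the centres of the neighbouring cells.

```r
# Hypothetical helper (NOT part of terra): textbook bilinear interpolation
# at point (x, y) from four neighbouring cell centres:
# (x1, y1) = q11, (x2, y1) = q21, (x1, y2) = q12, (x2, y2) = q22
bilinear <- function(x, y, x1, x2, y1, y2, q11, q21, q12, q22) {
  # relative position of the point between the cell centres (0 to 1)
  tx <- (x - x1) / (x2 - x1)
  ty <- (y - y1) / (y2 - y1)
  # interpolate along x within each row, then along y between the two rows
  bottom <- q11 * (1 - tx) + q21 * tx
  top <- q12 * (1 - tx) + q22 * tx
  bottom * (1 - ty) + top * ty
}

# made-up values for four cell centres on a unit square
val <- bilinear(x = 0.25, y = 0.5, x1 = 0, x2 = 1, y1 = 0, y2 = 1,
                q11 = 10, q21 = 20, q12 = 30, q22 = 40)
print(val)  # 22.5
```

A point sitting exactly on a cell centre gets that cell's value back, and the closer a point is to a given centre, the more that cell weighs in the result.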
Actual pixel sizes of unprojected raster maps
It is well known, though often dismissed, that the areas of spatial units (cells, pixels) based on unprojected coordinates (longitude-latitude degrees, arc-minutes or arc-seconds) are wildly inconsistent across the globe. Towards the poles, as the longitude meridians approach each other, the actual ground width of the pixels sharply decreases. So, ...
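The narrowing of pixels towards the poles can be approximated with a little trigonometry. The sketch below assumes a perfectly spherical Earth with a radius of 6371 km, which is a simplification; the function name `cell_width_km()` and its arguments are hypothetical and only serve this illustration.

```r
# Approximate ground width (km) of a cell spanning `res_deg` degrees of
# longitude at latitude `lat_deg`, ASSUMING a spherical Earth (radius 6371 km)
cell_width_km <- function(lat_deg, res_deg = 1, radius_km = 6371) {
  # length of one degree of longitude at the equator
  deg_km_equator <- 2 * pi * radius_km / 360
  # parallels shrink with the cosine of latitude
  res_deg * deg_km_equator * cos(lat_deg * pi / 180)
}

lats <- c(0, 30, 60, 80)
widths <- cell_width_km(lats)
print(data.frame(latitude = lats, width_km = round(widths, 1)))
```

Under this approximation, a cell at 60 degrees of latitude is half as wide on the ground as a cell of the same nominal resolution at the equator.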
Weighted probability vs. favourability
Presence probability, typically obtained with presence-(pseudo)absence modelling methods like GLM, GAM, GBM or Random Forest, is conditional not only on the suitability of the environmental conditions, but also on the general prevalence (proportion of presences) of the species in the study area. So, a species with few presences will generally have ...
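To make the effect of prevalence concrete, the sketch below applies the favourability formula commonly attributed to Real et al. (2006), F = [P/(1-P)] / [n1/n0 + P/(1-P)], where n1 and n0 are the numbers of presences and absences. The function name `favourability()` and the example numbers are hypothetical; this is not necessarily the code used in the post.

```r
# Hypothetical helper: favourability from presence probability `p`,
# given the numbers of presences (n1) and absences (n0) in the modelled data
favourability <- function(p, n1, n0) {
  odds <- p / (1 - p)
  odds / (n1 / n0 + odds)
}

# the same probability (0.2) for a rare and for a common species (made-up counts)
fav_rare <- favourability(0.2, n1 = 20, n0 = 180)     # prevalence 0.1
fav_common <- favourability(0.2, n1 = 100, n0 = 100)  # prevalence 0.5
print(c(rare = fav_rare, common = fav_common))

# when probability equals prevalence, favourability is 0.5
fav_neutral <- favourability(0.1, n1 = 20, n0 = 180)
print(fav_neutral)
```

The same probability of 0.2 thus indicates favourable conditions for the rare species (above 0.5) but unfavourable ones for the common species.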
Removing absences from GBIF datasets
I often come across GBIF users who are unaware that the records available for a given taxon are not necessarily all presences: there’s a column named “occurrenceStatus” whose value can be “PRESENT” or “ABSENT”! The absence records can, of course, be removed with simple operations in R or even omitted from the download, but many users ...
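One of the "simple operations in R" could look like the base R sketch below. The data frame is entirely made up for illustration (placeholder species name and coordinates); only the column names follow those used in GBIF downloads. Records with a missing status are kept here, which is an assumption you may want to revisit for your own data.

```r
# made-up occurrence table mimicking the relevant GBIF columns
occ <- data.frame(
  species = rep("Genus species", 4),  # placeholder name
  decimalLongitude = c(-8.1, -7.5, -9.0, -8.6),
  decimalLatitude = c(39.2, 40.1, 38.7, 41.0),
  occurrenceStatus = c("PRESENT", "ABSENT", "PRESENT", "absent")
)

# keep only records not flagged as absences (case-insensitive);
# rows with NA status are kept, as %in% returns FALSE for NA
presences <- occ[!(toupper(occ$occurrenceStatus) %in% "ABSENT"), ]
print(nrow(presences))  # 2
```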
Getting continent, mainland and island maps in R
Maps of continents, mainlands and islands can be useful, for example, for selecting areas (and then cropping or masking variables) for modelling a species’ distribution. Here’s a way to obtain such maps using the ‘geodata’ and ‘terra’ R packages:
# load required packages:
library(terra)
library(geodata)
# import a world countries...
Safe-and-simple cleaning of species occurrences
In my species distribution modelling courses, for a quick and safe removal of the most common biodiversity database errors, I’ve so far used functions from the scrubr R package, namely ‘coord_incomplete’, ‘coord_impossible’, ‘coord_unlikely’, ‘coord_imprecise’ and ‘coord_uncertain’. There are other R packages for species occ...
Lollipop chart
According to modern recommendations in data viz, lollipop charts are generally a better alternative to bar charts, as they reduce the visual distortion caused by the length of the bars, making it easier to compare the values. So, in the next versions of the ‘modEvA’ and ‘fuzzySim’ packages, functions that produce bar plots will instead (b...
Model evaluation with presence points and raster predictions
The Boyce index (Boyce et al. 2002) is often described as a presence-only metric for evaluating the predictions of species distribution (or ecological niche, or habitat suitability) models (e.g. Hirzel et al. 2006, Cianfrani et al. 2010, Bellard et al. 2013, Valavi et al. 2022). It measures the predicted-to-expected ratio of presences in each cla...