Publications by A.M. Barbosa

Delimiting the modelling background for scattered uneven occurrence data

31.10.2024

In species distribution modelling and ecological niche modelling (SDM & ENM), the region from which background or pseudoabsence points are picked is key to how well a model turns out. This region should include sufficient localities for the model to assess the species’ (non-)preferences, but it should also be within the species’ reach AND reaso...

3204 sym Python (1758 sym/3 pcs) 6 img

Getting marine polygon maps in R

05.02.2024

Another frequent question from my students is how to obtain a polygon map of the seas and oceans, rather than the land polygons (countries, etc.) that are commonly imported with R spatial data packages. You can mostly just use the land polygons and do the opposite of the operation you would use for terrestrial features — e.g., to colour the sea in a ma...

1750 sym Python (1111 sym/2 pcs) 4 img

Extract raster values to points with bilinear interpolation

26.01.2024

A student recently asked me how exactly the R terra::extract() function worked when using method="bilinear" to get raster values for points. The help file rightly says that ‘With “bilinear” the returned values are interpolated from the values of the four nearest raster cells’, but this wasn’t immediately clear without a visual example. So...

1416 sym R (1198 sym/4 pcs) 4 img
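
To make the "four nearest raster cells" idea concrete, here is a minimal base-R sketch of textbook bilinear interpolation between four cell centres. The function name `bilinear` and its argument names are hypothetical, chosen for illustration; this is not terra's internal code, and the post itself may illustrate the idea differently.

```r
# Textbook bilinear interpolation between four cell centres.
# Hypothetical helper for illustration; not terra's implementation.
bilinear <- function(x, y, x1, x2, y1, y2, v11, v21, v12, v22) {
  # v11 = value at (x1, y1), v21 at (x2, y1), v12 at (x1, y2), v22 at (x2, y2)
  tx <- (x - x1) / (x2 - x1)  # relative position along x, from 0 to 1
  ty <- (y - y1) / (y2 - y1)  # relative position along y, from 0 to 1
  bottom <- v11 * (1 - tx) + v21 * tx  # interpolate along x at y1
  top    <- v12 * (1 - tx) + v22 * tx  # interpolate along x at y2
  bottom * (1 - ty) + top * ty         # then interpolate along y
}

# A point a quarter of the way along x and halfway along y (made-up values)
val <- bilinear(x = 0.25, y = 0.5,
                x1 = 0, x2 = 1, y1 = 0, y2 = 1,
                v11 = 10, v21 = 20, v12 = 30, v22 = 40)
print(val)
```

The closer the point lies to a given cell centre, the more weight that cell's value receives.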

Actual pixel sizes of unprojected raster maps

23.01.2024

It is well known, though often dismissed, that the areas of spatial units (cells, pixels) based on unprojected coordinates (longitude-latitude degrees, arc-minutes or arc-seconds) are wildly inconsistent across the globe. Towards the poles, as the longitude meridians approach each other, the actual ground width of the pixels sharply decreases. So, ...

1911 sym R (1513 sym/2 pcs) 2 img
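
As a rough illustration of how pixel width shrinks towards the poles, the sketch below computes the east-west ground width of a cell of a given angular resolution. It assumes a spherical Earth with a mean radius of 6371 km, which is a simplification; the function name `cell_width_km` is hypothetical and this is not necessarily how the post computes pixel sizes.

```r
# Approximate east-west ground width of a longitude-latitude cell.
# Assumption: spherical Earth with mean radius 6371 km.
earth_radius_km <- 6371

cell_width_km <- function(lat_deg, res_deg = 1) {
  res_rad <- res_deg * pi / 180  # angular cell size in radians
  # arc length along a parallel shrinks with the cosine of the latitude
  earth_radius_km * res_rad * cos(lat_deg * pi / 180)
}

lats <- c(0, 30, 60, 80)
widths <- cell_width_km(lats)
print(data.frame(latitude = lats, width_km = round(widths, 1)))
```

Under this approximation, a cell at 60 degrees latitude is half as wide as the same cell at the equator, while its north-south height stays the same.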

Weighted probability vs. favourability

11.07.2023

Presence probability, typically obtained with presence-(pseudo)absence modelling methods like GLM, GAM, GBM or Random Forest, is conditional not only on the suitability of the environmental conditions, but also on the general prevalence (proportion of presences) of the species in the study area. So, a species with few presences will generally have ...

3261 sym R (1886 sym/2 pcs) 4 img
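
For readers unfamiliar with favourability, the sketch below shows one common way to remove the effect of prevalence from presence probability, using the favourability function of Real et al. (2006): F = (P / (1 - P)) / (n1 / n0 + P / (1 - P)), where n1 and n0 are the numbers of presences and absences. The function name and the numbers are made up for illustration; the post may compute it differently, for example with a dedicated package.

```r
# Favourability from presence probability (Real et al. 2006).
# Hypothetical helper; p = predicted probability,
# n1 = number of presences, n0 = number of absences in the modelled data.
favourability <- function(p, n1, n0) {
  odds <- p / (1 - p)      # probability expressed as odds
  odds / (n1 / n0 + odds)  # rescale relative to the prevalence odds
}

n1 <- 10  # made-up counts: a species with 10% prevalence
n0 <- 90
p <- c(0.05, 0.10, 0.50)
fav <- favourability(p, n1, n0)
print(round(fav, 3))
```

A probability equal to the species' prevalence (here 0.10) maps to a favourability of 0.5, which is what makes values comparable across species with different prevalences.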

Removing absences from GBIF datasets

17.04.2023

I often come across GBIF users who are unaware that the records available for a given taxon are not necessarily all presences: there’s a column named “occurrenceStatus” whose value can be “PRESENT” or “ABSENT”! The absence records can, of course, be removed with simple operations in R or even omitted from the download, but many users ...

1546 sym R (836 sym/1 pcs) 2 img
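
One of the "simple operations in R" for dropping absence records could look like the base-R sketch below. The toy data frame is made up for illustration, with column names following the GBIF download format; how records with a missing status should be handled is a judgement call, and here they are dropped too.

```r
# Toy occurrence table (made-up records, GBIF-style column names)
occ <- data.frame(
  species = c("sp A", "sp A", "sp A", "sp A"),
  decimalLongitude = c(-8.1, -7.9, -8.4, -8.0),
  decimalLatitude = c(39.2, 38.7, 40.1, 39.5),
  occurrenceStatus = c("PRESENT", "ABSENT", "PRESENT", NA)
)

# Keep only rows explicitly marked as presences.
# %in% returns FALSE for NA, so records with missing status are also dropped
# (an assumption of this sketch; adapt if you prefer to keep them).
presences <- occ[occ$occurrenceStatus %in% "PRESENT", ]
print(nrow(presences))
```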

Getting continent, mainland and island maps in R

08.03.2023

Maps of continents, mainlands and islands can be useful, for example, for selecting areas — and then cropping or masking variables — for modelling a species’ distribution. Here’s a way to obtain such maps using the ‘geodata’ and ‘terra’ R packages: # load required packages: library(terra) library(geodata) # import a world countries...

840 sym R (1959 sym/5 pcs) 10 img

Safe-and-simple cleaning of species occurrences

25.01.2023

In my species distribution modelling courses, for a quick and safe removal of the most common biodiversity database errors, I’ve so far used functions from the scrubr R package, namely ‘coord_incomplete’, ‘coord_impossible’, ‘coord_unlikely’, ‘coord_imprecise’ and ‘coord_uncertain’. There are other R packages for species occ...

2290 sym R (772 sym/1 pcs) 2 img

Lollipop chart

05.01.2023

According to modern recommendations in data viz, lollipop charts are generally a better alternative to bar charts, as they reduce the visual distortion caused by the length of the bars, making it easier to compare the values. So, in the next versions of the ‘modEvA’ and ‘fuzzySim’ packages, functions that produce bar plots will instead (b...

1236 sym R (2247 sym/3 pcs) 4 img

Model evaluation with presence points and raster predictions

06.05.2022

The Boyce index (Boyce et al. 2002) is often described as a presence-only metric for evaluating the predictions of species distribution (or ecological niche, or habitat suitability) models (e.g. Hirzel et al. 2006, Cianfrani et al. 2010, Bellard et al. 2013, Valavi et al. 2022). It measures the predicted-to-expected ratio of presences in each cla...

4459 sym R (1632 sym/2 pcs) 4 img