GeoPlant Dataset

The GeoPlant dataset comprises Species Observation data (i.e., Presence-Only (PO) occurrences and Presence-Absence (PA) surveys) and a wide set of Environmental Predictors. It covers 38 European countries and 8 major biogeographic regions (e.g., Alpine, Atlantic, and Boreal).

For each species observation, we provide:

  • Diverse environmental rasters (e.g., elevation, human footprint, land use, soil)
  • Sentinel-2 RGB and near-infrared (NIR) satellite images (128×128 pixels at 10 m resolution)
  • A 20-year time series of climatic variables
  • Satellite time-series point values for six bands (R, G, B, NIR, SWIR1, SWIR2) from Landsat

For a detailed description of all predictor modalities, see the Predictors & Modalities page.

Figure 1. Geospatial scale of the dataset. Presence-Only (PO) data spans all of habitable Europe, while Presence-Absence (PA) training and test sites are located primarily in France, Denmark, Switzerland, and Czechia.


Species Observation Data

The dataset contains about 5 million PO occurrences (5,079,797 records) and 93,703 PA surveys.

PO data covers most of Europe but is sampled opportunistically, without a standardized protocol, which leads to various biases. In particular, observing one species at a location does not imply that other species are absent there. PA surveys are conducted by experts and provide much more reliable information.

Presence-Absence (PA) surveys

A PA survey is an expert inventory of all plant species in a given plot (10–400 m²). Because the inventory is exhaustive, any species not recorded in the plot is likely to be truly absent.
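As a minimal sketch of how this property might be used, the snippet below turns survey records into a binary presence-absence matrix in which every unrecorded species is encoded as 0. The survey IDs, species IDs, and the function name `build_pa_matrix` are hypothetical and do not reflect the actual GeoPlant file layout.

```python
def build_pa_matrix(records):
    """Turn (survey_id, species_id) pairs into a binary survey x species matrix.

    Species not listed for a survey are encoded as 0 (treated as absent),
    which is only reasonable for exhaustive PA inventories, not for PO data.
    """
    surveys = sorted({survey for survey, _ in records})
    species = sorted({sp for _, sp in records})
    row = {survey: i for i, survey in enumerate(surveys)}
    col = {sp: j for j, sp in enumerate(species)}

    matrix = [[0] * len(species) for _ in surveys]
    for survey, sp in records:
        matrix[row[survey]][col[sp]] = 1
    return surveys, species, matrix


# Hypothetical toy records, not real GeoPlant data.
toy_records = [(101, "sp_a"), (101, "sp_b"), (102, "sp_b"), (103, "sp_c")]
surveys, species, pa = build_pa_matrix(toy_records)
```

Each row of `pa` is then a multi-label target for one plot. The same encoding would be misleading for PO occurrences, where a 0 only means "not reported".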

  • Source: 29 datasets hosted in the European Vegetation Archive (EVA)
  • Size: 93,703 surveys covering 5,016 species (≈ half of the European flora)
  • Imbalance: Most species are rarely observed in PA surveys.
  • Train/Test Splits: 95%/5% using spatial block hold-out (10×10 km grid) to balance biogeographical regions.
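The idea behind a spatial block hold-out can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the official split: it assumes survey coordinates are already in a projected system with metre units, it holds out roughly 5% of the 10×10 km blocks at random, and it does not implement the balancing across biogeographical regions. The function names are hypothetical.

```python
import math
import random

BLOCK_SIZE_M = 10_000  # 10 km grid cell, assuming coordinates in metres


def block_id(x_m, y_m, size=BLOCK_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(x_m / size), math.floor(y_m / size))


def spatial_block_split(points, test_fraction=0.05, seed=0):
    """Split surveys so that no grid block contributes to both train and test.

    points: list of (survey_id, x_m, y_m) tuples.
    Returns (train_ids, test_ids).
    """
    blocks = {}
    for survey_id, x, y in points:
        blocks.setdefault(block_id(x, y), []).append(survey_id)

    # Shuffle blocks (not individual surveys) and hold out a fraction of them.
    block_keys = sorted(blocks)
    random.Random(seed).shuffle(block_keys)
    n_test = max(1, round(test_fraction * len(block_keys)))
    test_blocks = set(block_keys[:n_test])

    train_ids, test_ids = [], []
    for key, ids in blocks.items():
        (test_ids if key in test_blocks else train_ids).extend(ids)
    return train_ids, test_ids
```

Holding out whole blocks keeps nearby, spatially autocorrelated plots on the same side of the split. Note that 5% of blocks is not necessarily 5% of surveys when plots are unevenly distributed.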

Presence-Only (PO) occurrences

A PO occurrence is a geolocated species observation with an unknown sampling protocol, so it provides no information on species absences. Sampling effort is highly heterogeneous in space, in time, and across species: most PO records come from citizen science, are concentrated in accessible or populated areas, and focus on charismatic or easily identified species. Nevertheless, PO data helps compensate for gaps in PA survey coverage, provided that models control for sampling bias.

  • Size: ~5 million records for 9,709 plant species (2017–2021)
  • Source: 13 pre-selected datasets from GBIF

Table 1. Presence-Only dataset sources. Selected GBIF datasets cover 38 European countries. "Uniq. species" is the number of species that occur in a given dataset and in none of the others.

GBIF Dataset Name | Records | Species | Uniq. species
Pl@ntNet Observations + Pl@ntNet Occurrences | 2,298,884 | 4,631 | 295
Danmarks Miljøportals Naturdatabase | 691,313 | 1,457 | 14
iNaturalist Research-grade Observations | 625,681 | 7,496 | 1,754
Norwegian Species Observation Service | 601,101 | 2,243 | 167
Observation.org | 241,205 | 5,108 | 429
Non-native plant occurrences in Flanders/Brussels | 178,544 | 1,464 | 134
Artportalen (Swedish Species Observation System) | 163,513 | 2,771 | 464
National Plant Monitoring Scheme U.K. | 120,413 | 1,109 | 11
Vascular plant records via iRecord | 103,213 | 2,179 | 99
Swiss National Databank of Vascular Plants | 49,173 | 58 | 2
Invazivke - Invasive Alien Species in Slovenia | 4,171 | 60 | 1
Masaryk University - Herbarium BRNU | 2,586 | 1,321 | 122
GeoPlant PO data (Combined) | 5,079,797 | 9,709 | ---
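To make the "Uniq. species" column concrete, here is a small sketch of how such a count could be computed with set operations. It reads the column as "species found in this dataset and in no other", and uses hypothetical dataset and species names rather than the real GBIF records.

```python
def unique_species_counts(species_by_dataset):
    """Count, per dataset, the species that appear in no other dataset."""
    counts = {}
    for name, species in species_by_dataset.items():
        # Union of the species sets of every *other* dataset.
        others = set().union(
            *(s for n, s in species_by_dataset.items() if n != name)
        )
        counts[name] = len(species - others)
    return counts


# Hypothetical toy input, not real GeoPlant data.
toy = {
    "dataset_a": {"sp1", "sp2", "sp3"},
    "dataset_b": {"sp2", "sp4"},
    "dataset_c": {"sp3"},
}
uniq = unique_species_counts(toy)
combined = len(set().union(*toy.values()))  # species in the merged data
```

Because species are shared between sources, the per-dataset species counts do not add up to the combined total, whereas the record counts do.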

Environmental Predictors

The environmental predictor data are crucial for modeling.
Each observation (PO or PA) is accompanied by:

  • A 4-band 128×128 satellite image at 10 m resolution (Sentinel-2)
  • Time series of 6 satellite bands (Landsat; R, G, B, NIR, SWIR1, SWIR2; 1999–2020)
  • Environmental rasters at European scale: climate, soil, elevation, land use, human footprint
  • Monthly climatic rasters (CHELSA; 4 variables, 2000–2019)
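The figures in the list above imply the following per-observation sizes. This is a worked calculation only: the channel-first array layout and the constant names are assumptions, and the Landsat series is left out because its temporal step is not stated on this page.

```python
# Sentinel-2 patch: 4 bands (R, G, B, NIR), 128 x 128 pixels at 10 m.
PATCH_SIZE_PX = 128
PIXEL_SIZE_M = 10
N_IMAGE_BANDS = 4

footprint_m = PATCH_SIZE_PX * PIXEL_SIZE_M  # ground extent of one patch side
image_shape = (N_IMAGE_BANDS, PATCH_SIZE_PX, PATCH_SIZE_PX)  # assumed layout

# Monthly climate rasters: 4 variables, 2000-2019 inclusive.
N_CLIMATE_VARS = 4
climate_years = range(2000, 2020)
n_months = len(climate_years) * 12
climate_shape = (N_CLIMATE_VARS, n_months)  # assumed layout

print(f"patch footprint: {footprint_m} m x {footprint_m} m")
print(f"image array shape: {image_shape}")
print(f"climate series shape: {climate_shape}")
```

Each image patch therefore covers 1.28 km × 1.28 km around the observation, and the monthly climate series has 240 time steps per variable.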

For full details and variable lists, see the Predictors & Modalities page.


For data download and file structure, see the Resources page.
Please cite the GeoPlant paper if you use or redistribute this dataset.