import os
os.environ["AWS_NO_SIGN_REQUEST"] = "TRUE"  # anonymous access for public buckets
os.environ["AWS_S3_ENDPOINT"] = "minio.carlboettiger.info"
os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE"
About the Biodiversity Dataset Catalog
Overview
Our Biodiversity Dataset Catalog seeks to make a range of biodiversity data more readily findable, accessible, and interoperable (see FAIR principles), in a manner consistent with data sovereignty and ethics (CARE principles). Data products listed here support access with portable, cloud-native, high-performance protocols that avoid the need to download entire data files. When existing data providers already meet these objectives, those sources are simply linked from the catalog. Other sources are mirrored on a local object store running MinIO, an open source software platform that supports the widely used S3 API for cloud-native operations. Access to mirrored content may require access keys to meet the licensing and redistribution policies of certain data providers. The entire setup, including catalog and object store, can be easily replicated on commodity hardware using only open source tools.
Data Access
Large geospatial raster and vector data are best accessed with ‘cloud-native’ protocols such as the virtual filesystem in GDAL, a popular C/C++ library that powers many open source geospatial applications. This approach obviates the need to download and store very large data files. It works anywhere an internet connection is available, though performance is best on a high-bandwidth or local area network.
Account keys and settings
Access keys are required for data that is mirrored on the DSE cloud object store and for which licensing requirements prohibit redistribution. Enter your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the ~/.aws/credentials file or as environment variables. Some objects mirrored to the DSE object store are not subject to licenses preventing redistribution and are made available in public buckets. These can be accessed with empty key/secret credentials, or by setting AWS_NO_SIGN_REQUEST=TRUE.
To let the software know that data is hosted not on AWS but at an independent object store, we must specify the domain of the new location in the AWS_S3_ENDPOINT environment variable, and indicate that our system does not use ‘virtual hosting’ style paths by setting AWS_VIRTUAL_HOSTING=FALSE. Environment variables can be set in ~/.bashrc (most platforms) or in an .Renviron file (for RStudio users), though for clarity we will show how they can be set directly in Python or R. See the GDAL VFS documentation for details.
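The settings described above can be collected into one small Python helper. This is a minimal sketch, not part of any library: the function name `configure_object_store` and its parameters are illustrative assumptions, and the endpoint is the one used elsewhere on this page.

```python
import os


def configure_object_store(endpoint, key_id=None, secret=None):
    """Set the environment variables GDAL reads for S3-compatible stores.

    Hypothetical helper for illustration: pass both key_id and secret for
    restricted buckets, or neither for anonymous access to public buckets.
    """
    if (key_id is None) != (secret is None):
        raise ValueError("provide both key_id and secret, or neither")

    # Point GDAL at the independent object store rather than AWS.
    os.environ["AWS_S3_ENDPOINT"] = endpoint
    # Use path-style addressing (endpoint/bucket/key), not bucket.endpoint/key.
    os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE"

    if key_id is None:
        # Public buckets: do not sign requests.
        os.environ["AWS_NO_SIGN_REQUEST"] = "TRUE"
    else:
        # Restricted buckets: sign requests with the supplied keys.
        os.environ["AWS_NO_SIGN_REQUEST"] = "FALSE"
        os.environ["AWS_ACCESS_KEY_ID"] = key_id
        os.environ["AWS_SECRET_ACCESS_KEY"] = secret


# Anonymous access to the public buckets on the catalog's object store.
configure_object_store("minio.carlboettiger.info")
```

It is probably safest to call such a helper before opening any dataset, so that the settings are in place when GDAL first connects.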
Some data linked in this catalog is already available from public cloud-based object stores such as Azure, Google Storage, or AWS. Note that the former two require their own virtual filesystem prefixes, /vsiaz/ and /vsigs/ respectively, as described in the docs.
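To illustrate how the prefixes differ by provider, here is a small sketch that assembles a GDAL VFS path from its parts. The helper `build_vfs_path` and the provider labels used as dictionary keys are hypothetical names chosen for this example; only the prefixes themselves come from GDAL.

```python
# GDAL virtual filesystem prefixes for common cloud object stores.
VFS_PREFIXES = {
    "s3": "/vsis3/",     # AWS S3 and S3-compatible stores such as MinIO
    "azure": "/vsiaz/",  # Azure Blob Storage (the "bucket" is a container)
    "gcs": "/vsigs/",    # Google Cloud Storage
}


def build_vfs_path(provider, bucket, key):
    """Build a GDAL VFS path from a provider label, bucket, and object key."""
    try:
        prefix = VFS_PREFIXES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    # Drop any leading slash on the key to avoid a doubled separator.
    return f"{prefix}{bucket}/{key.lstrip('/')}"


print(build_vfs_path("s3", "public-biodiversity",
                     "carbon/cogs/irrecoverable_c_total_2018.tif"))
# prints /vsis3/public-biodiversity/carbon/cogs/irrecoverable_c_total_2018.tif
```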
Python
We can then read in raster or vector files using the GDAL virtual filesystem (VFS) URLs given in the catalog (click the button to copy the URL to your clipboard):
import rioxarray
vfs_url = "/vsis3/public-biodiversity/carbon/cogs/irrecoverable_c_total_2018.tif"
r = rioxarray.open_rasterio(vfs_url)
R
This process is nearly identical using any of the popular spatial frameworks in R, e.g. terra, stars, or sf.
Sys.setenv("AWS_NO_SIGN_REQUEST"="TRUE") # set FALSE when password is required
Sys.setenv("AWS_S3_ENDPOINT"="minio.carlboettiger.info")
Sys.setenv("AWS_VIRTUAL_HOSTING"="FALSE")
library(terra)
terra 1.8.5
vfs_url = "/vsis3/public-biodiversity/carbon/cogs/irrecoverable_c_total_2018.tif"
r = rast(vfs_url)
GDAL and lazy evaluation
It is worth becoming familiar with how your preferred client platform supports GDAL’s lazy evaluation mechanisms. These allow you to subset, crop, resample, and transform your data over the remote network connection, which can improve computational speed and save disk and RAM space. Illustrative examples of this in R and Python are provided in the example notebooks on this site.
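As one possible illustration of lazy subsetting in Python, the sketch below opens a remote raster with rioxarray and clips it to a bounding box before any pixel values are requested. The function name `read_subset` is a hypothetical helper for this example. It assumes rioxarray is installed and that the environment variables for the object store have already been set.

```python
def read_subset(vfs_url, bbox, crs=None):
    """Open a remote raster and clip it to bbox = (minx, miny, maxx, maxy).

    If crs is None, the bbox is interpreted in the raster's own coordinate
    reference system; otherwise pass the CRS of the bbox, e.g. "EPSG:4326".
    """
    minx, miny, maxx, maxy = bbox
    if minx >= maxx or miny >= maxy:
        raise ValueError("bbox must satisfy minx < maxx and miny < maxy")

    # Imported here so the bbox check above runs even without rioxarray.
    import rioxarray

    # Opening reads metadata; pixel values are not loaded at this point.
    r = rioxarray.open_rasterio(vfs_url)
    # Clip by coordinates so that only the requested window needs reading.
    return r.rio.clip_box(minx=minx, miny=miny, maxx=maxx, maxy=maxy, crs=crs)


# Example call (requires network access; the bounding box is an arbitrary
# illustrative region given in longitude/latitude):
# subset = read_subset(
#     "/vsis3/public-biodiversity/carbon/cogs/irrecoverable_c_total_2018.tif",
#     (-125.0, 32.0, -114.0, 42.0),
#     crs="EPSG:4326",
# )
```

Validating the bounding box before touching the network keeps obvious mistakes cheap to catch. The example notebooks on this site remain the reference for fuller workflows.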
Categories of Data Access
Datasets included in this catalog fall into several different categories of access:
1. Data that is publicly available from a high-bandwidth source in an optimized format.
2. Data that is freely available under permissive licenses, but only from low-bandwidth hosts or in non-optimized formats. Data shared on scientific data repositories such as Zenodo usually falls in this category.
3. Data that is freely available but under restricted use, typically requiring a request (automated or manual). Such distributors often also have the low-bandwidth distribution and formats of category 2.
4. Data that is shared only privately or under explicit non-disclosure contracts.
5. Sensitive or sovereign data which can only be stored in approved geographic regions or accessed only from secure systems.
This catalog seeks to help research teams bridge across these diverse types and needs in biodiversity data while respecting necessary restrictions of each category using distributed and open source tooling.