This project lets you serve a dataset through an EDR API, provided the dataset is CF-compliant.
The project is based on xarray and therefore supports the formats available through xarray. Both zarr and NetCDF are supported: zarr through an S3 interface or as a local file, NetCDF only as a local file.
You configure the service to read a given dataset file and its parameters. Provided sufficient CF convention metadata is available, the service can query the dataset by location and time and serve the data through an EDR API.
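As a rough illustration of querying by location and time, the sketch below builds a position query URL in the form defined by the OGC API - EDR standard. It assumes the service exposes the position query for the collection and runs on uvicorn's default port 8000; the parameter name air_temperature and the coordinates are hypothetical placeholders.

```python
from urllib.parse import urlencode


def position_query_url(base_url, collection_id, lon, lat, start, end, parameters):
    """Build an EDR position query URL for one point and a time interval."""
    query = urlencode(
        {
            "coords": f"POINT({lon} {lat})",  # WKT point, longitude first
            "datetime": f"{start}/{end}",  # ISO 8601 interval
            "parameter-name": ",".join(parameters),
        }
    )
    return f"{base_url.rstrip('/')}/collections/{collection_id}/position?{query}"


# Hypothetical values: adjust host, collection and parameter to your setup.
url = position_query_url(
    "http://localhost:8000",
    "test_time_series",
    10.75,
    59.91,
    "2024-01-01T00:00:00Z",
    "2024-01-02T00:00:00Z",
    ["air_temperature"],
)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client once the service is running.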
Dependencies can be installed with pip:
python3.10 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
The app is configured through a YAML file. For an example YAML file, look at tests/data/test_edr_config.yaml.
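Before starting the service, it can help to sanity-check a config. The sketch below is not the service's own validation; it only checks three rules stated in the config description in this README (unique collection_id, engine being local or s3, data_type being grid or time_series). It works on an already parsed mapping, assuming you load the YAML file yourself, and check_config is a hypothetical helper name.

```python
def check_config(config):
    """Return a list of problems found in a parsed config mapping."""
    problems = []
    seen_ids = set()
    for index, collection in enumerate(config.get("collections", [])):
        collection_id = collection.get("collection_id")
        if collection_id is None:
            problems.append(f"collection #{index}: missing collection_id")
        elif collection_id in seen_ids:
            problems.append(f"duplicate collection_id: {collection_id}")
        else:
            seen_ids.add(collection_id)

        engine = collection.get("data_source", {}).get("engine")
        if engine not in ("local", "s3"):
            problems.append(f"{collection_id}: engine must be local or s3, got {engine!r}")

        data_type = collection.get("data_config", {}).get("data_type")
        if data_type not in ("grid", "time_series"):
            problems.append(f"{collection_id}: data_type must be grid or time_series, got {data_type!r}")
    return problems


# A deliberately broken example: duplicate id and an unsupported engine.
example = {
    "collections": [
        {
            "collection_id": "test_time_series",
            "data_source": {"engine": "local", "file_path": "tests/data/test_time_series_wgs84.zarr"},
            "data_config": {"data_type": "time_series"},
        },
        {
            "collection_id": "test_time_series",
            "data_source": {"engine": "ftp"},
            "data_config": {"data_type": "grid"},
        },
    ]
}
problems = check_config(example)
for problem in problems:
    print(problem)
```

An empty list means none of these three rules were violated; it does not guarantee that the service will accept the file.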
Set the environment variable EDR_SERVICE_CF_CONFIG to the path of your config file. Then run the following command in the root directory:
uvicorn edr_service_cf.main:app
Use the --reload option if you are doing active development.
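The same startup steps can be scripted. The following is a minimal sketch, assuming uvicorn is installed in the active environment and the script is run from the repository root; build_launch and launch are hypothetical helper names, not part of the project.

```python
import os
import subprocess
import sys


def build_launch(config_path, reload=False):
    """Return (command, environment) for starting the service with uvicorn."""
    env = dict(os.environ)
    env["EDR_SERVICE_CF_CONFIG"] = str(config_path)
    # "python -m uvicorn" is equivalent to calling the uvicorn executable.
    command = [sys.executable, "-m", "uvicorn", "edr_service_cf.main:app"]
    if reload:
        command.append("--reload")
    return command, env


def launch(config_path, reload=False):
    """Start the server and block until it exits."""
    command, env = build_launch(config_path, reload)
    subprocess.run(command, env=env, check=True)


command, env = build_launch("tests/data/test_edr_config.yaml", reload=True)
print(" ".join(command))
```

Calling launch("tests/data/test_edr_config.yaml", reload=True) would then start the server with the example config.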
Below is a description of the config file that you must provide to EDR service CF:
metadata:
  api_title: "Test" # Optional, defaults to ""
  api_description: Test API # Optional, defaults to ""
  # Optional: Metadata keywords added to landing page metadata
  keywords:
    - test
  provider:
    name: "Test Institute"
    url: "https://test-institute.test"
  license:
    name: CC-BY 4.0 license
    url: https://creativecommons.org/licenses/by/4.0/
  # Contact information that is part of landing page metadata
  contact:
    address: Mailing Address
    city: City
    postalcode: Zip or Postal Code
    country: Country
    phone: +xx-xxx-xxx-xxxx
    email: you@example.org
    hours: Mo-Fr 08:00-17:00 # Optional
  external_documentation: https://institute.country/open_data_docs
  # Metadata links added to landing page and metadata links of /collections
  extra_links:
    - content_type: text/html
      rel: download
      title: Download of large datasets
      href: https://institute.country/downloads
collections:
  - collection_id: test_time_series # Must be unique among collections
    # Title added to collection metadata, overrides the title from dataset attributes
    title: Test time collection
    # Description added to collection metadata, overrides the description from dataset attributes
    description: "Testing"
    # Extra metadata links that describe the dataset, added to collection metadata links
    links:
      - content_type: text/html
        rel: canonical
        title: information
        href: https://test.test/test_docs
        hreflang: en-EN
    crs_epsg_code: "EPSG:4326" # Optional: Alternative to crs_field in data_config; if the dataset lacks the CRS metadata you can hard-code it with this parameter
    # Configure how to retrieve the data source, either local or s3
    # local: For a dataset file accessible on local storage
    # s3: For zarr files accessible through S3
    data_source:
      engine: local # Data source type, either local or s3, see above
      file_path: tests/data/test_time_series_wgs84.zarr # Path to the dataset file
    # Configure how EDR service CF should interpret the dataset
    data_config:
      # Either grid or time_series, depending on the type of dataset
      # time_series: if featureType = timeseries in dataset attributes
      data_type: time_series
      x_axis_field: x # Name of dataset field with 1st axis coordinates
      y_axis_field: y # Name of dataset field with 2nd axis coordinates
      station_id_field: station_id # Name of dataset time series axis, only for data_type: time_series
      # https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#time-coordinate
      time_field: t # Name of dataset time axis
      time_reference_field: time_reference # Name of dataset field with reference time, see conventions
      covjson_return_type: "multi_point" # Either "multi_point" or "coverage_collection". Determines the domain type of the CoverageJSON returned by a cube query
      # Dataset field following CF conventions for CRS: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#grid-mappings-and-projections
      crs_field: crs
      # If the dataset contains station metadata fields, they can be configured to be included in the output
      extra_station_metadata_fields:
        - dataset_name: foo # Field name in dataset
          output_name: bar # Field name in output GeoJSON
Contributions are welcome!
All code must be formatted with the black formatter; the planned pipeline will reject any PR that is not formatted correctly.
This project builds on the following existing tools:
- xarray
- FastAPI
- pydantic-edr
- pydantic-covjson
- pydantic-geojson
- covjson-reader
- leaflet-covjson