planetary-computer
This source is for building datasets from STAC collections on the open Microsoft Planetary Computer.
It’s intended for two types of collections:
1. Collections containing a collection-level dataset asset under the zarr-abfs key,
   corresponding to a single Zarr store containing all data.
2. Collections containing multiple items and, potentially, multiple assets per item.
Below is an example recipe that builds a dataset from the near-surface level collection of the Met Office global deterministic 10km forecast.
dates:
  start: 2020-01-01T00:00:00+00:00
  end: 2020-01-02T00:00:00+00:00
  frequency: 6h

input:
  planetary-computer:
    data_catalog_id: met-office-global-deterministic-near-surface
    param: [rainfall_rate, lwe_snowfall_rate]
    search_params:
      datetime: 2020-01-01T00:00:00+00:00/2020-01-02T00:00:00+00:00
      variable_key_map:
        lwe_snowfall_rate: snowfall_rate
      filter:
        op: and
        args:
          - op: "="
            args:
              - property: forecast:horizon
              - PT0000H00M
          - op: "="
            args:
              - property: met_office_deterministic:model
              - global
The following applies only to collections with multiple items and assets.
The search_params config section enables specification of mappings and
filters for the STAC items and assets to include in the dataset. Supported
parameters include:
- datetime: passed to the STAC API to filter items by their datetime field(s).
- variable_key_map: a mapping of data variable names to STAC asset keys for collections where they differ.
- filter: a CQL2 filter (dict for cql2-json, string for cql2-text) passed directly to the STAC API to filter items server-side.
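To make the roles of the three parameters concrete, here is a minimal Python sketch of how they could be turned into keyword arguments for pystac_client.Client.search and into asset keys. It is not this source's actual implementation: the helper names build_search_kwargs and asset_key_for are hypothetical, and the fallback to the variable name when no mapping exists is an assumption.

```python
from typing import Any


def build_search_kwargs(collection_id: str, search_params: dict[str, Any]) -> dict[str, Any]:
    """Build keyword arguments for pystac_client.Client.search (hypothetical helper)."""
    kwargs: dict[str, Any] = {"collections": [collection_id]}
    if "datetime" in search_params:
        kwargs["datetime"] = search_params["datetime"]
    cql2 = search_params.get("filter")
    if cql2 is not None:
        kwargs["filter"] = cql2
        # A dict is treated as CQL2 JSON, a string as CQL2 text.
        kwargs["filter_lang"] = "cql2-json" if isinstance(cql2, dict) else "cql2-text"
    # variable_key_map is deliberately left out: it is not a STAC search argument.
    return kwargs


def asset_key_for(variable: str, variable_key_map: dict[str, str]) -> str:
    """Return the STAC asset key for a variable (assumed fallback: the variable name)."""
    return variable_key_map.get(variable, variable)


# Values copied from the example recipe above.
search_params = {
    "datetime": "2020-01-01T00:00:00+00:00/2020-01-02T00:00:00+00:00",
    "variable_key_map": {"lwe_snowfall_rate": "snowfall_rate"},
    "filter": {
        "op": "and",
        "args": [
            {"op": "=", "args": [{"property": "forecast:horizon"}, "PT0000H00M"]},
            {"op": "=", "args": [{"property": "met_office_deterministic:model"}, "global"]},
        ],
    },
}

kwargs = build_search_kwargs("met-office-global-deterministic-near-surface", search_params)
asset_keys = [
    asset_key_for(name, search_params["variable_key_map"])
    for name in ["rainfall_rate", "lwe_snowfall_rate"]
]
print(kwargs["filter_lang"])
print(asset_keys)
```

With pystac-client installed and network access, the resulting dictionary could be passed as Client.search(**kwargs) on a client opened against the Planetary Computer STAC API.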
Tip
While not required, it is recommended to include a datetime filter under
search_params.datetime to reduce query time and the number of results to
filter. See pystac_client.Client.search for accepted formats.
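One of the accepted formats is a closed interval written as two RFC 3339 timestamps separated by a slash, as in the recipe above. The sketch below builds such a string from the recipe's start and end dates; the helper name stac_interval is hypothetical, and requiring timezone-aware inputs is a choice made here for clarity, not something the source mandates.

```python
from datetime import datetime, timezone


def stac_interval(start: datetime, end: datetime) -> str:
    """Format a closed "start/end" datetime interval (hypothetical helper)."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start > end:
        raise ValueError("start must not be later than end")
    return f"{start.isoformat()}/{end.isoformat()}"


interval = stac_interval(
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2020, 1, 2, tzinfo=timezone.utc),
)
print(interval)  # 2020-01-01T00:00:00+00:00/2020-01-02T00:00:00+00:00
```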
Tip
To identify a collection’s queryable fields, visit its queryables endpoint
(e.g., ERA5 - PDS queryables) or use the Python equivalent,
pystac_client.CollectionClient(...).get_queryables.
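As a rough sketch of the Python route, the snippet below lists the property names from a queryables document. The STAC API URL and the helper names are assumptions not taken from this page, the sample document is made up for illustration (its property names are borrowed from the recipe's filter), and fetch_queryable_names needs pystac-client and network access, so it is defined but not called here.

```python
from typing import Any

# Assumed public endpoint of the Planetary Computer STAC API.
STAC_API_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def queryable_names(queryables: dict[str, Any]) -> list[str]:
    """Return the sorted property names of a queryables (JSON Schema) document."""
    return sorted(queryables.get("properties", {}))


def fetch_queryable_names(collection_id: str, api_url: str = STAC_API_URL) -> list[str]:
    """Fetch a collection's queryables over the network (hypothetical helper)."""
    # Imported lazily so queryable_names works without pystac-client installed.
    from pystac_client import Client

    collection = Client.open(api_url).get_collection(collection_id)
    return queryable_names(collection.get_queryables())


# Illustrative document only, not a real API response.
sample = {
    "type": "object",
    "properties": {
        "forecast:horizon": {"type": "string"},
        "datetime": {"type": "string"},
        "met_office_deterministic:model": {"type": "string"},
    },
}
names = queryable_names(sample)
print(names)
```

The names returned this way are the ones that can appear as property entries in the filter section of search_params.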
See other example recipes in the tests.