Open dataset parameters
This page describes the level of support for, and applicability of, the parameters that can be passed to the open_dataset function when working with tabular datasets. Tabular datasets are a new addition to the anemoi-datasets package and are still under development. The meaning of the emojis is as follows:
✅: Parameter has been tested and works for tabular datasets.
🔄: Parameter will work, but has a slightly different meaning or behaviour for tabular datasets.
❌: Parameter is not applicable to tabular datasets.
❓: Parameter may work, but has not been tested with tabular datasets.
🧪: Parameter should work, but has not been tested with tabular datasets.
⚠️: Parameter may work, but the behaviour is not fully understood.
🛠️: Will work in the future. Not yet implemented or tested, but expected to work without major issues.
🗑️: Obsolete. Should not be used in new code; may still work for now, but will be removed in a future release.
Warning
Gridded and tabular datasets cannot be combined, so concat, join, and similar operations are not supported between these two types of datasets. The table below assumes that, when datasets are combined, they all have the same layout. See Dataset layouts for more information.
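As an illustration of the rule in this warning, the check below is a minimal sketch of refusing to combine datasets whose layouts differ. DatasetInfo and check_same_layout are hypothetical names introduced for this example; they are not part of anemoi-datasets, and the library's own validation may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical stand-in for an opened dataset (not an anemoi-datasets class)."""

    name: str
    layout: str  # assumed to be either "gridded" or "tabular"


def check_same_layout(datasets):
    """Return the shared layout, or raise ValueError if layouts are mixed."""
    layouts = {d.layout for d in datasets}
    if len(layouts) > 1:
        names = ", ".join(f"{d.name} ({d.layout})" for d in datasets)
        raise ValueError(f"Cannot combine datasets with different layouts: {names}")
    # An empty input has no layout to report.
    return layouts.pop() if layouts else None
```

A combining operation such as concat or join could call a check like this before doing any work, so that a mixed gridded/tabular request fails early with a clear message.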
parameter | gridded | tabular | comment |
adjust | β | ⚠️ | Adjustment mode when combining datasets, e.g. select common dates, variables, etc. This needs testing and possibly a decision on the expected behaviour for tabular datasets. |
area | β | β | Spatial cropping area as a list [lon_min, lat_min, lon_max, lat_max]. |
chain | 🗑️ | 🗑️ | Experimental chain operation. Same behaviour as concat, but does not check that the dates are continuous. Will be removed in the future. |
concat | β | ⚠️ | Concatenate two or more datasets along the time dimension. This may work for tabular datasets, but the behaviour of the windowing at the seam is not well defined, so it should be avoided for now. |
complement | β | β | Complement/cutout configuration (used for creating complements). |
cutout | β | β | List of datasets used as cutouts for complements/cutout operations. |
drop | β | 🧪 | Variables to drop (list). |
end | β | 🔄 | Set the end date for the opened dataset. For gridded datasets, the date must be present in the dataset. For tabular datasets, the date is used as-is, and any windows requested between the actual end date of the data and that date will return empty arrays (see Tabular). |
ensemble | β | β | List of datasets forming an ensemble (e.g. |
fill_missing_dates | β | β | Method to fill missing dates (“interpolate” or “closest”). |
fill_missing_gaps | β | β | Fill virtual datasets for gaps when concatenating. |
frequency | β | 🔄 | For gridded datasets, selects the frequency of the returned samples; it must be a multiple of the dataset frequency. For tabular datasets, it is used to create windows of the specified frequency (e.g. “1D” for daily windows) and is not connected to the dataset frequency, which is undefined (see Tabular). |
grids | β | β | List of grids/datasets to combine as multiple grid sources. |
interpolate_frequency | β | β | Frequency used to interpolate a dataset to a higher temporal resolution. |
interpolate_variables | β | β | Variables to interpolate spatially (with optional |
interpolation | β | β | Interpolation method (example: “nearest”). |
join | β | 🧪 | Join two or more datasets along the variable dimension. |
max_distance | β | β | Maximum distance used by spatial interpolation (e.g. nearest-neighbour). |
member, members | β | β | 0-based member selection (see number for 1-based selection). |
merge | β | β | Merge operation key to combine datasets by overlaying fields. |
name | β | β | Experimental. Optional name assigned to the resulting dataset subset that can be used to name masks that will be retrieved in inference. |
number, numbers | β | β | 1-based member selection (see member for 0-based selection). |
reorder | β | 🧪 | Reorder variables (list or mapping). |
rename | β | 🧪 | Rename variables mapping. |
rescale | β | 🛠️ | Rescaling mapping/tuples/units for variables. |
select | β | β | Select variables (list, set or string). |
set_missing_dates | β | β | Debug option: list of dates to mark as missing. |
shuffle | 🗑️ | 🗑️ | Boolean to shuffle dataset indices when subsetting. |
skip_missing_dates | β | β | Boolean: skip missing dates when iterating (requires |
source | β | β | Source dataset name/path used in complement examples. |
start | β | 🔄 | Set the start date for the opened dataset. For gridded datasets, the date must be present in the dataset. For tabular datasets, the date is used as-is, and any windows requested between that date and the actual start date of the data will return empty arrays (see Tabular). |
statistics | β | 🧪 | Use the statistics of another dataset. |
thinning | β | β | Thinning factor or proportion. |
trim_edge | β | β | Tuple to trim edges of the grid (e.g. |
window | β | β | Window specification for tabular datasets. For gridded datasets, this parameter is ignored. See Tabular for details. |
x | 🗑️ | 🗑️ | Experimental: x coordinate for xy selection. |
xy | 🗑️ | 🗑️ | Experimental xy selection mode. |
y | 🗑️ | 🗑️ | Experimental: y coordinate for xy selection. |
zip | 🗑️ | 🗑️ | Experimental zip mode to combine datasets. |
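
The start, end and frequency rows above describe two different behaviours, and a small model may make the difference easier to see. The sketch below does not use anemoi-datasets and is not its implementation: tabular_windows and check_gridded_frequency are hypothetical helpers, observations are reduced to a sorted list of timestamps, and the half-open window [window_start, window_start + frequency) aligned on start is an assumption made for the example.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def tabular_windows(timestamps, start, end, frequency):
    """Yield (window_start, rows) for consecutive windows between start and end.

    `timestamps` must be sorted. start and end are used as-is, so windows
    that lie before or after the data yield an empty list.
    """
    current = start
    while current < end:
        lo = bisect_left(timestamps, current)
        hi = bisect_left(timestamps, current + frequency)
        yield current, timestamps[lo:hi]
        current += frequency


def check_gridded_frequency(requested, dataset_frequency):
    """Gridded rule: the requested frequency must be a multiple of the dataset's."""
    if requested % dataset_frequency != timedelta(0):
        raise ValueError(
            f"{requested} is not a multiple of the dataset frequency {dataset_frequency}"
        )


# Three observations, all falling on 2 and 3 January 2020.
observations = [
    datetime(2020, 1, 2, 3),
    datetime(2020, 1, 2, 15),
    datetime(2020, 1, 3, 6),
]

# Daily windows requested from 1 January up to (not including) 5 January.
windows = list(
    tabular_windows(
        observations,
        start=datetime(2020, 1, 1),
        end=datetime(2020, 1, 5),
        frequency=timedelta(days=1),
    )
)
counts = [len(rows) for _, rows in windows]
print(counts)  # [0, 2, 1, 0]
```

The first and last windows are empty because the requested range extends beyond the data, which mirrors the "empty arrays" behaviour described for tabular start and end. For a gridded dataset the equivalent request would instead be validated up front, as check_gridded_frequency suggests for the frequency parameter.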