Selecting grid points

thinning

You can thin a dataset by specifying the thinning parameter in the open_dataset function. The thinning parameter depends on the method selected. The default (and only) method is “every-nth”, which will mask out all but every Nth point, with N specified by the thinning parameter.

ds = open_dataset(dataset, thinning=N, method="every-nth")

Please note that the thinning will apply to all dimensions of the fields. So for 2D fields, the thinning will apply to both the latitude and longitude dimensions. For 1D fields, such as reduced Gaussian grids, the thinning will apply to the only dimension.

The following example shows the effect of thinning a dataset with a 1 degree resolution:

../_images/thinning-before.png

Thinning the dataset with thinning=4 will result in the following dataset:

../_images/thinning-after.png

masking

You can apply an arbitrary spatial mask to a dataset by specifying the mask parameter in the open_dataset function. The mask must be a NumPy .npy file containing a boolean array, where True indicates points to be kept and False indicates points to be removed.

ds = open_dataset(dataset, mask="path/to/mask.npy")

The mask array must have the same total number of grid points and dimension as the dataset. Please note that this masking will not be automatically applied to the input data provided during inference. Users must ensure they are applying the mask explicitly, for instance via the use of pre-processors.

area

You can crop a dataset to a specific area by specifying the area in the open_dataset function. The area is specified as a list of four numbers in the order (north, west, south, east). For example, to crop a dataset to the area between 60N and 20N and 50W and 0E, you can use:

ds = open_dataset(dataset, area=(60, -50, 20, 0))

Which will result in the following dataset:

../_images/area-1.png

Alternatively, you can specify another dataset as the area. In this case, the bounding box of the dataset will be used.

ds = open_dataset(dataset1, area=dataset2)

trim_edge

You can remove the edges of a limited area domain by specifying trim_edge parameter in the open_dataset function. This can either be an integer, representing the number of gridpoints to remove along each edge, or a tuple of four integers in the order (lower_dim0, upper_dim0, lower_dim1, upper_dim1).

That is, the following

ds = open_dataset(dataset1, trim_edge=(3, 10, 4, 2))

will remove the first 3 and last 10 rows of the domain, and the first 4 and last 2 columns of the domain. If the first dimension of the grid is the y-dimension (i.e north/south), then 3 grid points in the south, 10 in the north, 4 in the west and 10 in the east will be removed.

Note that if thinning is also specified, trim_edge is applied first.