weatherbench2.evaluation.evaluate_with_beam

weatherbench2.evaluation.evaluate_with_beam(data_config, eval_configs, *, input_chunks, runner, fanout=None, shuffle_before_temporal_mean=False, num_threads=None, argv=None, skipna=False)

Run evaluation with a Beam pipeline.

Saves a separate results NetCDF file for each config.Eval instance. An example results dataset, with its dimensions, is shown below. Note that the region and level dimensions are optional.

```
<xarray.Dataset>
Dimensions:         (lead_time: 21, region: 3, level: 3, metric: 2)
Coordinates:
  * lead_time       (lead_time) timedelta64[ns] 0 days 00:00:00 …
  * region          (region) object 'global' 'tropics' 'extra-tropics'
  * level           (level) int32 500 700 850
  * metric          (metric) object 'rmse' 'acc'
Data variables:
    geopotential    (metric, region, lead_time, level) float64 …
    2m_temperature  (metric, region, lead_time) float64 0.6337 …
```

Parameters:

data_config (Data) – config.Data instance.
eval_configs (dict[str, weatherbench2.config.Eval]) – Dictionary of config.Eval instances.
input_chunks (Mapping[str, int]) – Chunking of input datasets.
runner (str) – Beam runner.
fanout (Optional[int]) – Fanout parameter for Beam combiners in the temporal mean.
shuffle_before_temporal_mean (bool) – If True, shuffle before computing the temporal mean. This is a good idea when evaluation metric outputs are small compared to the size of the input data, such as when aggregating over space or a large ensemble.
num_threads (Optional[int]) – Number of threads to use for reading/writing data.
argv (Optional[list[str]]) – Other arguments to pass into the Beam pipeline.
skipna (bool) – Whether to skip NaN values in both forecasts and observations during evaluation.
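
The advice for shuffle_before_temporal_mean depends on how small the metric outputs are relative to the inputs. The sketch below is one rough way to gauge that ratio. The helper name, the dimension names, and the grid sizes are all hypothetical, and the documentation gives no threshold, so the number is only an illustration.

```python
from math import prod


def output_to_input_ratio(input_shape: dict, reduced_dims: set) -> float:
    """Return the fraction of elements left after averaging over reduced_dims.

    Hypothetical helper for illustration; not part of weatherbench2.
    """
    n_in = prod(input_shape.values())
    n_out = prod(
        size for dim, size in input_shape.items() if dim not in reduced_dims
    )
    return n_out / n_in


# Assumed shape of one forecast chunk (illustrative values only).
chunk_shape = {"lead_time": 21, "level": 3, "latitude": 121, "longitude": 240}

# A spatially aggregated metric collapses latitude and longitude.
ratio = output_to_input_ratio(chunk_shape, {"latitude", "longitude"})
print(f"metric output is {ratio:.2e} of the input size")
```

A very small ratio like this is the situation the parameter description refers to, where shuffling before the temporal mean is suggested.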

Return type:

None
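
A call might be assembled as in the following minimal sketch. The keyword names and defaults are taken from the signature above. The helper names, the "init_time" chunk key, and the choice of "DirectRunner" (Beam's local runner) are assumptions for illustration. The data_config and eval_configs objects must be built separately from config.Data and config.Eval, which this page does not cover.

```python
from typing import Any, Dict


def make_beam_eval_kwargs(
    time_chunk: int = 1, runner: str = "DirectRunner"
) -> Dict[str, Any]:
    """Collect the keyword-only arguments in one place (hypothetical helper)."""
    return {
        # Assumed dimension name; use the dimensions of your own datasets.
        "input_chunks": {"init_time": time_chunk},
        "runner": runner,
        # The remaining entries repeat the defaults from the signature.
        "fanout": None,
        "shuffle_before_temporal_mean": False,
        "num_threads": None,
        "argv": None,
        "skipna": False,
    }


def run_evaluation(data_config: Any, eval_configs: Dict[str, Any]) -> None:
    """Call evaluate_with_beam with the keyword arguments built above."""
    # Imported lazily so the sketch can be loaded without weatherbench2.
    from weatherbench2 import evaluation

    evaluation.evaluate_with_beam(
        data_config, eval_configs, **make_beam_eval_kwargs()
    )


kwargs = make_beam_eval_kwargs()
print(sorted(kwargs))
```

Because the function returns None, the results are only available from the NetCDF files written for each config.Eval.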