weatherbench2.metrics.RankHistogram

class weatherbench2.metrics.RankHistogram(ensemble_dim='realization', num_bins=None)

Histogram of truth’s rank with respect to forecast ensemble members.

Given a K-member ensemble {Xᵢ} and ground truth Y, the rank of Y is the number of ensemble members less than Y. This class expresses that rank with a one-hot encoding, which facilitates averaging/summation (typically over time) to form rank histograms. The histograms have K + 1 bins and are indexed by ‘bin’.
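The rank definition above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper names `rank_one_hot` and `rank_histogram` and the sample values are hypothetical, and ties and NaN handling are ignored.

```python
from typing import List, Sequence, Tuple


def rank_one_hot(members: Sequence[float], truth: float) -> List[int]:
    """One-hot encode the rank of `truth` within a K-member ensemble."""
    # Rank = number of ensemble members strictly less than the truth (0..K).
    rank = sum(1 for x in members if x < truth)
    one_hot = [0] * (len(members) + 1)  # K + 1 bins
    one_hot[rank] = 1
    return one_hot


def rank_histogram(cases: Sequence[Tuple[Sequence[float], float]]) -> List[float]:
    """Average the one-hot encodings over (members, truth) pairs, e.g. over time."""
    encodings = [rank_one_hot(members, truth) for members, truth in cases]
    n = len(encodings)
    return [sum(column) / n for column in zip(*encodings)]


# Four forecast times from a hypothetical 3-member ensemble.
cases = [
    ([1.0, 3.0, 2.0], 2.5),  # two members below the truth -> rank 2
    ([0.5, 0.7, 0.9], 0.1),  # no members below the truth -> rank 0
    ([4.0, 5.0, 6.0], 7.0),  # all members below the truth -> rank 3
    ([1.0, 2.0, 3.0], 2.5),  # two members below the truth -> rank 2
]
histogram = rank_histogram(cases)  # 4 bins, one per possible rank
```

Because each encoding sums to one, the averaged histogram also sums to one, which is what makes the bins directly comparable to the calibrated value of 1 / (K + 1).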

This class also allows aggregation of the bins into num_bins ≤ K + 1, provided num_bins evenly divides K + 1. This reduces the size of output files, but is equivalent to averaging default results along the ‘bin’ dimension.
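A minimal sketch of what coarsening a default histogram could look like. It assumes that aggregation merges adjacent groups of bins by summing them, so the result still sums to one; the library may normalize differently. `aggregate_bins` is a hypothetical helper, not part of weatherbench2.

```python
from typing import List, Sequence


def aggregate_bins(histogram: Sequence[float], num_bins: int) -> List[float]:
    """Merge adjacent bins so that `num_bins` bins remain."""
    total = len(histogram)  # K + 1 for a K-member ensemble
    if num_bins < 1 or total % num_bins != 0:
        raise ValueError(f"num_bins={num_bins} must evenly divide {total}")
    width = total // num_bins  # number of default bins merged into each output bin
    return [sum(histogram[i:i + width]) for i in range(0, total, width)]


# Default histogram for a hypothetical 5-member ensemble (6 bins).
default_hist = [0.125, 0.25, 0.125, 0.125, 0.25, 0.125]
coarse_hist = aggregate_bins(default_hist, num_bins=3)  # pairs of bins merged
```

Requesting a bin count that does not divide K + 1 (for example 4 bins here) raises a ValueError in this sketch, mirroring the divisibility requirement.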

If these one-hot encodings are averaged over N times, a well calibrated forecast should contain roughly equal values in the bins. The bin variance will be (num_bins - 1) / (N num_bins²). Since the expected value is 1 / num_bins, the relative error is

Sqrt(variance) / expected = Sqrt((num_bins - 1) / N).
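As a worked example of the formulas above, the following sketch evaluates the bin variance and relative error for a given bin count and sample size. It assumes the N one-hot samples are independent, which the variance expression implies; the function names are illustrative only.

```python
import math


def bin_variance(num_bins: int, n_times: int) -> float:
    """Variance of one bin of a calibrated histogram averaged over n_times."""
    return (num_bins - 1) / (n_times * num_bins**2)


def relative_error(num_bins: int, n_times: int) -> float:
    """Standard deviation of a bin divided by its expected value, 1 / num_bins."""
    expected = 1 / num_bins
    return math.sqrt(bin_variance(num_bins, n_times)) / expected


# 10 bins averaged over 900 times: sqrt(9 / 900), i.e. about 10% noise per bin.
rel = relative_error(num_bins=10, n_times=900)
```

Since the relative error grows with the number of bins, using fewer bins gives a less noisy histogram for the same number of forecast times.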

NaN values are treated as larger than any other. The skipna kwarg is ignored.

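Parameters:
  • ensemble_dim (str) – Dimension indexing ensemble member.

  • num_bins (Optional[int]) – Number of bins in histogram. If None, the number of bins will be ensemble_size + 1.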

__init__(ensemble_dim='realization', num_bins=None)

Initializes a RankHistogram.

Parameters:
  • ensemble_dim (str) – Dimension indexing ensemble member.

  • num_bins (Optional[int]) – Number of bins in histogram. If None, the number of bins will be ensemble_size + 1. If provided, num_bins must evenly divide into ensemble_size + 1.
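To see which values of num_bins satisfy the divisibility requirement for a given ensemble, the allowed values can be enumerated directly. `valid_num_bins` is a hypothetical helper written for illustration, not a weatherbench2 function.

```python
from typing import List


def valid_num_bins(ensemble_size: int) -> List[int]:
    """All bin counts that evenly divide ensemble_size + 1."""
    total = ensemble_size + 1
    return [b for b in range(1, total + 1) if total % b == 0]


bins_for_50 = valid_num_bins(50)  # 51 = 3 * 17, so few choices exist
bins_for_49 = valid_num_bins(49)  # 50 = 2 * 5 * 5, so more choices exist
```

The choice of ensemble size therefore limits how the histogram can be coarsened: a 50-member ensemble only permits 1, 3, 17, or 51 bins, while a 49-member ensemble also permits 2, 5, 10, and 25.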

Methods

__init__([ensemble_dim, num_bins])

Initializes a RankHistogram.

compute(forecast, truth[, region, skipna])

Evaluate this metric on datasets with full temporal coverage.

compute_chunk(forecast, truth[, region, skipna])

Computes one-hot encoding of rank on a chunk of forecast/truth.

Attributes

ensemble_dim