weatherbench2.metrics.RankHistogram

class weatherbench2.metrics.RankHistogram(ensemble_dim='realization', num_bins=None, break_ties_randomly=True, seed=None)

Histogram of truth’s rank with respect to forecast ensemble members.

Given a K-member ensemble {Xᵢ} and ground truth Y, the rank of Y is the count of ensemble members less than Y. This class expresses that rank with a one-hot encoding, which facilitates averaging/summation (typically over time) to form rank histograms. The histograms have K+1 bins and are indexed by ‘bin’.
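The rank definition above can be illustrated with a small NumPy sketch. This is not the library's implementation: the helper name rank_one_hot and the sample values are hypothetical, and ties and NaN handling are ignored here.

```python
import numpy as np


def rank_one_hot(ensemble, truth):
    """One-hot encode the rank of `truth` within a K-member `ensemble`.

    Hypothetical helper for illustration only. Returns shape (K + 1,).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    rank = int(np.sum(ensemble < truth))  # count of members less than truth
    one_hot = np.zeros(ensemble.size + 1)
    one_hot[rank] = 1.0
    return one_hot


# Three forecast times, K = 4 members each (made-up values).
forecasts = np.array([
    [0.1, 0.4, 0.6, 0.9],
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
])
truths = np.array([0.5, 0.0, 9.0])

# One encoding per time, then average over time to form the histogram.
encodings = np.stack([rank_one_hot(f, t) for f, t in zip(forecasts, truths)])
histogram = encodings.mean(axis=0)
```

Averaging the encodings over the time axis yields a histogram with K + 1 = 5 bins whose values sum to one.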

This class also allows aggregation of the bins into num_bins ≤ K + 1, provided num_bins evenly divides K + 1. This reduces the size of output files, but is equivalent to averaging default results along the ‘bin’ dimension.
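A minimal sketch of such an aggregation follows. It assumes that adjacent bins are combined by summing equal-sized groups, so that the coarse histogram still sums to one; the helper name aggregate_bins and the sample histogram are hypothetical, and the library may combine bins differently.

```python
import numpy as np


def aggregate_bins(histogram, num_bins):
    """Combine the last axis of `histogram` into `num_bins` coarser bins.

    Hypothetical helper: sums equal-sized groups of adjacent bins.
    """
    histogram = np.asarray(histogram, dtype=float)
    full_bins = histogram.shape[-1]
    if full_bins % num_bins != 0:
        raise ValueError(
            f"num_bins={num_bins} must evenly divide {full_bins} bins"
        )
    group = full_bins // num_bins
    grouped = histogram.reshape(*histogram.shape[:-1], num_bins, group)
    return grouped.sum(axis=-1)


# K + 1 = 6 default bins aggregated into 3 bins (made-up frequencies).
full = np.array([0.1, 0.2, 0.1, 0.1, 0.3, 0.2])
coarse = aggregate_bins(full, num_bins=3)
```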

If these one-hot encodings are averaged over N independent times, a well-calibrated forecast should contain roughly equal values in the bins, since truth is then equally likely to fall in any bin. The variance of each bin will be (num_bins - 1) / (N num_bins²). Since the expected value is 1 / num_bins, the relative error is

sqrt(variance) / expected = sqrt((num_bins - 1) / N).
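As a rough check of this formula, the sketch below compares it with a simulation in which the rank is drawn uniformly at random at each of N independent times, which is the idealized behavior of a calibrated forecast. The function names, trial count, and seed are illustrative assumptions.

```python
import numpy as np


def expected_relative_error(num_bins, num_times):
    """Relative error of a bin value, per the formula above."""
    return np.sqrt((num_bins - 1) / num_times)


def simulated_relative_error(num_bins, num_times, num_trials=2000, seed=0):
    """Estimate the same quantity by simulating uniformly distributed ranks."""
    rng = np.random.default_rng(seed)
    ranks = rng.integers(0, num_bins, size=(num_trials, num_times))
    # Time-averaged one-hot value of the first bin, one value per trial.
    first_bin = (ranks == 0).mean(axis=1)
    # Divide the standard deviation by the expected value 1 / num_bins.
    return first_bin.std() * num_bins


analytic = expected_relative_error(num_bins=11, num_times=1000)
simulated = simulated_relative_error(num_bins=11, num_times=1000)
```

With num_bins = 11 and N = 1000 the formula gives sqrt(10 / 1000) = 0.1, so each bin of a calibrated histogram is expected to fluctuate by about 10% of its mean value.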

NaN values are treated as larger than any other value. The skipna kwarg is ignored.
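One way to obtain this ordering is through NumPy sorting, which places NaN at the end. The sketch below is only an illustration of the stated convention, not the library's code; the helper name rank_nan_last is hypothetical.

```python
import numpy as np


def rank_nan_last(ensemble, truth):
    """Rank of `truth` among `ensemble`, with NaN sorted as the largest value.

    Hypothetical helper for illustration only.
    """
    combined = np.concatenate([[truth], np.asarray(ensemble, dtype=float)])
    # np.argsort places NaN last; a stable sort keeps truth ahead of equal members.
    order = np.argsort(combined, kind="stable")
    # The position of truth (index 0) in sorted order is its rank.
    return int(np.flatnonzero(order == 0)[0])


nan_truth_rank = rank_nan_last([1.0, 2.0, 3.0], np.nan)   # truth lands in the last bin
nan_member_rank = rank_nan_last([1.0, np.nan, 3.0], 2.5)  # NaN member counts as larger
```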

__init__(ensemble_dim='realization', num_bins=None, break_ties_randomly=True, seed=None)

Initializes a RankHistogram.

Parameters:
  • ensemble_dim (str) – Dimension indexing ensemble member.

  • num_bins (Optional[int]) – Number of bins in histogram. If None, the number of bins will be ensemble_size + 1. If provided, num_bins must evenly divide ensemble_size + 1.

  • break_ties_randomly (bool) – If True, break ties randomly as follows. If several bins are identical (because some ensemble members are identical) and truth falls within those bins, one of the tied bins is chosen at random. Likewise, if truth is exactly equal to some ensemble members, it is randomly assigned to one of the tied bins.

  • seed (Optional[int]) – Seed for RNG used to break ties.
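The tie-breaking behavior described for break_ties_randomly and seed can be sketched as follows. This assumes the rank is drawn uniformly from the tied bins, which the description above does not state explicitly; the helper name rank_with_random_ties and the sample values are hypothetical.

```python
import numpy as np


def rank_with_random_ties(ensemble, truth, rng):
    """Rank of `truth`, chosen at random among tied bins.

    Hypothetical helper; assumes a uniform choice among the tied bins.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    n_less = int(np.sum(ensemble < truth))
    n_equal = int(np.sum(ensemble == truth))
    # Tied bins run from n_less to n_less + n_equal inclusive;
    # Generator.integers excludes the upper bound, hence the + 1.
    return int(rng.integers(n_less, n_less + n_equal + 1))


members = [1.0, 2.0, 2.0, 3.0]
rng = np.random.default_rng(seed=0)  # a fixed seed makes the choices reproducible

tied_ranks = [rank_with_random_ties(members, 2.0, rng) for _ in range(200)]
untied_ranks = [rank_with_random_ties(members, 2.5, rng) for _ in range(5)]
```

When truth equals the two identical members, the rank falls in bins 1 through 3; without ties the result is deterministic.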

Methods

  • __init__([ensemble_dim, num_bins, ...]) – Initializes a RankHistogram.

  • compute(forecast, truth[, region, skipna]) – Evaluate this metric on datasets with full temporal coverages.

  • compute_chunk(forecast, truth[, region, skipna]) – Computes one-hot encoding of rank on a chunk of forecast/truth.

Attributes

ensemble_dim