weatherbench2.metrics.RankHistogram
- class weatherbench2.metrics.RankHistogram(ensemble_dim='realization', num_bins=None, break_ties_randomly=True, seed=None)
Histogram of truth’s rank with respect to forecast ensemble members.
Given a K-member ensemble {Xᵢ} and ground truth Y, the rank of Y is the number of ensemble members less than Y. This class expresses that rank as a one-hot encoding, which facilitates averaging/summation (typically over time) to form rank histograms. The histograms have K + 1 bins and are indexed by ‘bin’.
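The rank and its one-hot encoding can be sketched in plain NumPy. This is an illustration of the definition above, not the library's implementation; the helper name `rank_one_hot` is hypothetical, and ties and NaN values are ignored here.

```python
import numpy as np


def rank_one_hot(ensemble, truth):
    """One-hot encode the rank of `truth` within a 1-D ensemble of K members.

    Hypothetical helper for illustration; not part of weatherbench2.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    k = ensemble.size
    # Rank = number of ensemble members strictly less than truth, in 0..K.
    rank = int(np.sum(ensemble < truth))
    one_hot = np.zeros(k + 1)
    one_hot[rank] = 1.0
    return one_hot


# Three "times", each with a 3-member ensemble and one truth value.
ensembles = np.array([[1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [2.0, 4.0, 6.0]])
truths = np.array([2.5, 0.0, 7.0])

# Averaging the one-hot encodings over time gives a rank histogram with K + 1 = 4 bins.
histogram = np.mean([rank_one_hot(e, y) for e, y in zip(ensembles, truths)], axis=0)
```

The three truths land in ranks 2, 0 and 3, so the averaged histogram has mass 1/3 in each of those bins and zero in bin 1.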
This class also allows aggregation of the bins into num_bins ≤ K + 1, provided num_bins evenly divides K + 1. This reduces the size of output files, but is equivalent to averaging default results along the ‘bin’ dimension.
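A rough sketch of bin aggregation follows. It assumes adjacent fine bins are combined by summing their mass, so that the coarse histogram still sums to one; the helper `aggregate_bins` is hypothetical and the library may combine bins differently.

```python
import numpy as np


def aggregate_bins(hist, num_bins):
    """Combine a (K + 1)-bin histogram into `num_bins` coarser bins.

    Hypothetical helper: adjacent bins are summed (an assumption).
    """
    hist = np.asarray(hist, dtype=float)
    if hist.size % num_bins != 0:
        raise ValueError("num_bins must evenly divide the number of bins (K + 1).")
    # Group adjacent fine bins and sum the mass within each group.
    return hist.reshape(num_bins, -1).sum(axis=1)


fine = np.array([0.1, 0.2, 0.3, 0.1, 0.2, 0.1])  # K + 1 = 6 bins
coarse = aggregate_bins(fine, 3)  # pairs of adjacent bins are combined
```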
If these one-hot encodings are averaged over N times, a well calibrated forecast should contain roughly equal values in the bins. The bin variance will be (num_bins - 1) / (N num_bins²). Since the expected value is 1 / num_bins, the relative error is
Sqrt(variance) / expected = Sqrt((num_bins - 1) / N).
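The variance and relative-error formulas can be checked numerically. The sketch below assumes a perfectly calibrated forecast, so that the rank is uniform over the bins and independent between times; the function names are illustrative only.

```python
import numpy as np


def bin_variance(num_bins, n_times):
    """Variance of one bin's averaged value: (num_bins - 1) / (N num_bins^2)."""
    return (num_bins - 1) / (n_times * num_bins**2)


def relative_error(num_bins, n_times):
    """Sqrt(variance) / expected, with expected = 1 / num_bins."""
    return np.sqrt(bin_variance(num_bins, n_times)) * num_bins


num_bins, n_times, n_trials = 5, 100, 20000
rng = np.random.default_rng(0)

# Each trial: N uniform ranks, summarized as the fraction of times in each bin.
counts = rng.multinomial(n_times, [1.0 / num_bins] * num_bins, size=n_trials)
fractions = counts / n_times

# Empirical variance of each bin across trials, averaged over bins.
empirical_variance = fractions.var(axis=0).mean()
theoretical_variance = bin_variance(num_bins, n_times)
```

With num_bins = 5 and N = 100 the formula gives a relative error of Sqrt(4 / 100) = 0.2, so individual bins of a well calibrated forecast can still differ from 1 / num_bins by around 20%.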
NaN values are treated as larger than any other. The skipna kwarg is ignored.
- __init__(ensemble_dim='realization', num_bins=None, break_ties_randomly=True, seed=None)
Initializes a RankHistogram.
- Parameters:
ensemble_dim (str) – Dimension indexing ensemble member.
num_bins (Optional[int]) – Number of bins in histogram. If None, the number of bins will be ensemble_size + 1. If provided, num_bins must evenly divide ensemble_size + 1.
break_ties_randomly (bool) – If True, break ties randomly. If several bins coincide because ensemble members are identical, and truth falls within those bins, one of the tied bins is chosen at random. Likewise, if truth is exactly equal to one or more ensemble members, it is randomly assigned to one of the tied bins.
seed (Optional[int]) – Seed for RNG used to break ties.
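One way to picture random tie-breaking is sketched below. It assumes that, when truth equals some ensemble members, the rank is drawn uniformly from the range spanned by the tied members; this is an illustration of the described behavior, not the library's code, and `rank_with_random_ties` is a hypothetical name.

```python
import numpy as np


def rank_with_random_ties(ensemble, truth, rng):
    """Rank of `truth` in `ensemble`, breaking exact ties uniformly at random.

    Hypothetical helper; uniform choice among tied ranks is an assumption.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    n_less = int(np.sum(ensemble < truth))
    n_equal = int(np.sum(ensemble == truth))
    # Candidate ranks are n_less, ..., n_less + n_equal (inclusive).
    return n_less + int(rng.integers(0, n_equal + 1))


rng = np.random.default_rng(0)  # a fixed seed makes the tie-breaking reproducible

# No ties: the rank is simply the number of members below truth.
untied_rank = rank_with_random_ties([1.0, 2.0, 3.0], 2.5, rng)

# Two members equal truth: the rank is drawn from {1, 2, 3}.
tied_ranks = [rank_with_random_ties([1.0, 2.0, 2.0, 3.0], 2.0, rng) for _ in range(200)]
```

Without tie-breaking, every tied case would fall into the lowest tied bin, which would skew histograms for variables with many repeated values.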
Methods
__init__([ensemble_dim, num_bins, ...]) – Initializes a RankHistogram.
compute(forecast, truth[, region, skipna]) – Evaluate this metric on datasets with full temporal coverages.
compute_chunk(forecast, truth[, region, skipna]) – Computes one-hot encoding of rank on a chunk of forecast/truth.
Attributes
ensemble_dim