I'm working on defining a Protocol for plotting histograms. To do so, I need some way to access the values and optionally variances so that mplhep and others can decide on what to plot. Here is one possible suggestion:
from typing import Iterable, Optional, Protocol, Tuple, Union

from numpy.typing import ArrayLike  # requires NumPy master


class PlottableAxis(Protocol):
    label: str  # May be removed soon
    edges: ArrayLike

    # For non-categorical axes, this is None
    categories: Union[Iterable[int], Iterable[str], None]


class PlottableHistogram(Protocol):
    # One axis per dimension (variable-length tuple)
    axes: Tuple[PlottableAxis, ...]

    # Returns the array itself, or the values array for specialized accumulators
    def values(self, flow: bool = False) -> ArrayLike: ...

    # Returns the variances if applicable, otherwise None.
    # If counts is None, the variance is NaN for that cell (mean storages)
    def variances(self, flow: bool = False) -> Optional[ArrayLike]: ...
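To make the draft concrete, here is a minimal sketch of a class that would satisfy the Protocol structurally. `SimpleAxis` and `SimpleHistogram` are hypothetical names invented for illustration, not part of boost-histogram or any other library, and the sketch assumes a 1D histogram stored with one underflow and one overflow bin.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class SimpleAxis:
    """Hypothetical regular axis; matches PlottableAxis structurally."""

    edges: np.ndarray
    label: str = ""
    categories: None = None  # non-categorical axis, so always None


class SimpleHistogram:
    """Hypothetical 1D histogram stored with underflow and overflow bins."""

    def __init__(self, edges, contents_with_flow, variances_with_flow=None):
        self.axes: Tuple[SimpleAxis, ...] = (
            SimpleAxis(np.asarray(edges, dtype=float)),
        )
        self._values = np.asarray(contents_with_flow, dtype=float)
        self._variances = (
            None
            if variances_with_flow is None
            else np.asarray(variances_with_flow, dtype=float)
        )

    def values(self, flow: bool = False) -> np.ndarray:
        # Drop the first (underflow) and last (overflow) bins unless requested
        return self._values if flow else self._values[1:-1]

    def variances(self, flow: bool = False) -> Optional[np.ndarray]:
        # None signals "this storage carries no variance information"
        if self._variances is None:
            return None
        return self._variances if flow else self._variances[1:-1]


# Three regular bins plus underflow/overflow, no variances stored
h = SimpleHistogram(edges=[0, 1, 2, 3], contents_with_flow=[1, 4, 9, 16, 2])
```

Because `Protocol` uses structural typing, neither class needs to inherit from `PlottableAxis` or `PlottableHistogram`; a static type checker only requires that the attributes and method signatures line up.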
We can look at labeling in a later draft, but there are a few key points:
- `.values()` is like `view()` for simple storages, and like `.view().value` for complex ones, providing plotting libraries a consistent way to access the central values for plots.
- `.variances()` returns None if a storage does not provide information for variances, allowing a plotting library to use `if (variances := h.variances()) is not None:` to skip plotting error bars, or do something reasonable. (The explicit `is not None` is needed because the truth value of a multi-element array is ambiguous.) If it exists for that storage, it returns an array.
- `.axes[i].categories` is an array of labels if this is a Category axis, otherwise it is None.
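As a rough illustration of the consumer side, the sketch below shows how a plotting library might turn any object following the draft into error-bar inputs. `errorbar_inputs` and `StubHistogram` are hypothetical names used only for this example, and only the 1D case is handled. Note the explicit comparison with None: calling `bool()` on a NumPy array with more than one element raises `ValueError`.

```python
from types import SimpleNamespace

import numpy as np


def errorbar_inputs(h):
    """Return (centers, values, yerr) for a 1D histogram following the draft."""
    edges = np.asarray(h.axes[0].edges, dtype=float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    values = np.asarray(h.values())
    variances = h.variances()
    # Compare with None explicitly; a multi-element array has no truth value
    yerr = None if variances is None else np.sqrt(np.asarray(variances))
    return centers, values, yerr


class StubHistogram:
    """Hypothetical stand-in with two bins, used only to exercise the function."""

    def __init__(self, variances):
        self.axes = (SimpleNamespace(label="x", edges=[0.0, 1.0, 2.0], categories=None),)
        self._variances = variances

    def values(self, flow=False):
        return np.array([4.0, 9.0])

    def variances(self, flow=False):
        return self._variances


centers, values, yerr = errorbar_inputs(StubHistogram(np.array([4.0, 9.0])))
_, _, no_yerr = errorbar_inputs(StubHistogram(None))  # no_yerr is None
```

The resulting `yerr` can be passed straight to something like matplotlib's `errorbar`, where `yerr=None` means no error bars are drawn.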
I didn't add `.counts()`, but maybe that should be included too?
@HDembinski, what do you think?
This assumes the main histogram object itself is the API. You could also have a `__histogram_dict__` that returns something like what you see above. That would be much easier to add to existing libraries like Physt and would not affect or change the public API, but on the flip side, it's duplicated, all or nothing, and doesn't provide API consistency for users. (Double underscores are used here for clarity; you really aren't supposed to add your own double-underscore names, so as an API choice it might not be ideal.)
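A minimal sketch of what the `__histogram_dict__` alternative might look like follows. The dictionary keys (`"axes"`, `"values"`, `"variances"`) are an assumed layout that simply mirrors the draft Protocol, and `LegacyHistogram` and `read_histogram` are hypothetical names; nothing here reflects an existing Physt or boost-histogram API.

```python
import numpy as np


class LegacyHistogram:
    """Hypothetical library class that keeps its own public API (bins, frequencies)."""

    def __init__(self, bins, frequencies):
        self.bins = np.asarray(bins, dtype=float)
        self.frequencies = np.asarray(frequencies, dtype=float)

    def __histogram_dict__(self):
        # Assumed layout: one entry per field of the draft Protocol
        return {
            "axes": [{"label": "", "edges": self.bins, "categories": None}],
            "values": self.frequencies,
            "variances": None,  # this storage has no variance information
        }


def read_histogram(obj):
    """Fetch the plotting dictionary from any object that provides the hook."""
    hook = getattr(obj, "__histogram_dict__", None)
    if hook is None:
        raise TypeError(f"{type(obj).__name__} does not provide __histogram_dict__")
    return hook()


data = read_histogram(LegacyHistogram(bins=[0, 1, 2], frequencies=[3, 5]))
```

This keeps `bins` and `frequencies` as the library's public names, which is the appeal of the approach, but it also shows the duplication: the same information is exposed twice under different spellings.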
PS: This is post 0.10.0, possibly post 1.0, so no rush here, feel free to come up with something better.
PPS: Initially, boost-histogram and Uproot4 would implement this, and Hist would get it for free since it is a boost-histogram.