name/title system #430

@henryiii

I'd like to propose upstreaming @LovelyBuggies's fantastic title/name system from Hist to boost-histogram. There are several reasons for this:

  • It has no dependencies
  • It can take part in the "DRAFT: Plotting (and later fitting) Protocol" issue (#423) by providing a consistent API for uproot and mplhep (and maybe others). mplhep.plothist(uproot.to_boost()) would show titles too, as with Hist, and @jpivarski could probably add the same API directly to TH* objects in Uproot4.
  • It can make dict-indexing, projections, and axes read more clearly if names are used, allowing a high level of clarity as to which axis is being referenced.
  • It is completely optional, and does not affect the current usage at all
  • It would make conversions to/from Hist easier. Currently names and titles get packed into a dict, which then has to be changed when going between Hist and boost-histogram (especially for uproot, since title and name have to be handled during the conversion).
  • We can optimize the storage internally, just keeping the strings directly in the metadata type, rather than storing python dicts and python strings

Quick example of usage:

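import boost_histogram as bh
import mplhep
import numpy as np

# Proposed API: axes take an optional name.
h = bh.Histogram(
    bh.axis.Regular(10, -1, 1, name="x"),
    bh.axis.Regular(20, -2, 2, name="y"),
)
# Fill by axis name instead of by position.
h.fill(x=np.random.normal(size=1_000_000), y=np.random.normal(size=1_000_000))

# Project onto an axis by name.
h.project("y")

# Index by name: sum over x, rebin y by a factor of 2.
h[{"x": sum, "y": bh.rebin(2)}]

# Labels are picked up by plotting libraries.
h.axes["x"].label = "x [μm]"
h.axes["y"].label = "y [cm]"
mplhep.hist2dplot(h)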

The proposed design, open for discussion:

  • Axes gain a new name parameter.
    • This probably should be a read-only parameter (which is not how Hist currently works)
    • When histograms are created, they make sure all names are either empty or unique.
    • Names are used anywhere axis numbers can be used.
      • Empty-name axes must be accessed by number.
  • Axes gain a new label parameter
    • Will override the name for plotting libraries, so titles can be attached to the histogram rather than added afterwards in the plot.
    • If the label is not set, the name is used if available.
  • The metadata slot in boost-histogram's C++ becomes a struct, with two strings and an arbitrary Python object.
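To make the rules above concrete, here is a minimal pure-Python sketch of how they could fit together. It is not boost-histogram or Hist code: AxisInfo, check_names, resolve_axis, and display_label are hypothetical names invented for illustration, and the read-only behaviour of name is not modelled.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Union


@dataclass
class AxisInfo:
    """Hypothetical stand-in for the proposed metadata struct."""

    name: str = ""        # proposed: empty, or unique within a histogram
    label: str = ""       # proposed: overrides the name when plotting
    metadata: Any = None  # arbitrary Python object

    @property
    def display_label(self) -> str:
        # Fall back to the name when no label has been set.
        return self.label or self.name


def check_names(axes: Sequence[AxisInfo]) -> None:
    """Raise ValueError if two axes share a non-empty name."""
    seen = set()
    for ax in axes:
        if not ax.name:
            continue  # unnamed axes are allowed
        if ax.name in seen:
            raise ValueError(f"duplicate axis name: {ax.name!r}")
        seen.add(ax.name)


def resolve_axis(axes: Sequence[AxisInfo], key: Union[int, str]) -> int:
    """Map an axis number or an axis name to a non-negative axis number."""
    if isinstance(key, int):
        if not -len(axes) <= key < len(axes):
            raise IndexError(f"axis {key} out of range")
        return key % len(axes)
    if key == "":
        raise KeyError("empty-name axes must be accessed by number")
    for i, ax in enumerate(axes):
        if ax.name == key:
            return i
    raise KeyError(key)


axes = [AxisInfo(name="x", label="x [um]"), AxisInfo(name="y"), AxisInfo()]
check_names(axes)                # passes: names are empty or unique
y_index = resolve_axis(axes, "y")  # a name works wherever a number would
```

In this sketch, anything that accepts an axis number would call resolve_axis first, so unnamed axes stay reachable by number and existing integer-based code is unaffected.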

@HDembinski, what do you think?
