Option to skip workspace validation #1699

@alexander-held

Summary

Workspace validation is currently not skippable:

pyhf/src/pyhf/workspace.py

Lines 289 to 298 in 767ed59

def __init__(self, spec, **config_kwargs):
    """Workspaces hold the model, data and measurements."""
    spec = copy.deepcopy(spec)
    super().__init__(spec, channels=spec['channels'])
    self.schema = config_kwargs.pop('schema', 'workspace.json')
    self.version = config_kwargs.pop('version', spec.get('version', None))
    # run jsonschema validation of input specification against the (provided) schema
    log.info(f"Validating spec against schema: {self.schema}")
    utils.validate(self, self.schema, version=self.version)

In some scenarios the validation takes a significant amount of time, so an option to skip it would be convenient.

Additional Information

This uses NormalMeasurement_combined.txt from #1695. While working with the gist used in #1695 (comment), I noticed that workspace validation can take a long time for this example. Below is a minimal reproducer:

import pyhf
import json

with open("NormalMeasurement_combined.txt") as f:
    ws = pyhf.Workspace(json.load(f))

This runs in around 5.4 seconds for me locally.

When commenting out

utils.validate(self, self.schema, version=self.version)

this time goes down to 1.6 seconds.
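
For anyone who wants to reproduce the comparison, timings like the ones above could be collected with a small helper along these lines. This is only a sketch: `time_call` is a hypothetical name, not part of pyhf, and the workload is a stand-in so that the snippet runs without pyhf or the input file.

```python
import time


def time_call(func, *args, **kwargs):
    """Call func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; replace with the call that should be timed.
result, elapsed = time_call(sum, range(1000))
print(f"result={result}, elapsed={elapsed:.6f} s")
```

With pyhf installed and the spec loaded via json.load, time_call(pyhf.Workspace, spec) would time the workspace construction in the same way.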

The workspace validation takes about as long as the model construction (via ws.model()), which for me takes about 4 seconds by itself (again dominated by validation time in this example). It is possible to skip model validation (via validate=False), and I believe it would be useful to allow the same for the workspace.
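
A minimal sketch of what the requested behavior could look like, assuming a boolean `validate` keyword that defaults to True. `SketchWorkspace` and `validate_spec` are hypothetical stand-ins for illustration only; they are not pyhf.Workspace or the pyhf schema validator.

```python
import copy


def validate_spec(spec):
    """Stand-in for schema validation (hypothetical, not pyhf's validator)."""
    if "channels" not in spec:
        raise ValueError("spec is missing 'channels'")


class SketchWorkspace:
    """Toy workspace used only to illustrate an opt-out flag."""

    def __init__(self, spec, validate=True):
        self.spec = copy.deepcopy(spec)
        self.validated = False
        # Validation stays on by default; callers opt out explicitly.
        if validate:
            validate_spec(self.spec)
            self.validated = True


ws_checked = SketchWorkspace({"channels": []})
ws_unchecked = SketchWorkspace({}, validate=False)
```

Keeping the default at True would preserve the current behavior for existing users, while validate=False would mirror the option that already exists for model construction.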

I also noticed that arbitrary kwargs can be passed to workspace construction, e.g. in the above example

ws = pyhf.Workspace(json.load(f), abcdefg=False)

also runs fine. It may be convenient to catch these kwargs to help users identify typos.
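
One way this could be handled, sketched under the assumption that `schema` and `version` (the two keys popped in the constructor quoted above) are the only supported options. `StrictWorkspace` is a hypothetical class for illustration, not the pyhf implementation.

```python
class StrictWorkspace:
    """Toy constructor that rejects unrecognized keyword arguments."""

    # Assumed set of supported options, taken from the constructor quoted above.
    _ALLOWED_KWARGS = {"schema", "version"}

    def __init__(self, spec, **config_kwargs):
        unknown = set(config_kwargs) - self._ALLOWED_KWARGS
        if unknown:
            raise TypeError(
                f"unexpected keyword argument(s): {sorted(unknown)}"
            )
        self.schema = config_kwargs.pop("schema", "workspace.json")
        self.version = config_kwargs.pop("version", spec.get("version", None))


ws = StrictWorkspace({"version": "1.0.0"})
```

With this check, a call such as StrictWorkspace(spec, abcdefg=False) would raise a TypeError instead of silently ignoring the typo.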

Labels: API (changes the public API), feat/enhancement (new feature or request), user request (request coming from a pyhf user)
