Skip to content

ENH: Enable validation during sphinx-build process #302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Feb 8, 2021

Conversation

rossbar
Copy link
Contributor

@rossbar rossbar commented Nov 4, 2020

The numpydoc.validate module provides a lot of functionality for validating docstrings. Previously, this validation was intended to be used as a separate module and was not integrated into the sphinx-build process.

This PR is a first attempt at hooking the validation checks from numpydoc.validate into sphinx-build. Two new configuration options are added:

  • numpydoc_validate : bool - toggles validation on/off (off by default)
  • numpydoc_validation_checks : set - A set of keys from validate that specifies which validation checks to run (empty set by default, meaning no warnings are reported during the sphinx build process).

With this PR, the validation checks are run at the mangle_docstrings step and any failed validation checks are reported as warnings via sphinx logging. I've modified the conf.py in doc/ to use this feature with all of the validation checks turned on, so the easiest way to see how this looks: cd doc/ && make html. You'll notice additional warnings raised from validation in the build log.

With the default configuration settings, users should see no change in behavior and any performance differences should be negligible.

@rossbar
Copy link
Contributor Author

rossbar commented Nov 4, 2020

Note the CircleCI build failures are expected - I intentionally activated the validation with all of the checks for illustration purposes.

The doc build could be fixed by modifying doc/conf.py setting either:

  • numpydoc_validate = False (consistent with the configuration prior to the addition of this feature)
  • numpydoc_validation_checks = set(ERROR_MSGS.keys()) - set(['GL01', 'SA04', 'RT03']) (leaves the validation activated, but ignores the three checks that currently warn.

@larsoner
Copy link
Collaborator

larsoner commented Nov 4, 2020

This seems useful to me. +1 for the set() approach so we at least don't get worse in our own docs (without explicitly changing the set to allow it).

Running the test suite on this patch demonstrates that refactoring
the boundary between NumpyDocString and SphinxDocString provides the
necessary info to (potentially) do away with the validate.Docstring
class.
Add a conf option to turn on/off.

TODO: test
logger.warn is apparently deprecated
Adds a config option with a set to allow users to select
which validation checks are used. Default is an empty set,
which means none of the validation checks raise warnings
during the build process.

Add documentation for new option and activate in the doc build.
@rossbar rossbar force-pushed the npdoc-validation-sphinx-build branch from e1d0631 to 39f2f22 Compare January 29, 2021 18:12
@rossbar
Copy link
Contributor Author

rossbar commented Jan 29, 2021

I've rebased on master to resolve conflicts and replaced the conf.py for the docs with a subset of the available validation checks (e.g. excluding grammatical-type checks). The selection of which validation checks to include is obviously subjective, but I tried to come up with a subset that I think users might be interested in (and aren't overly stringent).

Copy link
Collaborator

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it on some code and the output is:

...
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('PR01', "Parameters {'extra'} not documented")
WARNING: ('PR02', "Unknown parameters {'extra: string'}")
WARNING: ('PR10', 'Parameter "extra" requires a space before the colon separating the parameter name and type')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')
WARNING: ('SA01', 'See Also section not found')
WARNING: ('EX01', 'No examples section found')

Can ideally the fully qualified name but at least the function/class name + method be in there so that it's easier to track down the culprits?

More flexibility in configuring which validation checks to run during
sphinx build. If 'all' is present, treat the rest of the set as a
blocklist, else an allowlist.
Copy link
Collaborator

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this on the MNE-Python repo and it's very close to usable. The one thing I think that would make it totally usable would be some way to specify what functions/methods/whatever to process and which ones not to. In my use case it's because I have (among other things) a whole bunch of classes that inherit from dict, and so things like <class>.popitem keep showing up as problematic.

A potentially easy way to get around this is to have a numpydoc_validate_exclude (or maybe include?) which is a regex of names to exclude from (or include in) validation / error reporting. I think I could make the MNE-Python example work by adding a dozen or so bad names, and then actually be able to use this.

I don't want to over-fit one particular use case (MNE's) for this, though, so if you think this is overkill then no need to implement right off the bat.

Have you tried using this on any repo(s) you maintain or contribute to?

@rossbar
Copy link
Contributor Author

rossbar commented Feb 1, 2021

Have you tried using this on any repo(s) you maintain or contribute to?

I need to do so with the latest iteration - my focus up to this point has largely been on how to tamp down the warnings from checks that are overly-pedantic for my use-case, so I'm sure there are other cases like the one you mention that I've just failed to notice. I will test with the numpy & networkx docs and see what crops up.

@larsoner
Copy link
Collaborator

larsoner commented Feb 1, 2021

Sounds good, let me know what you think could work for include/exclude pattern from working on those.

Modify updated config name to avoid sphinx warning.

Add documentation for exclusion config value.
@rossbar rossbar force-pushed the npdoc-validation-sphinx-build branch from c5111bf to 87dd52f Compare February 4, 2021 01:30
@rossbar
Copy link
Contributor Author

rossbar commented Feb 4, 2021

I've made an attempt at a feature to exclude patterns from the docstring validation. 87dd52f adds a numpydoc_validation_exclude configuration parameter that should be a container containing strings with patterns to exclude. Maybe it'd be better to have users supply a regex directly rather than construct one from re strings in a container? I wasn't sure what was best so please LMK if you think it would be easier/better to do it another way.

I hope this captures the feature you had in mind, but also LMK if I'm way off the mark.

Copy link
Collaborator

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be better to have users supply a regex directly rather than construct one from re strings in a container?

People who don't know much regex will have a much easier time, so I like your approach. And by way of (my) example, I'd bet that this is more readable than the full regex equivalent for most Python coders (it was certainly easier for me to construct):

numpydoc_validation_exclude = (  # list of regex
    # dict subclasses
    r'\.clear', r'\.get$', r'\.copy$', r'\.fromkeys', r'\.items', r'\.keys',
    r'\.pop', r'\.popitem', r'\.setdefault', r'\.update', r'\.values',
    # not always our cleanest
    r'\.__getitem__', r'\.__contains__', r'\.__hash__', r'\.__mul__',
    r'\.__sub__', r'\.__add__', r'\.__iter__',
)

Other than a couple last minor ideas LGTM +1 for merge. I've already found several issues with MNE-Python with it, thanks!

doc/install.rst Outdated
Comment on lines 99 to 102
numpydoc_validate : bool
Whether or not to run docstring validation during the sphinx-build process.
Default is ``False``.
numpydoc_validation_checks : set
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last API idea -- do we even need numpydoc_validate anymore? Isn't False redundant with numpydoc_validation_checks = set() which could be the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right... I'll take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think numpydoc_validate was redundant and it was relatively straightforward to remove it. Just make sure that the changes in 55de340 (especially the docs) make sense.

Copy link
Collaborator

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1 for merge.

I'll leave this open over the weekend in case someone else wants to look then I'll try to merge Monday. Thanks in advance @rossbar !

@rossbar
Copy link
Contributor Author

rossbar commented Feb 5, 2021

Thanks for all the feedback and review!

@larsoner larsoner merged commit ce6666a into numpy:master Feb 8, 2021
@larsoner
Copy link
Collaborator

larsoner commented Feb 8, 2021

Thanks @rossbar !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants