numpydoc.validate fails on the reference docstring example #242

rth · 2019-11-03T16:59:58Z

The recently added numpydoc.validate functionality fails on the reference docstring in doc/example.py.

>>> from numpydoc.validate import validate
>>> validate("doc.example.foo")
{'type': 'function',
 [...]
 'deprecated': False,
 'file': '/home/rth/src/numpydoc/doc/example.py',
 'file_line': 37,
 'errors': [
   ('GL01',
    'Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)'),
  ('GL02',
   'Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)'),
  ('GL03',
   'Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings'),
  ('SS06', 'Summary should fit in a single line'),
  ('RT03', 'Return value has no description'),
  ('SA02',
   'Missing period at end of description for See Also "numpy.array" reference'),
  ('SA03',
   'Description should be capitalized for See Also "numpy.array" reference'),
  ('SA04', 'Missing description for See Also "numpy.dot" reference'),
  ('SA04', 'Missing description for See Also "numpy.linalg.norm" reference'),
  ('SA04', 'Missing description for See Also "numpy.eye" reference')]}

It seems the validator contradicts some of the docstring formatting rules (e.g. #241).

cc @datapythonista @jnothman @larsoner

rth · 2019-11-03T18:21:38Z

After #243, the following checks still fail:

GL01 Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
RT03 Return value has no description
SA04 Missing description for See Also "numpy.dot" reference
SA04 Missing description for See Also "numpy.linalg.norm" reference
SA04 Missing description for See Also "numpy.eye" reference

I think it would make sense to disable GL01 and make the others optional (or maybe address RT03 in the example).

datapythonista · 2019-11-04T00:53:22Z

I guess we got GL01 wrong. As for the rest, rather than contradicting the standard, we just wanted to be stricter.

Since validate just returns the list and doesn't raise exceptions or return a non-zero exit code, I'd say the caller should be the one to ignore the errors it doesn't want enforced. I think that will make the code simpler.

I'm OK with having python -m numpydoc --validate --strict check the ones that are not part of the standard. My preference would be to change the standard, but I guess there may not be agreement.

rth · 2019-11-04T09:02:02Z

You're right, the question is not about contradicting but about being stricter.

For large OSS projects, it definitely makes sense to use the strict rules. However, in other use cases (e.g. smaller projects) having it less strict could be useful. For instance, I don't necessarily want to add an example for each function in my side project, and the docstring linter shouldn't fail because of it in that case. Of course one could skip those on the user side, but since the same ones happen repeatedly, maybe we could do something about it in numpydoc.

larsoner · 2019-11-04T11:59:02Z

I think users of the function should just cull the list afterward. It's flexible and pretty easy.
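As a rough illustration of culling the list in the caller, here is a minimal sketch. It assumes only what the output at the top of this issue shows: the report is a dict whose 'errors' entry is a list of (code, message) tuples. The helper name cull_errors and the hand-written sample report are hypothetical, not part of numpydoc.

```python
def cull_errors(report, ignored_codes):
    """Return a copy of ``report`` without the errors whose code is ignored.

    ``report`` is assumed to look like the dict shown earlier in this issue:
    it has an "errors" key holding a list of (code, message) tuples.
    """
    ignored = set(ignored_codes)
    kept = [(code, msg) for code, msg in report["errors"] if code not in ignored]
    # Build a new dict so the original report is left untouched.
    return {**report, "errors": kept}


# Hand-written sample in the shape of the validate() output quoted above.
report = {
    "type": "function",
    "deprecated": False,
    "errors": [
        ("GL01", "Docstring text (summary) should start in the line "
                 "immediately after the opening quotes"),
        ("RT03", "Return value has no description"),
        ("SA04", 'Missing description for See Also "numpy.dot" reference'),
    ],
}

culled = cull_errors(report, {"GL01", "SA04"})
print(culled["errors"])
```

With this approach the validating function stays unchanged, and each project decides which codes it cares about.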

For the command line let's discuss in #240. But something like --ignore=GL01,RT01 is a simple enough interface.
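To make the --ignore idea concrete, here is a small argparse sketch. It is not the numpydoc command line, whose interface is still to be discussed in #240. The program name, the parse_codes and exit_status helpers, and the exit-code convention (1 when any non-ignored error remains) are all assumptions made for illustration.

```python
import argparse


def parse_codes(value):
    """Turn a comma-separated string such as "GL01,RT01" into a set of codes."""
    return {code.strip() for code in value.split(",") if code.strip()}


def build_parser():
    # Hypothetical parser; the flag name comes from the suggestion above.
    parser = argparse.ArgumentParser(prog="validate-sketch")
    parser.add_argument(
        "--ignore",
        type=parse_codes,
        default=set(),
        help="comma-separated error codes to ignore, e.g. GL01,RT01",
    )
    return parser


def exit_status(errors, ignored):
    """Return 1 if any (code, message) error is not ignored, otherwise 0."""
    remaining = [code for code, _ in errors if code not in ignored]
    return 1 if remaining else 0


args = build_parser().parse_args(["--ignore=GL01,RT01"])
errors = [("GL01", "summary placement"), ("SA04", "missing description")]
print(sorted(args.ignore), exit_status(errors, args.ignore))
```

The filtering still happens outside the validation logic, so the command line is just another caller that culls the list.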

datapythonista · 2019-11-04T18:36:59Z

I think we all agree.

Being able to ignore errors is something we surely want. Even when being strict, fixing all errors will take time, and we'll want to ignore some while they are being fixed.

My only point is that I'd prefer to keep the validating function as is, without that, since it already has a decent amount of complexity, and to implement the ignoring in the caller. If there were a speed gain it could make more sense, but I think validating everything and then ignoring the errors we don't care about will take around the same time as ignoring them beforehand.

rth mentioned this issue Nov 3, 2019

STY Minor style improvements to doc/example.py to pass validation #243

Merged

rth mentioned this issue Nov 3, 2019

DOC Docstring validation in Pipeline estimator scikit-learn/scikit-learn#15444

Merged
