Incorrect test coverage on CPython 3.10b3 #1184

skirpichev · 2021-07-01T08:37:56Z

I've tested the Diofant on CPython 3.10b3 here.

Good news: #198 seems to be fixed. Bad news: coverage is wrong.

For instance: the alternative to "retract" branch seems to be uncovered. In fact, this line was covered by this test, for example.

There are more coverage regressions c.f. CPython 3.9 (see the codecov statistics or the build artifact for this PR), I'm not sure they can be reduced to the above example.

(This was tested for codecov v2.1.11.)

nedbat · 2021-07-01T12:17:45Z

Can you give more details? For example, direct links to coverage results that are wrong? Even better would be steps to reproduce the problem.

skirpichev · 2021-07-01T13:13:52Z

For example, direct links to coverage results that are wrong?

Sorry, I expected this is easy to find in the pr. For instance, coverage results on the codecov are here. (Also, they are in the CI artifact.zip.) Good example, mentioned above, is the coverage regression of polytools.py.

Unfortunately, I can't (yet) provide a simpler example.

nedbat · 2021-07-01T20:36:22Z

Thanks for the links, but I think you are still overestimating my ability to find the specifics you are talking about. Can you include the relevant lines in a comment here? Specific lines of code, with specific missing coverage, if you could.

skirpichev · 2021-07-02T01:53:59Z

Sorry, I though I did this in the PR description. I mentioned these lines: retract=False alternative seems to be uncovered, while it should, e.g. by this test. Here is coverage status. For my surprise, similar block in the div method above - seems to be covered.

skirpichev · 2021-07-03T04:55:54Z

FYI: I was able to reproduce this on the dev version of CPython 3.10 as well and on the v3.10b1.

Another strange coverage regression is on line 322 of the gruntz.py. It claims there is no jump to line 324. Unfortunately codecov.io recently have problems with displaying coverage differences for prs. As I said before, if you can't see the coverage differences for pr on codecov - you can use coverage output from the build artifact.

nedbat · 2021-07-03T17:35:32Z

Is it possible for me to run the test suite locally? I don't see instructions for how to do that, or a tox.ini. If I can run it myself, we might be able to get to the bottom of the problem.

skirpichev · 2021-07-04T02:54:04Z

I don't see instructions for how to do that

Install the package from git. Testing instructions are in the dev guide.

But in this case you shouldn't run the whole test suite with pytest. The Poly.rem() method should be covered by tests in the diofant/tests/polys/test_polytools.py:

pytest --cov diofant diofant/tests/polys/test_polytools.py

The gruntz.py could be tested with diofant/tests/series/test_gruntz.py:

pytest --cov diofant -m 'not slow' diofant/tests/series/test_gruntz.py

we might be able to get to the bottom of the problem.

I hope so. My 2c: I was able to restore coverage of the retract=False case in Poly.rem(). To check that you could add "else: pass" clause for try block, right after line 908.

skirpichev · 2021-07-04T03:17:21Z

I think, I've a simple example for the first case (polytools.py):

$ pip install pytest-cov
[...]
$ cat test.py
def foo(x):
    if x:
        try:
            1/(x - 1)
        except ZeroDivisionError:
            pass

    return x


def test_foo():
    for i in range(3):
        foo(i)
$ pytest -q --cov test test.py --cov-branch
.                                                                                                   [100%]

----------- coverage: platform linux, python 3.10.0-beta-1 -----------
Name      Stmts   Miss Branch BrPart  Cover
-------------------------------------------
test.py      10      0      4      1    93%
-------------------------------------------
TOTAL        10      0      4      1    93%

1 passed in 0.07s

Again, to "fix" coverage of the "if" statement - you can add "else: pass" clause for the "try" block OR remove the "return" statement.

skirpichev · 2021-07-04T04:36:54Z

For the gruntz.py regression, you can "fix" coverage with the "pass" statement, added to the "with" block after line 322.

nedbat · 2021-07-05T22:30:03Z

Thanks, your short example shows the problem well. I've written a bug against CPython here: https://bugs.python.org/issue44570

skirpichev · 2021-07-06T02:59:47Z

Unfortunately, I've no glue on how to simplify example in the gruntz.py module. Here is another one: DummyWrapper._generate_code(). On another hand, this with statement shown as fully covered.

Note: this test fails on 3.10.0b3, the current 3.10 version in the CI tests.

nedbat · 2021-07-08T20:21:06Z

The latest CPython branch fixes the rem() problem. The gruntz problem is probably a coverage bug that isn't properly keeping up with a 3.10 change. I'll look into that.

nedbat · 2021-07-08T21:49:47Z

Oh, good news, it looks like coverage.py has already fixed the gruntz issue. Try using master of coverage.py.

skirpichev · 2021-07-09T00:19:38Z

I confirm, this was fixed too. Latest release works.

I'll test b4 on the github, once the image will be available - maybe not all coverage regressions are fixed. But so far, all mentioned
problems are resolved. Thank you.

skirpichev · 2021-07-12T08:59:30Z

Unfortunately, gruntz.py problem still present in the latest beta4. And there are other coverage regressions, perhaps related.

You can see coverage statistics in the build artifact or on the codecov site.

nedbat · 2021-07-12T09:36:53Z

@skirpichev I appreciate your help with this. Can you please post the relevant chunks of code here, with the specifics of how the coverage is wrong?

skirpichev · 2021-07-12T09:44:57Z

I already did this. Unfortunately, my claim that this problem was fixed is wrong. Either beta4 include some another regression or the code coverage for the line 322 is flaky (i.e. only sometimes this line marked as covered). In both cases, however, it's a bug. I'll try to debug more.

nedbat · 2021-07-12T12:56:21Z

@skirpichev you agreed with me that the problem was fixed. But now it seems not to be? Can you show the coverage that is wrong? You've seen it, you know where it is. Why make me go digging for it also?

skirpichev · 2021-07-12T14:23:31Z

But now it seems not to be?

Yes, apparently I was wrong.

Can you show the coverage that is wrong?

Same as I reported before: line 322 was marked as partially covered (no jump to 324). You can see coverage statistics in the build artifact. Unfortunately, codecov.io statistics seems to be unavailable right now.

My guess, the problem has a random nature. I'll try to reproduce this locally. This file should be 100% covered by

pytest --cov diofant -m 'not slow' diofant/tests/series/test_gruntz.py

nedbat · 2021-07-12T23:17:02Z

Are you sure you are using the latest master of coverage.py? It's not a released version. I did this:

pip uninstall -y coverage
pip install git+https://github.com/nedbat/coveragepy@809cccb6

Then I ran test_gruntz.py again, and it was 100% covered, using Python 3.10.0b4.

skirpichev · 2021-07-13T06:49:12Z

Hmm, indeed - the git version seems to be working. Previously, I've tested successfully on the last release. Maybe I miss some installation problem.

If you have no idea which commit does fix the problem - I'll try to bisect. (Maybe there are no regression test.)

BTW, there are a lot of regressions at the 809cccb version, you can see them here. Today codecov.io seems to be working and you can see all regressions right on the site. (Please ignore the plotting module, however.) Not sure how to proceed: should I close this issue or we can continue tracking the problems here? Maybe we should escalate one to the CPython bugtracker as a whole?

Just for example, here are two regressions: in the residue_ntheory.py and dpll2.py. Second one looks like #1187.

nedbat · 2021-07-13T11:08:00Z

BTW, there are a lot of regressions at the 809cccb version

I don't know a different way to ask this: if you have a specific issue, please please please post the actual code here (not a link), with the actual coverage results (not a link). It is difficult to dig into your test results to understand what you are talking about, and depends on codecov working properly. I tried your "you can see them here" link, and got a cute failure page at codecov.

I would like to find out about things that don't work properly, but my time is limited, and I don't have the same context about your project that you do.

Opening a new issue for current problems would be most helpful. Thanks, I really do appreciate the bug reports, I just want to make them more actionable.

skirpichev · 2021-07-13T12:06:50Z

depends on codecov working properly

That can't be in above examples. You can check coveragepy status for mentioned files in the build log. The ntheory and logic modules should be 100% covered.

got a cute failure page at codecov.

Maybe it's an authentification failure?

BTW, both files are in the coverage report, that has the build artifact. It's easy to see coverage differences: both files were 100% covered.

my time is limited

(You may expect that others also have time constraints.)

Opening a new issue for current problems would be most helpful.

The problem is that there is a lot of regressions. I suspect, they will be unfixed in 3.10 if you will just ignore failure reports.

Maybe you could suggest better ways to illustrate coverage differences (than codecov)?

nedbat · 2021-07-13T12:32:52Z

Maybe it's an authentification failure?

The codecov page said authentication failure and also API rate limit. I didn't know which to believe. It doesn't matter, the link was useless to me.

(You may expect that others also have time constraints.)

Yes, but how much time are we both spending going back and forth here? When you look at codecov and see a problem, copy it and paste it here. It will save us both time in the long run.

The problem is that there is a lot of regressions. I suspect, they will be unfixed in 3.10 if you will just ignore failure reports.

This sounds like you are accusing me of ignoring failure reports? I must be misunderstanding you. I am trying very hard to get your failure reports. You've said "a lot of regressions" twice now, but I don't know what they are.

Maybe you could suggest better ways to illustrate coverage differences (than codecov)?

The best way to illustrate coverage differences is to paste the code into the issue, and also paste part of the coverage report into the issue. It can be a screenshot of codecov if you like. For example, look at #1176. Everything is right there. You don't need a minimal reproducer like they have, but the more information you can put directly into the issue, the less we rely on other services, or build artifacts being kept forever, or me being able to puzzle through your CI, or your repo staying in a consistent state, and so on and so on.

skirpichev · 2021-07-13T13:49:40Z

On Tue, Jul 13, 2021 at 05:33:03AM -0700, Ned Batchelder wrote: The best way to illustrate coverage differences is to paste the code into the issue, and also paste part of the coverage report into the issue.

Well, there is ~30 regressions (in the present state)... But I'll try. I hope, #1188 is better on your view. This documents one of mentioned issues and, perhaps, related.

skirpichev added the bug Something isn't working label Jul 1, 2021

nedbat added a commit that referenced this issue Jul 8, 2021

test: add a test for #1184.

3524d56

nedbat mentioned this issue Jul 8, 2021

Nedbat/bug1184 #1185

Merged

nedbat added a commit that referenced this issue Jul 8, 2021

test: add a test for #1184.

809cccb

Note: this test fails on 3.10.0b3, the current 3.10 version in the CI tests.

skirpichev closed this as completed Jul 9, 2021

warsaw mentioned this issue Jul 11, 2021

Coverage misses with Python 3.10 #1187

Closed

skirpichev reopened this Jul 12, 2021

nedbat closed this as completed Jul 13, 2021

nedbat added cpython Reported as a bug in CPython fixed labels Jul 14, 2021

Uh oh!

Incorrect test coverage on CPython 3.10b3 #1184

Incorrect test coverage on CPython 3.10b3 #1184

Comments

skirpichev commented Jul 1, 2021

nedbat commented Jul 1, 2021

Uh oh!

skirpichev commented Jul 1, 2021

Uh oh!

nedbat commented Jul 1, 2021

Uh oh!

skirpichev commented Jul 2, 2021

Uh oh!

skirpichev commented Jul 3, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nedbat commented Jul 3, 2021

Uh oh!

skirpichev commented Jul 4, 2021

Uh oh!

skirpichev commented Jul 4, 2021

Uh oh!

skirpichev commented Jul 4, 2021

Uh oh!

nedbat commented Jul 5, 2021

Uh oh!

skirpichev commented Jul 6, 2021

Uh oh!

nedbat commented Jul 8, 2021

Uh oh!

nedbat commented Jul 8, 2021

Uh oh!

skirpichev commented Jul 9, 2021

Uh oh!

skirpichev commented Jul 12, 2021

Uh oh!

nedbat commented Jul 12, 2021

Uh oh!

skirpichev commented Jul 12, 2021

Uh oh!

nedbat commented Jul 12, 2021

Uh oh!

skirpichev commented Jul 12, 2021

Uh oh!

nedbat commented Jul 12, 2021

Uh oh!

skirpichev commented Jul 13, 2021

Uh oh!

nedbat commented Jul 13, 2021

Uh oh!

skirpichev commented Jul 13, 2021

Uh oh!

nedbat commented Jul 13, 2021

Uh oh!

skirpichev commented Jul 13, 2021 via email

Uh oh!

skirpichev commented Jul 3, 2021 •

edited

Loading