Bizarre quirk with how Hypothesis formats the error message for falsifying examples #4183

@tmaxwell-anthropic

Consider this test:

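import hypothesis
from hypothesis import strategies as st

counter = 0

@hypothesis.given(dummy=st.text())
def test_that_prints_rich_error(dummy: str):
    global counter
    counter += 1
    # Fail on every fifth call, independent of the generated input.
    if counter % 5 == 0:
        raise ExceptionGroup(
            "outer exceptiongroup", [ZeroDivisionError("this is a fake error")]
        )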

It will fail nondeterministically, of course. Hypothesis reports a rich error message describing the failure:

...
  |   File "/opt/homebrew/Caskroom/miniforge/base/envs/py311/lib/python3.11/site-packages/hypothesis/core.py", line 1722, in wrapped_test
  |     raise the_error_hypothesis_found
  | hypothesis.errors.FlakyFailure: Hypothesis test_that_prints_rich_error(dummy='\x02\x1d\U000c4bc0') produces unreliable results: Falsified on the first call but did not on a subsequent one (1 sub-exception)
  | Falsifying example: test_that_prints_rich_error(
  |     dummy='\x02\x1d\U000c4bc0',
  | )
  | Failed to reproduce exception. Expected:
  | + Exception Group Traceback (most recent call last):
  |   |   File "/Users/tmaxwell/code/anthropic/belt/tests/test_hypothesis_utils.py", line 399, in test_that_prints_rich_error
  |   |     raise ExceptionGroup(
  |   | ExceptionGroup: outer exceptiongroup (1 sub-exception)
  |   +-+---------------- 1 ----------------
  |     | ZeroDivisionError: this is a fake error
  |     +------------------------------------
  |
  | Explanation:
  |     These lines were always and only run by failing examples:
  |         /Users/tmaxwell/code/anthropic/belt/tests/test_hypothesis_utils.py:399
  +-+---------------- 1 ----------------
    | Exception Group Traceback (most recent call last):
    ...

Now consider this one:

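import hypothesis
from hypothesis import strategies as st

counter = 0

@hypothesis.given(dummy=st.text())
def test_that_does_not_print_rich_error(dummy: str):
    global counter
    counter += 1
    # Fail on every fifth call, independent of the generated input.
    if counter % 5 == 0:
        try:
            raise ZeroDivisionError("this is a fake error")
        except* Exception as e:  # except* requires Python 3.11+
            raise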

This also fails nondeterministically, but with a much less helpful message:

...
  |   File "/opt/homebrew/Caskroom/miniforge/base/envs/py311/lib/python3.11/site-packages/hypothesis/core.py", line 1722, in wrapped_test
  |     raise the_error_hypothesis_found
  | ExceptionGroup:  (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/Users/tmaxwell/code/anthropic/belt/tests/test_hypothesis_utils.py", line 389, in test_that_does_not_print_rich_error
    |     raise ZeroDivisionError("this is a fake error")
    | ZeroDivisionError: this is a fake error
    +------------------------------------

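What is the difference between these two tests that makes Hypothesis drop the FlakyFailure wrapper, the falsifying example, and the explanation in the second case? I feel like I'm going insane.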

This is with Hypothesis v6.112.1.

Labels: bug (something is clearly wrong here), legibility (make errors helpful and Hypothesis grokable)
