DumpError: strings must be utf-8 decodable #843

@jaraco

In pypa/distutils#183, I've stumbled onto an issue with xdist. There, distutils/setuptools are moving from a stdout-based logging system to the Python logging framework. As a result, the quiet() context, which suppresses writes to sys.stdout, no longer suppresses logs.

One of the things that setuptools is logging is a filename containing surrogate escapes. It logs the name so the user can identify which filename was failing.

With pytest-xdist, however, the test fails with an INTERNALERROR in gateway_base.

A minimal test is to run pytest -n auto on the following:

def test_log_non_utf8():
    __import__('logging').getLogger().warning(
        'föö'.encode('latin-1').decode('utf-8', errors='surrogateescape')
    )
    __import__('pytest').fail('failing to emit messages')

That test seems legitimate and captures behavior that provides value to the user. The issue doesn't occur without pytest-xdist.
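To illustrate why the logged string is awkward for any serializer that encodes strictly as UTF-8, here is a minimal sketch. It only uses the standard library and does not reproduce the code path in gateway_base; the variable names `strict_ok` and `original_bytes` are mine, chosen for illustration.

```python
# The same string the test logs: each undecodable byte 0xF6 becomes
# a lone surrogate (U+DCF6) under the surrogateescape error handler.
text = 'föö'.encode('latin-1').decode('utf-8', errors='surrogateescape')

# A strict UTF-8 encode rejects lone surrogates.
try:
    text.encode('utf-8')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The matching error handler recovers the original bytes exactly.
original_bytes = text.encode('utf-8', errors='surrogateescape')

print(strict_ok, original_bytes)
```

The string is a perfectly valid Python str; it only fails when something insists on a strict UTF-8 encoding.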

My instinct is that pytest-xdist shouldn't be putting constraints on the allowed outputs for logging. Can something be done to be more lenient about legitimate non-encodable values being passed? If the encoding is only an internal implementation detail between worker and supervisor, it should serialize strings so that they deserialize with fidelity to the original.
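As a rough sketch of what a lossless round trip could look like, assuming the wire format is free to choose its own error handler: the helpers `dump_str` and `load_str` below are hypothetical names, not part of pytest-xdist or execnet. They use the standard `surrogatepass` handler, which lets lone surrogates through on both the encode and decode sides.

```python
def dump_str(value: str) -> bytes:
    """Hypothetical helper: encode any str, including lone surrogates."""
    return value.encode('utf-8', errors='surrogatepass')


def load_str(payload: bytes) -> str:
    """Hypothetical helper: inverse of dump_str."""
    return payload.decode('utf-8', errors='surrogatepass')


# The string from the failing test, containing surrogate escapes.
text = 'föö'.encode('latin-1').decode('utf-8', errors='surrogateescape')

payload = dump_str(text)
round_tripped = load_str(payload)

# The receiving side sees exactly the string the sender logged.
print(round_tripped == text)
```

Both sides would need to agree on the handler, and the payload is not strictly valid UTF-8, so this only fits if the encoding really is an internal detail between worker and supervisor.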
