Unnecessary deep copy causes memory flare on network comms #5107

@crusaderky

distributed git tip, Linux x64

import distributed
import numpy

c = distributed.Client(n_workers=4, threads_per_worker=1, memory_limit="2.8 GiB")
f = c.submit(numpy.random.random, 2 ** 27)  # 1 GiB
c.replicate(f, 4)
# Alternative to replicate, identical effect
# futures = [c.submit(lambda x: None, f, pure=False, workers=[w]) for w in c.has_what()]

Expected behaviour

Thanks to pickle protocol 5 out-of-band buffers, the peak RAM usage on each worker should be 1 GiB.
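For readers unfamiliar with the mechanism this expectation relies on, here is a minimal standard-library sketch of pickle protocol 5 out-of-band buffers. It is an illustration only and does not reproduce distributed's actual comms code; the 1 MiB `bytearray` is a stand-in for the 1 GiB NumPy array, and the variable names are made up for this example.

```python
import pickle

# 1 MiB stand-in for the 1 GiB array in the report (assumption for illustration).
data = bytearray(2**20)

# In-band: the payload is embedded in the pickle stream, so dumping copies it.
in_band = pickle.dumps(pickle.PickleBuffer(data), protocol=5)

# Out-of-band: the buffer is handed to buffer_callback instead of being
# written into the stream, which then only holds a small placeholder.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(data), protocol=5, buffer_callback=buffers.append
)

# Loading with the same buffers reuses them rather than allocating new memory.
restored = pickle.loads(out_of_band, buffers=buffers)

# A write to the original is visible through the restored object,
# which shows that no copy of the payload was made.
data[0] = 255
shares_memory = memoryview(restored)[0] == 255

print(len(in_band), len(out_of_band), shares_memory)
```

If every layer between the socket and the unpickler passes buffers along like this, a receiver only ever needs to hold one copy of the payload.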

Actual behaviour

On the dashboard, the RAM of every worker that receives the computed future over the network briefly flares up to 2 GiB and then settles down at 1 GiB.
On stderr I read:

distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker.html#memtrim for more information. -- Unmanaged memory: 2.07 GiB -- Worker memory limit: 2.80 GiB
distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker.html#memtrim for more information. -- Unmanaged memory: 2.07 GiB -- Worker memory limit: 2.80 GiB
distributed.worker - WARNING - Unmanaged memory use is high. This may indicate a memory leak or the memory may not be released to the OS; see https://distributed.dask.org/en/latest/worker.html#memtrim for more information. -- Unmanaged memory: 2.07 GiB -- Worker memory limit: 2.80 GiB

If I reduce the memory_limit to 2 GiB, the workers get killed off.

The sender worker is unaffected by the flaring.

I tested on Python 3.8 and 3.9 with the tcp://, ws:// and ucx:// protocols; all are equally affected.
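A 2x flare on the receiving side is what one would see if the incoming buffer and a full copy of it are alive at the same time. The sketch below shows that effect in isolation using `tracemalloc`. It is not a diagnosis of where the copy happens in distributed; `receive_zero_copy`, `receive_with_copy` and the 8 MiB size are hypothetical names and values chosen for the illustration.

```python
import tracemalloc

SIZE = 8 * 2**20  # 8 MiB stand-in for the 1 GiB payload


def receive_zero_copy():
    # Simulate a received frame and hand out a view on it: no second allocation.
    frame = bytearray(SIZE)
    return memoryview(frame)


def receive_with_copy():
    # Simulate a received frame, then deep-copy it: both buffers are alive
    # until the function returns and the original frame is released.
    frame = bytearray(SIZE)
    return bytes(frame)


def peak_bytes(fn):
    # Peak traced allocation, in bytes, while fn runs.
    tracemalloc.start()
    try:
        result = fn()  # keep the result alive until the peak is read
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


peak_zero_copy = peak_bytes(receive_zero_copy)
peak_with_copy = peak_bytes(receive_with_copy)

print(f"zero-copy peak: {peak_zero_copy / SIZE:.2f}x payload")
print(f"with-copy peak: {peak_with_copy / SIZE:.2f}x payload")
```

The copy variant peaks at roughly twice the payload size and then drops back to one copy once the original frame is freed, which matches the shape of the flare seen on the dashboard.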
