
test_distributed.py::test_blockwise_numpy_* fails on 32-bit #7489

@bnavigator

Description

[   40s] ============================= test session starts ==============================
[   40s] platform linux -- Python 3.8.8, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3.8
[   40s] cachedir: .pytest_cache
[   40s] rootdir: /home/abuild/rpmbuild/BUILD/dask-2021.3.1, configfile: setup.cfg
[   40s] plugins: forked-1.3.0, xdist-2.2.0, rerunfailures-9.1.1
...
[  359s] __________________________ test_blockwise_numpy_args ___________________________
...
[  359s] /usr/lib/python3.8/site-packages/tornado/ioloop.py:529: TimeoutError
[  359s] ----------------------------- Captured stderr call -----------------------------
[  359s] distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
[  359s] distributed.scheduler - INFO - Clear task state
[  359s] distributed.scheduler - INFO -   Scheduler at:     tcp://127.0.0.1:34773
[  359s] distributed.scheduler - INFO -   dashboard at:           127.0.0.1:45911
[  359s] distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:42977
[  359s] distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:42977
[  359s] distributed.worker - INFO -          dashboard at:            127.0.0.1:35641
[  359s] distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:34773
[  359s] distributed.worker - INFO - -------------------------------------------------
[  359s] distributed.worker - INFO -               Threads:                          1
[  359s] distributed.worker - INFO -                Memory:                   10.43 GB
[  359s] distributed.worker - INFO -       Local Directory: /home/abuild/rpmbuild/BUILD/dask-2021.3.1/dask-worker-space/worker-bgebm6qb
[  359s] distributed.worker - INFO - -------------------------------------------------
[  359s] distributed.worker - INFO -       Start worker at:      tcp://127.0.0.1:39965
[  359s] distributed.worker - INFO -          Listening to:      tcp://127.0.0.1:39965
[  359s] distributed.worker - INFO -          dashboard at:            127.0.0.1:36645
[  359s] distributed.worker - INFO - Waiting to connect to:      tcp://127.0.0.1:34773
[  359s] distributed.worker - INFO - -------------------------------------------------
[  359s] distributed.worker - INFO -               Threads:                          2
[  359s] distributed.worker - INFO -                Memory:                   10.43 GB
[  359s] distributed.worker - INFO -       Local Directory: /home/abuild/rpmbuild/BUILD/dask-2021.3.1/dask-worker-space/worker-64voog8j
[  359s] distributed.worker - INFO - -------------------------------------------------
[  359s] distributed.utils - ERROR - Python int too large to convert to C ssize_t
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/utils.py", line 668, in log_errors
[  359s]     yield
[  359s]   File "distributed/scheduler.py", line 3731, in distributed.scheduler.Scheduler.add_worker
[  359s]   File "distributed/scheduler.py", line 412, in distributed.scheduler.WorkerState.__init__
[  359s] OverflowError: Python int too large to convert to C ssize_t
[  359s] distributed.utils - ERROR - Python int too large to convert to C ssize_t
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/utils.py", line 668, in log_errors
[  359s]     yield
[  359s]   File "distributed/scheduler.py", line 3731, in distributed.scheduler.Scheduler.add_worker
[  359s]   File "distributed/scheduler.py", line 412, in distributed.scheduler.WorkerState.__init__
[  359s] OverflowError: Python int too large to convert to C ssize_t
[  359s] distributed.core - ERROR - Exception while handling op register-worker
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/core.py", line 501, in handle_comm
[  359s]     result = await result
[  359s]   File "distributed/scheduler.py", line 3709, in add_worker
[  359s]   File "distributed/scheduler.py", line 3731, in distributed.scheduler.Scheduler.add_worker
[  359s]   File "distributed/scheduler.py", line 412, in distributed.scheduler.WorkerState.__init__
[  359s] OverflowError: Python int too large to convert to C ssize_t
[  359s] distributed.core - ERROR - Exception while handling op register-worker
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/core.py", line 501, in handle_comm
[  359s]     result = await result
[  359s]   File "distributed/scheduler.py", line 3709, in add_worker
[  359s]   File "distributed/scheduler.py", line 3731, in distributed.scheduler.Scheduler.add_worker
[  359s]   File "distributed/scheduler.py", line 412, in distributed.scheduler.WorkerState.__init__
[  359s] OverflowError: Python int too large to convert to C ssize_t
[  359s] distributed.comm.tcp - WARNING - Closing dangling stream in <TCP  local=tcp://127.0.0.1:32864 remote=tcp://127.0.0.1:34773>
[  359s] distributed.comm.tcp - WARNING - Closing dangling stream in <TCP  local=tcp://127.0.0.1:32866 remote=tcp://127.0.0.1:34773>
[  359s] distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:42977
[  359s] distributed.worker - INFO - Closed worker has not yet started: Status.undefined
[  359s] distributed.worker - INFO - Stopping worker at tcp://127.0.0.1:39965
[  359s] distributed.worker - INFO - Closed worker has not yet started: Status.undefined
[  359s] distributed.utils_test - ERROR - Failed to start gen_cluster: TimeoutError: Worker failed to start in 10 seconds; retrying
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/core.py", line 275, in _
[  359s]     await asyncio.wait_for(self.start(), timeout=timeout)
[  359s]   File "/usr/lib/python3.8/asyncio/tasks.py", line 501, in wait_for
[  359s]     raise exceptions.TimeoutError()
[  359s] asyncio.exceptions.TimeoutError
[  359s] 
[  359s] During handling of the above exception, another exception occurred:
[  359s] 
[  359s] Traceback (most recent call last):
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/utils_test.py", line 878, in coro
[  359s]     s, ws = await start_cluster(
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/utils_test.py", line 802, in start_cluster
[  359s]     await asyncio.gather(*workers)
[  359s]   File "/usr/lib/python3.8/asyncio/tasks.py", line 695, in _wrap_awaitable
[  359s]     return (yield from awaitable.__await__())
[  359s]   File "/usr/lib/python3.8/site-packages/distributed/core.py", line 279, in _
[  359s]     raise TimeoutError(
[  359s] asyncio.exceptions.TimeoutError: Worker failed to start in 10 seconds
[  359s] distributed.scheduler - INFO - Clear task state
...
[  360s] FAILED tests/test_distributed.py::test_blockwise_numpy_args - tornado.util.Ti...
[  360s] FAILED tests/test_distributed.py::test_blockwise_numpy_kwargs - tornado.util....
[  360s] = 2 failed, 7935 passed, 872 skipped, 135 xfailed, 5 xpassed, 4172 warnings in 331.55s (0:05:31) =

Environment:

  • Dask version: 2021.3.1
  • Python version: 3.8.8
  • NumPy version: 1.20.1
  • Operating System: openSUSE Tumbleweed
  • Install method (conda, pip, source): rpmbuild from source

Full build log with all package versions on Intel 32-bit: dask_tumbleweed_32bit_log.txt

Same test failures on armv7l.
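
For context, the pytest timeout looks like a secondary symptom: the workers never register because WorkerState.__init__ (distributed/scheduler.py:412, apparently a compiled module given the bare frames in the traceback) raises "OverflowError: Python int too large to convert to C ssize_t". Each worker advertises a memory limit of roughly 10.43 GB, which exceeds the maximum value of a signed 32-bit ssize_t (2**31 - 1, about 2.1 GB). A minimal sketch of the arithmetic, for illustration only (the ctypes probe and the 10_430_000_000 figure are mine, not taken from the scheduler code):

    import ctypes

    # Width of ssize_t for this interpreter: 4 bytes on a 32-bit build, 8 on 64-bit.
    ssize_t_bits = 8 * ctypes.sizeof(ctypes.c_ssize_t)
    ssize_t_max = 2 ** (ssize_t_bits - 1) - 1

    memory_limit = 10_430_000_000  # ~10.43 GB, as logged by both workers above

    print(f"ssize_t is {ssize_t_bits} bits, max value {ssize_t_max:,}")
    print(f"worker memory limit is     {memory_limit:,}")
    print("fits in ssize_t:", memory_limit <= ssize_t_max)
    # On a 32-bit build the last line prints False: converting such a value to a
    # C ssize_t is exactly what raises the OverflowError seen in the log above.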
