Skip to content

Conversation

fjetter
Copy link
Member

@fjetter fjetter commented Jun 30, 2021

This supercedes #4972 and closes #4613

Notable changes to #4972

  • Source code attached to the computation object is now inferred by walking back the stack. All calls originating from dask and distributed modules are ignored. First frame of a non-ignored module is interpreted as user code and we'll extract that source. The ignore list is currently only dask and distributed but we can add other common libs like xarray, prefect as suggested in #XYZ. I tested this in jupyter notebooks and unit tests and it worked as I think it should. I ended up needing to write special treatment for list/dict comprehension and wouldn't be surprised to see more special cases popping up. we'll see
  • The max history and ignore modules can now be configured
  • The module ignore thing uses regular expressions. This may be overkill and I'm not entirely sure if this poses a security issue. Implementation actually felt even simpler compared to a "split / str compare". I figured that people might want to use glob patterns or something more sophisticated than hard coded paths. regex was the "simplest" choice. If that faces resistance, I'll go for a simple str compare (startswith, is in, etc.)
  • I added a poor mans HTML repr (don't be too critical, I just copied stuff around) simply to get started. I hope that somebody with a better sense for style will iterate on this
  • I chose to go for a dedicated UUID nevertheless (see Add initial draft of large scale computation class #4972 (comment)) mostly because I believe we need a UUID, especially when connecting to an external service and I couldn't find the UUID @mrocklin was referring to.

image

@fjetter fjetter force-pushed the computation_class branch from f53b309 to dc19a85 Compare June 30, 2021 15:11
@fjetter
Copy link
Member Author

fjetter commented Jun 30, 2021

Test failures are a lot #4859 and distributed/diagnostics/tests/test_progress.py::test_AllProgress of which I'm pretty sure that it is a known flaky test. No idea why this clusters so strongly around this change.

There is also distributed/tests/test_client_executor.py::test_cancellation which is already marked with a pytest.rerun marker but it still fails.

:(

@mrocklin
Copy link
Member

Checking in here @fjetter. I know that you've been busy recently with stability issues. This seems like it might be close though?

@fjetter
Copy link
Member Author

fjetter commented Jul 20, 2021

This seems like it might be close though?

I was hoping this is done. I'll rebase and see what CI thinks about it

@fjetter fjetter force-pushed the computation_class branch from 2616001 to af2545f Compare July 20, 2021 15:05
@fjetter
Copy link
Member Author

fjetter commented Jul 20, 2021

Locally tests look good. If there are no complaints and CI turns out to be green I'll go ahead and merge.

@mrocklin
Copy link
Member

Thanks @fjetter

@fjetter
Copy link
Member Author

fjetter commented Jul 20, 2021

ok, py3.7 crashes hard. some cythonization issue but shouldn't be hard to track down

@fjetter
Copy link
Member Author

fjetter commented Jul 20, 2021

The cythonization problem turns out to be a bit awkward. The Computation._recent as a global class attribute doesn't work. afaik, there is no such thing as a class attribute in cython. Instead, I will probably simply track this in the scheduler itself, after all we will probably not want to track recent computations across multiple schedulers in the same python process.
I might be wrong, distributed users continue to surprise me with creative ways to break stuff :)

@fjetter fjetter force-pushed the computation_class branch from 5a4d418 to 741a648 Compare July 21, 2021 12:14
@fjetter
Copy link
Member Author

fjetter commented Jul 22, 2021

Failures appear to be unrelated. I'll go ahead and merge this now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Computation model to Scheduler

2 participants