Description
What's the problem this feature will solve?
As discussed in #10258 (which is about presenting conflicts as they appear), I am opening a new issue to split off the discussion, as proposed by @pradyunsg.
This feature request proposes giving the user control over the backtracking process via a timeout, especially when the resolver goes into a long backtracking loop. Several issues describe such backtracking loops (for example #10924), and others discuss potential changes to how the algorithm works (for example #10884, #10788).
There are also similar discussions and proposals (for example #10417, #10235), but none of them (as far as I can see) proposes a timeout as a solution.
This proposal addresses users who rely on pip to find and resolve package installations in CI environments. Many users use pip either to install packages in CI or to build container images (Apache Airflow is one example, but there are plenty of others), and they would like to determine a non-conflicting set of dependencies using the pip resolver. This is useful if you build libraries or applications with an "open" set of requirements and you want to make sure that your users can keep installing your application or library, with their pip install command completing in reasonable time.
In this case, what the users of pip need is to use the pip resolver for the job it is designed for - finding the right set of dependencies that satisfy the constraints of the packages to install - but they also need to be able to diagnose and fix the cases when pip enters a backtracking loop and takes much more time than usual.
In such cases the users should be able to stop backtracking automatically (CI systems are designed to run unattended) and get information that lets them inspect the root cause of the conflicts that led to backtracking.
Currently, even if such backtracking is stopped by a timeout, the pip logs do not contain enough information to figure out the root of the conflict that led to backtracking (details in #10258 (comment)).
Describe the solution you'd like
What I would like to see is:
- Add a --resolver-timeout flag (for example) to pip install that fails the resolver if the time limit is reached.
- Add diagnostic information printed by pip when such a timeout happens, pointing to the cause (or the probable cause, if it cannot be assessed exactly) of the too-long backtracking.
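To make the two requested behaviours concrete, here is a toy sketch in Python of a backtracking search that stops at a deadline and reports which package pairs conflicted most often. It is not pip's resolver and does not use any pip internals; the names resolve, ResolutionTimeout, compatible and conflict_counts are all hypothetical and exist only for this illustration.

```python
import time
from collections import Counter


class ResolutionTimeout(Exception):
    """Raised by the toy resolver when the time limit is reached (hypothetical)."""

    def __init__(self, conflict_counts):
        super().__init__("resolution timed out")
        # Counter of (package, blocking_package) -> number of rejected candidates.
        self.conflict_counts = conflict_counts


def resolve(candidates, compatible, timeout_seconds, clock=time.monotonic):
    """Toy backtracking search over {package: [versions, newest first]}.

    compatible(pinned, package, version) returns the name of an already
    pinned package that rejects this candidate, or None if it is acceptable.
    """
    deadline = clock() + timeout_seconds
    packages = list(candidates)
    conflicts = Counter()

    def backtrack(index, pinned):
        # Check the deadline every time we descend into the search.
        if clock() > deadline:
            raise ResolutionTimeout(conflicts)
        if index == len(packages):
            return dict(pinned)
        package = packages[index]
        for version in candidates[package]:
            blocker = compatible(pinned, package, version)
            if blocker is not None:
                # Remember who blocked whom; this is the "diagnostics" part.
                conflicts[(package, blocker)] += 1
                continue
            pinned[package] = version
            result = backtrack(index + 1, pinned)
            if result is not None:
                return result
            del pinned[package]  # undo the pin and try the next version
        return None

    return backtrack(0, {})
```

On timeout, a caller could print conflict_counts.most_common() as a hint at the probable cause. A real resolver would need far richer bookkeeping, so treat this only as an illustration of the shape of the idea.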
Alternative Solutions
- Limiting the number of backtracking steps, similar to "Max Backtracking Option and print out current failure casues" (#10417).
The problem with this one is that users of pip have no idea about the backtracking algorithm or backtracking depth. See "Additional context" below for an explanation.
- Solving it with an external timeout, without internal information from pip.
The problem cannot be easily solved (IMHO) without pip providing the right diagnostic information. In Apache Airflow we've implemented some ways to get more information when such backtracking happens (and we added our own timeout), but the investigation after capturing it is based mostly on guessing. We look at what has changed since the last successful run (which we store in constraints) and provide some clues for manually figuring out the root cause. This is, however, trial and error. Information provided by pip itself could be much more helpful and could lead to a much less time (and energy) consuming solution.
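For reference, an external timeout of the kind described above can be sketched in a few lines of Python using the standard library's subprocess.run and its timeout argument. This is a minimal sketch, not the actual Apache Airflow implementation; the helper name run_with_timeout and its return values are assumptions made for this example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return (status, output).

    status is "ok", "failed" or "timeout" (names chosen for this sketch).
    """
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired as exc:
        # Output captured before the kill; it may be bytes or None.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "timeout", partial
    status = "ok" if completed.returncode == 0 else "failed"
    return status, completed.stdout


if __name__ == "__main__":
    # Example: give an install command (passed as arguments) ten minutes.
    status, output = run_with_timeout(
        [sys.executable, "-m", "pip", "install", *sys.argv[1:]], 600
    )
    print(status)
    print(output)
```

The wrapper can only return whatever pip happened to print before it was killed, which is exactly the limitation described above: it bounds the time but cannot explain the conflict.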
Additional context
I perfectly understand that pip cannot decide upfront whether to print the context, and yes - it is an extremely complex problem to solve, so I also sympathise with the pip maintainers here.
But if the algorithm cannot decide, let the user decide when to stop, how much backtracking to do, and when the backtracking has become "too long" or "too much". Simply letting it run for an indefinite amount of time when things go wrong is the "worst" UI choice from the user's perspective, because the user cannot do anything about it when it happens.
Most users do not understand the complexity of the resolution algorithm, the reasons for backtracking, or some of the details that pip maintainers know intimately. From their perspective, users only know that suddenly a) backtracking happens (though they might not understand why, or even what backtracking is), b) pip starts to download an awful lot of data, and c) the resolver takes a whole lot longer to run and might never complete.
Of those three observations, "time" is the easiest one to reason about from the user's perspective. Users observe that suddenly the install command they run takes a lot longer than normal (and they know from the past what "normal" is). So their reaction might be to add the timeout parameter and rerun the failed job for investigation. This is unlike backtracking depth, where users would first have to understand all the details of the algorithm before knowing what limit to set in order to investigate.
While I understand that it's extremely "hard" to decide when "long" is "too long" in the generic case (this is the problem pip maintainers have, and I understand it is difficult), delegating the decision to the user (if they choose to take it) is the next best thing that can be done. Most users do not have a "generic" problem to solve. They just want to make sure their particular command works. If they use it in a CI context especially, this is a much "narrower" problem space, because they repeatedly run pretty much the same command and expect similar output. They can make much better decisions for their particular case than a "generic" solution can, and they can decide on their own when "long" is "too long". And then they should get the right diagnostic information if they hit the limit.
This is what I propose: give users (those who need it - it should be optional) the ability to decide for themselves when "long" is "too long" for their particular case, and print diagnostic information if this limit is hit.
I think not all decisions have to be made by the algorithm and pip maintainers - some of those can simply be delegated to users who need it and know what they are doing and what the "usual" time for resolution is.
Code of Conduct
- I agree to follow the PSF Code of Conduct.