path_to_url called millions of times for ~1000 offline wheel installs #12320

@notatallshaw

Description

The fastest way to install Python packages, and the one advised as the best way, is to use wheels.

However, when you actually try to install ~1000 wheels from a single set of requirements, there are significant performance bottlenecks in pip.

Expected behavior

Installing pre-resolved wheels offline should be very fast.

pip version

23.3.dev0

Python version

Python 3.11

OS

Linux

How to Reproduce

  1. python3.11 -m venv .venv
  2. source .venv/bin/activate
  3. <install latest/dev pip>
  4. git clone https://github.com/home-assistant/core
  5. cd core/
  6. python -m pip download -d {download_directory} --prefer-binary -r requirements_all.txt
  7. cd {download_directory}
  8. for file in *.tar.gz; do pip wheel --no-deps "$file" && mv "$file" "$file".built ; done
  9. for file in *.zip; do pip wheel --no-deps "$file" && mv "$file" "$file".built ; done
  10. cd -
  11. time python -m pip install --dry-run --only-binary=:all: --no-index --ignore-installed --find-links file://{download_directory} -r requirements_all.txt

Output

On my machine:

real    2m33.486s
user    2m31.886s
sys     0m1.568s

Profiling this run, I see that pip._internal.utils.url.path_to_url is called over 3,000,000 times and accounts for ~25% of the total run time.
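
For anyone who wants to collect this kind of measurement themselves, here is a minimal sketch of how a call count and time share can be read out of cProfile. It does not run pip: `path_to_url_stand_in`, `simulated_workload`, and `profile_call_count` are hypothetical names made up for this illustration, and the stand-in only approximates what a path-to-file-URL helper does.

```python
import cProfile
import os
import pstats
import urllib.parse
import urllib.request


def path_to_url_stand_in(path: str) -> str:
    """Hypothetical stand-in (not pip's code): turn a path into a file: URL."""
    path = os.path.normpath(os.path.abspath(path))
    return urllib.parse.urljoin("file:", urllib.request.pathname2url(path))


def simulated_workload(n_calls: int) -> None:
    """Made-up workload that converts the same few paths over and over."""
    for i in range(n_calls):
        path_to_url_stand_in(f"wheels/pkg_{i % 10}.whl")


def profile_call_count(n_calls: int, func_name: str) -> tuple[int, float]:
    """Profile the workload; return (call count, share of total time)."""
    profiler = cProfile.Profile()
    profiler.enable()
    simulated_workload(n_calls)
    profiler.disable()

    # get_stats_profile() (Python 3.9+) exposes per-function numbers by name.
    profile = pstats.Stats(profiler).get_stats_profile()
    entry = profile.func_profiles[func_name]
    # cumtime includes time spent in callees of the profiled function.
    share = entry.cumtime / profile.total_tt if profile.total_tt else 0.0
    # ncalls is a string; it is a plain number for non-recursive functions.
    return int(entry.ncalls), share


if __name__ == "__main__":
    calls, share = profile_call_count(100_000, "path_to_url_stand_in")
    print(f"calls={calls} share_of_runtime={share:.0%}")
```

To profile the real command instead, one option is to run it under `python -m cProfile -o pip.prof -m pip install ...` with the arguments from step 11, then load `pip.prof` with `pstats.Stats` and sort by call count. The output file name `pip.prof` is an arbitrary choice.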
