-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Description
Bug description
In astroid, there is a _get_zipimporters function that scans sys.path_importer_cache for zipimporter finders:
def _get_zipimporters() -> Iterator[tuple[str, zipimport.zipimporter]]:
for filepath, importer in sys.path_importer_cache.items():
if isinstance(importer, zipimport.zipimporter):
yield filepath, importerThe isinstance call gets executed ~35 million times when running pylint on the yt-dlp repo. ~32 million of these invocations can be avoided by checking for None first:
if importer is not None and isinstance(importer, zipimport.zipimporter)
A further optimization would be to reduce the number of times path_importer_cache is scanned, but that seems trickier to accomplish.
Configuration
[MAIN]
jobs=1
[MESSAGES CONTROL]
disable=all
enable=E0401
[REPORTS]
reports=no
score=noCommand used
Steps to reproduce
git clone https://github.com/yt-dlp/yt-dlp.git
cd yt-dlp
git checkout 5904853ae5788509fdc4892cb7ecdfa9ae7f78e6
cat << EOF > ./profile_pylint.py
import cProfile
import pstats
import sys
sys.argv = ['pylint', '--recursive=y', '.']
cProfile.run('from pylint import __main__', filename='stats')
with open('profiler_stats', 'w', encoding='utf-8') as file:
stats = pstats.Stats('stats', stream=file)
stats.sort_stats('tottime')
stats.print_stats()
EOF
cat << EOF > .pylintrc
[MAIN]
jobs=1
[MESSAGES CONTROL]
disable=all
enable=E0401
[REPORTS]
reports=no
score=no
EOF
python ./profile_pylint.pyAnalysis
_get_zipimporters calls isinstance ~35 million times
import pstats
stats = pstats.Stats('stats')
stats.print_callers('isinstance')
ncalls tottime cumtime
35188734 4.255 4.255 astroid/interpreter/_import/spec.py:342(_get_zipimporters)Checking for None first seems twice as fast:
$ python -m timeit -s 'import zipimport; x=None' -n 1000000 'isinstance(x, zipimport.zipimporter)'
1000000 loops, best of 5: 48.9 nsec per loop
$ python -m timeit -s 'import zipimport; x=None' -n 1000000 'x is not None and isinstance(x, zipimport.zipimporter)'
1000000 loops, best of 5: 24.3 nsec per loopPylint output
There may be some import errors depending on your (virtual) environment, but the output is less important than the performance numbers.
Expected behavior
Improved performance via reduced isinstance calls
Pylint version
astroid @ git+https://github.com/pylint-dev/astroid.git@a4a9fcc44ae0d71773dc3bff6baa78fc571ecb7d
pylint @ git+https://github.com/pylint-dev/pylint.git@500774ae5a4e49e2aa0c8d3f2b64613e21aa676e
Python 3.12.3OS / Environment
Arch Linux
Additional dependencies
No response