Teach stubsabot to be smarter about the required locations of py.typed files #11053

Merged
merged 7 commits into from
Nov 22, 2023

Conversation

AlexWaygood
Member

@AlexWaygood AlexWaygood commented Nov 21, 2023

An attempt to fix the bug that resulted in #11049.

The approach is this:

  • If there are no directories at the top level of a wheel or sdist (except for the .dist-info directory), none of the packages for this distribution can be py.typed
  • Else, return True if all top-level directories (except for .dist-info) are marked as py.typed:
    • for each directory, if it contains at least one file, consider it marked as py.typed if it has a py.typed file at the top-level
    • Else, if it only contains subdirectories, consider it marked as py.typed if all subdirectories are marked as py.typed (applying the same logic recursively).
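
The recursive rule above can be sketched roughly as follows. This is only an illustration of the description, not the code from the PR: the helper names (`build_tree`, `dir_is_marked`, `all_packages_marked`) are hypothetical, and the input is assumed to be a flat list of file paths taken from the archive (no bare directory entries).

```python
from pathlib import PurePosixPath


def build_tree(paths):
    """Turn file paths into a nested dict: directories map to dicts, files to None."""
    tree = {}
    for path in paths:
        node = tree
        parts = PurePosixPath(path).parts
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return tree


def dir_is_marked(subtree):
    """Apply the per-directory rule from the list above."""
    files = [name for name, child in subtree.items() if child is None]
    subdirs = [child for child in subtree.values() if child is not None]
    if files:
        # The directory holds at least one file: it needs its own py.typed.
        return "py.typed" in files
    # Only subdirectories: every one of them must be marked, recursively.
    return bool(subdirs) and all(dir_is_marked(sub) for sub in subdirs)


def all_packages_marked(paths):
    tree = build_tree(paths)
    top_level_dirs = [
        child
        for name, child in tree.items()
        if child is not None and not name.endswith(".dist-info")
    ]
    if not top_level_dirs:
        # No top-level directories besides .dist-info: nothing can be py.typed.
        return False
    return all(dir_is_marked(sub) for sub in top_level_dirs)
```

With this sketch, a namespace-style layout such as `ns/a/py.typed` plus an unmarked `ns/b/__init__.py` is reported as not fully typed, because `ns` has only subdirectories and one of them lacks a marker.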

Edit: all the above is out of date. The PR now uses the better approach suggested by Akuli in #11053 (comment).

I've tested this a fair bit locally, and I think it does the right thing, but I would appreciate careful eyes on this and additional testing.

@Akuli
Collaborator

Akuli commented Nov 21, 2023

Would this approach work?

  • We say that a directory is marked, if it contains a py.typed file, or one of its parent directories contains a py.typed.
  • The package is typed, if every .py file is in a marked directory.
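
A minimal sketch of these two rules, assuming the archive contents are available as a flat list of file paths. The function name `is_typed` is hypothetical, and the extra requirement that at least one py.typed file exists is an assumption added here so that an archive with no markers is never reported as typed.

```python
from pathlib import PurePosixPath


def is_typed(paths):
    """Return True if every .py file sits in a marked directory."""
    parsed = [PurePosixPath(p) for p in paths]
    # Directories that directly contain a py.typed file.
    marked_dirs = {p.parent for p in parsed if p.name == "py.typed"}
    py_files = [p for p in parsed if p.suffix == ".py"]

    def in_marked_dir(path):
        # The file's own directory or any ancestor directory may carry the marker.
        return any(d in marked_dirs for d in [path.parent, *path.parent.parents])

    # Assumption: with no py.typed file at all, the package is not typed.
    return bool(marked_dirs) and all(in_marked_dir(p) for p in py_files)
```

For example, `is_typed(["pkg/__init__.py", "pkg/py.typed", "pkg/sub/mod.py"])` is true because `pkg/sub` inherits the marker from `pkg`, while adding an unmarked `other/__init__.py` makes the result false.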

@AlexWaygood
Member Author

Would this approach work?

  • We say that a directory is marked, if it contains a py.typed file, or one of its parent directories contains a py.typed.
  • The package is typed, if every .py file is in a marked directory.

Yes, that does sound like a promising alternative. I'll give it a try and see how it compares.

@AlexWaygood
Member Author

That made it a lot simpler, thanks!

@AlexWaygood AlexWaygood requested a review from Akuli November 21, 2023 22:18
@hauntsaninja
Collaborator

Have you backtested this on a set of known py.typed packages? It's possible things like shipping an untyped tests/ are not uncommon.

(to be clear, I'm fine with some false negatives, but curious if we know what they would look like in practice)

@Akuli
Collaborator

Akuli commented Nov 21, 2023

I'm probably not going to review this. I'm not super familiar with the internal layout of various Python packages or stubsabot.

@AlexWaygood
Member Author

Have you backtested this on a set of known py.typed packages? It's possible things like shipping an untyped tests/ are not uncommon.

(to be clear, I'm fine with some false negatives, but curious if we know what they would look like in practice)

I haven't done exhaustive testing (mostly because I became exhausted). There are definitely distributions in typeshed that have some "interesting" things shipped in their sdists: take a look at docopt:

>>> with tarfile.open(r"C:\Users\alexw\Downloads\docopt-0.6.2.tar.gz", mode="r:gz") as tf:
...     tf.list(verbose=False)
...
docopt-0.6.2/
docopt-0.6.2/docopt.egg-info/
docopt-0.6.2/docopt.egg-info/dependency_links.txt
docopt-0.6.2/docopt.egg-info/PKG-INFO
docopt-0.6.2/docopt.egg-info/SOURCES.txt
docopt-0.6.2/docopt.egg-info/top_level.txt
docopt-0.6.2/docopt.py
docopt-0.6.2/examples/
docopt-0.6.2/examples/arguments_example.py
docopt-0.6.2/examples/calculator_example.py
docopt-0.6.2/examples/counted_example.py
docopt-0.6.2/examples/git/
docopt-0.6.2/examples/git/git.py
docopt-0.6.2/examples/git/git_add.py
docopt-0.6.2/examples/git/git_branch.py
docopt-0.6.2/examples/git/git_checkout.py
docopt-0.6.2/examples/git/git_clone.py
docopt-0.6.2/examples/git/git_commit.py
docopt-0.6.2/examples/git/git_push.py
docopt-0.6.2/examples/git/git_remote.py
docopt-0.6.2/examples/naval_fate.py
docopt-0.6.2/examples/odd_even_example.py
docopt-0.6.2/examples/options_example.py
docopt-0.6.2/examples/options_shortcut_example.py
docopt-0.6.2/examples/quick_example.py
docopt-0.6.2/examples/validation_example.py
docopt-0.6.2/LICENSE-MIT
docopt-0.6.2/MANIFEST.in
docopt-0.6.2/PKG-INFO
docopt-0.6.2/README.rst
docopt-0.6.2/setup.cfg
docopt-0.6.2/setup.py

Presumably, if docopt ever became py.typed (unlikely, considering the last release was in 2014), it wouldn't add a py.typed file to the examples/ directory that's shipped as part of the sdist.
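
To make that concern concrete, here is a rough sketch that builds a hypothetical sdist in memory and lists the .py files that are not covered by any py.typed marker. The layout (`docopt-X.Y/docopt/py.typed` and so on) is invented for illustration and is not what docopt ships; the helper names `make_sdist` and `unmarked_py_files` are also hypothetical.

```python
import io
import tarfile
from pathlib import PurePosixPath


def make_sdist(file_names):
    """Build an in-memory .tar.gz containing empty files with the given names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name in file_names:
            info = tarfile.TarInfo(name)
            info.size = 0  # empty regular file, so no file object is needed
            tf.addfile(info)
    buf.seek(0)
    return buf


def unmarked_py_files(sdist_fileobj):
    """Return the .py files that have no py.typed in their directory or any parent."""
    with tarfile.open(fileobj=sdist_fileobj, mode="r:gz") as tf:
        paths = [PurePosixPath(m.name) for m in tf.getmembers() if m.isfile()]
    marked = {p.parent for p in paths if p.name == "py.typed"}
    return sorted(
        str(p)
        for p in paths
        if p.suffix == ".py"
        and not any(d in marked for d in [p.parent, *p.parent.parents])
    )


# Hypothetical layout: a typed package shipped alongside untyped examples.
hypothetical_sdist = make_sdist(
    [
        "docopt-X.Y/docopt/__init__.py",
        "docopt-X.Y/docopt/py.typed",
        "docopt-X.Y/examples/naval_fate.py",
        "docopt-X.Y/setup.py",
    ]
)
leftovers = unmarked_py_files(hypothetical_sdist)
```

Here `leftovers` contains the example script and `setup.py`, so a strict "every .py file must be in a marked directory" check on the sdist would treat this distribution as not typed, which is the kind of false negative being discussed.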

@AlexWaygood AlexWaygood removed the request for review from Akuli November 21, 2023 22:42
@AlexWaygood
Member Author

I'm probably not going to review this. I'm not super familiar with the internal layout of various Python packages or stubsabot.

very reasonable!

@AlexWaygood
Member Author

I'll try to do more manual testing of this tomorrow.

@AlexWaygood
Member Author

I made this diff locally, committed it, and then ran python scripts/stubsabot.py --action-level nothing to test this PR:

Diff
diff --git a/stubs/flake8-plugin-utils/METADATA.toml b/stubs/flake8-plugin-utils/METADATA.toml
index 4874bcd4d..36cbf8860 100644
--- a/stubs/flake8-plugin-utils/METADATA.toml
+++ b/stubs/flake8-plugin-utils/METADATA.toml
@@ -1,7 +1,6 @@
 version = "1.3.*"
 upstream_repository = "https://github.com/afonasev/flake8-plugin-utils"
 partial_stub = true
-obsolete_since = "1.3.3" # Released on 2023-06-26

 [tool.stubtest]
 ignore_missing_stub = true
diff --git a/stubs/pluggy/METADATA.toml b/stubs/pluggy/METADATA.toml
index 69173ea16..29b734085 100644
--- a/stubs/pluggy/METADATA.toml
+++ b/stubs/pluggy/METADATA.toml
@@ -1,3 +1,2 @@
 version = "1.2.0"
 upstream_repository = "https://github.com/pytest-dev/pluggy"
-obsolete_since = "1.3.0" # Released on 2023-08-26
diff --git a/stubs/stdlib-list/METADATA.toml b/stubs/stdlib-list/METADATA.toml
index 4d6dcf39d..f9f553a2c 100644
--- a/stubs/stdlib-list/METADATA.toml
+++ b/stubs/stdlib-list/METADATA.toml
@@ -1,3 +1,2 @@
 version = "0.8.*"
 upstream_repository = "https://github.com/pypi/stdlib-list"
-obsolete_since = "0.9.0" # Released on 2023-06-22
diff --git a/stubs/stripe/METADATA.toml b/stubs/stripe/METADATA.toml
index 59f103da7..b0a9497ad 100644
--- a/stubs/stripe/METADATA.toml
+++ b/stubs/stripe/METADATA.toml
@@ -1,7 +1,6 @@
 version = "3.5.*"
 upstream_repository = "https://github.com/stripe/stripe-python"
 partial_stub = true
-obsolete_since = "7.1.0" # Released on 2023-10-27

 [tool.stubtest]
 ignore_missing_stub = true
diff --git a/stubs/tree-sitter/METADATA.toml b/stubs/tree-sitter/METADATA.toml
index 30d271954..350844ff0 100644
--- a/stubs/tree-sitter/METADATA.toml
+++ b/stubs/tree-sitter/METADATA.toml
@@ -1,3 +1,2 @@
 version = "0.20.1"
 upstream_repository = "https://github.com/tree-sitter/py-tree-sitter"
-obsolete_since = "0.20.3" # Released on 2023-11-13
diff --git a/stubs/tzlocal/METADATA.toml b/stubs/tzlocal/METADATA.toml
index f3e200e94..a5ef047ab 100644
--- a/stubs/tzlocal/METADATA.toml
+++ b/stubs/tzlocal/METADATA.toml
@@ -1,4 +1,3 @@
 version = "5.1"
 upstream_repository = "https://github.com/regebro/tzlocal"
 requires = ["types-pytz"]
-obsolete_since = "5.2" # Released on 2023-10-22

This was the output (all looks good as far as I can see):

Skipping aiofiles: up to date
Skipping cachetools: up to date
Skipping chevron: up to date
Skipping click-default-group: up to date
Skipping croniter: up to date
Skipping colorama: up to date
Skipping aws-xray-sdk: up to date
Skipping commonmark: up to date
Skipping decorator: up to date
Skipping click-spinner: up to date
Skipping bleach: up to date
Skipping docopt: up to date
Skipping dockerfile-parse: up to date
Skipping Deprecated: up to date
Skipping beautifulsoup4: up to date
Skipping braintree: up to date
Skipping console-menu: up to date
Skipping first: up to date
Skipping flake8-bugbear: up to date
Skipping flake8-builtins: up to date
Skipping flake8-2020: up to date
Skipping flake8-docstrings: up to date
Skipping flake8-rst-docstrings: up to date
Skipping flake8-simplify: up to date
Skipping ExifRead: up to date
Skipping Flask-Cors: up to date
Skipping flake8-typing-imports: up to date
Skipping entrypoints: up to date
Skipping Flask-SocketIO: up to date
Skipping Flask-Migrate: up to date
Skipping editdistance: up to date
Skipping html5lib: up to date
Skipping httplib2: up to date
Skipping boto: up to date
Skipping inifile: up to date
Skipping fpdf2: up to date
Skipping docutils: up to date
Skipping google-cloud-ndb: up to date
Skipping jmespath: up to date
Updating boltons from '23.0.*' to '23.1.*'
Skipping keyboard: up to date
Skipping greenlet: up to date
Skipping humanfriendly: up to date
Skipping jsonschema: up to date
Skipping JACK-Client: up to date
Skipping Markdown: up to date
Skipping mock: up to date
Skipping ldap3: up to date
Skipping mypy-extensions: up to date
Skipping influxdb-client: up to date
Marking flake8-plugin-utils as obsolete since '1.3.3'
Skipping oauthlib: up to date
Skipping opentracing: up to date
Skipping caldav: up to date
Skipping mysqlclient: up to date
Skipping paramiko: up to date
Skipping openpyxl: up to date
Skipping paho-mqtt: up to date
Skipping passpy: up to date
Skipping pep8-naming: up to date
Skipping libsass: up to date
Skipping pexpect: up to date
Skipping pika: up to date
Skipping playsound: up to date
Skipping polib: up to date
Skipping portpicker: up to date
Skipping pyasn1: up to date
Updating dateparser from '1.1.*' to '1.2.*'
Skipping pyaudio: up to date
Skipping parsimonious: up to date
Skipping PyAutoGUI: up to date
Skipping pycocotools: up to date
Skipping passlib: up to date
Skipping pyfarmhash: up to date
Skipping cffi: up to date
Skipping pycurl: up to date
Skipping pyflakes: up to date
Skipping PyMySQL: up to date
Updating protobuf from '4.24.*' to '4.25.*'
Skipping Pillow: up to date
Skipping peewee: up to date
Skipping pyjks: up to date
Skipping pynput: up to date
Skipping psycopg2: up to date
Skipping pyOpenSSL: up to date
Skipping psutil: up to date
Skipping pyserial: up to date
Skipping pysftp: up to date
Skipping PyScreeze: up to date
Skipping pytest-lazy-fixture: up to date
Skipping python-datemath: up to date
Skipping pyinstaller: up to date
Skipping python-crontab: up to date
Skipping pyRFC3339: up to date
Skipping python-dateutil: up to date
Skipping python-gflags: up to date
Skipping python-xlib: up to date
Skipping python-jose: up to date
Skipping python-slugify: up to date
Skipping pyxdg: up to date
Skipping qrcode: up to date
Skipping PyYAML: up to date
Skipping hdbcli: up to date
Skipping requests-oauthlib: up to date
Skipping requests: up to date
Skipping retry: up to date
Skipping python-nmap: up to date
Skipping pytz: up to date
Skipping Send2Trash: up to date
Skipping singledispatch: up to date
Skipping s2clientprotocol: up to date
Skipping seaborn: up to date
Skipping six: up to date
Skipping slumber: up to date
Updating Pygments from '2.16.*' to '2.17.*'
Skipping tabulate: up to date
Skipping toml: up to date
Skipping toposort: up to date
Marking pluggy as obsolete since '1.3.0'
Skipping translationstring: up to date
Skipping tqdm: up to date
Skipping netaddr: up to date
Skipping ujson: up to date
Skipping untangle: up to date
Skipping ttkthemes: up to date
Skipping waitress: up to date
Skipping usersettings: up to date
Skipping simplejson: up to date
Skipping vobject: up to date
Skipping WebOb: up to date
Skipping whatthepatch: up to date
Skipping xmltodict: up to date
Skipping regex: up to date
Skipping zxcvbn: up to date
Skipping uWSGI: up to date
Skipping workalendar: up to date
Skipping WTForms: up to date
Skipping zstd: up to date
Updating setuptools from '68.2.*' to '69.0.*'
Marking tzlocal as obsolete since '5.2'
Skipping tree-sitter-languages: up to date
Updating tensorflow from '2.12.*' to '2.15.*'
Marking tree-sitter as obsolete since '0.20.4'
Marking stdlib-list as obsolete since '0.9.0'
Skipping pywin32: up to date
Marking redis as obsolete since '5.0.0'
Marking stripe as obsolete since '7.1.0'
Skipping ibm-db: up to date

@AlexWaygood
Member Author

AlexWaygood commented Nov 22, 2023

I wrote this script (and saved it inside the scripts/ directory) to further test the logic I'm adding here. The assertion passed for all tested py.typed distributions except for mypy and pyvmomi:

Script
import asyncio
import aiohttp
from stubsabot import fetch_pypi_info, release_contains_py_typed

py_typed_packages = [
#    "mypy",
    "Flask-SQLAlchemy",
    "SQLAlchemy",
    "typeshed-stats",
    "urllib3",
    "annoy",
    "freezegun",
    "certifi",
    "cryptography",
    "selenium",
    "emoji",
    "dj-database-url",
#    "pyvmomi",
    "invoke",
    "babel",
    "chardet",
    "prettytable",
    "termcolor",
    "xxhash",
    "orjson",
    "attrs",
]


async def check_package_detected_as_py_typed(distribution: str, session: aiohttp.ClientSession) -> None:
    pypi_info = await fetch_pypi_info(distribution, session=session)
    latest_release = next(release for release in pypi_info.releases_in_descending_order() if not release.version.is_prerelease)
    assert await release_contains_py_typed(latest_release, session=session), distribution


async def check_all_packages() -> None:
    async with aiohttp.ClientSession() as session:
        tasks = (check_package_detected_as_py_typed(package, session) for package in py_typed_packages)
        await asyncio.gather(*tasks)


if __name__ == "__main__":
    asyncio.run(check_all_packages())

The mypy assertion fails because the mypy PyPI distribution provides two packages in the wheel: mypy and mypyc. Only the mypy package is py.typed; this seems like a true positive.

The pyvmomi assertion fails because the pyvmomi distribution provides several packages in the sdist (pyvmomi only provides an sdist, not a wheel): pyVim, pyVmomi, and tests. Of these three packages, only the pyVmomi package contains a py.typed file. The tests package is not installed into site-packages when you do pip install pyvmomi, but the pyVim package is.
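
Both failures follow the same pattern: several top-level packages, only one of which carries a marker. A small sketch of how one might list the offending top-level packages in a wheel is below. The wheel contents are a simplified stand-in rather than the real mypy wheel, and `make_wheel` and `unmarked_top_level_packages` are hypothetical helpers, not part of stubsabot.

```python
import io
import zipfile
from pathlib import PurePosixPath


def make_wheel(file_names):
    """Build an in-memory zip archive containing empty files with the given names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in file_names:
            zf.writestr(name, "")
    buf.seek(0)
    return buf


def unmarked_top_level_packages(wheel_fileobj):
    """Return top-level names that contain a .py file outside any marked directory."""
    with zipfile.ZipFile(wheel_fileobj) as zf:
        paths = [PurePosixPath(i.filename) for i in zf.infolist() if not i.is_dir()]
    marked = {p.parent for p in paths if p.name == "py.typed"}
    unmarked = set()
    for p in paths:
        if p.suffix != ".py":
            continue
        if not any(d in marked for d in [p.parent, *p.parent.parents]):
            # Record the first path component, i.e. the top-level package.
            unmarked.add(p.parts[0])
    return sorted(unmarked)


# Simplified stand-in for a wheel that ships two packages, one of them marked.
two_package_wheel = make_wheel(
    [
        "mypy/__init__.py",
        "mypy/py.typed",
        "mypyc/__init__.py",
        "mypy-0.0.dist-info/METADATA",
    ]
)
offenders = unmarked_top_level_packages(two_package_wheel)
```

In this stand-in, `offenders` is `["mypyc"]`: the distribution as a whole is not reported as typed because one of its top-level packages has no marker.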

@AlexWaygood
Member Author

AlexWaygood commented Nov 22, 2023

So, answering @hauntsaninja's question from #11053 (comment) -- overall, I think this improves the balance between false positives and false negatives. It gets rid of the clear false positive from #11049, and I've only found one possible false negative (pyvmomi) in the semi-random assortment of 27 py.typed packages that I've tested between #11053 (comment) and #11053 (comment).

Collaborator

@hauntsaninja hauntsaninja left a comment

Thanks for looking into it!

@AlexWaygood AlexWaygood merged commit a40e683 into python:main Nov 22, 2023
@AlexWaygood AlexWaygood deleted the stubsabot-pytyped-confusion branch November 22, 2023 22:49