Verify URLs that link to the project page on PyPI #16485
Co-authored-by: William Woodruff <[email protected]>
LGTM, thanks @facutuesca!
"verified": _verify_url(
    url=url,
    publisher_url=publisher_base_url,
    project_name=project.name,
    project_normalized_name=project.normalized_name,
),
I think I would prefer something like this instead unless there's a strong reason not to:
- "verified": _verify_url(
-     url=url,
-     publisher_url=publisher_base_url,
-     project_name=project.name,
-     project_normalized_name=project.normalized_name,
- ),
+ "verified": any(
+     check(
+         url=url,
+         publisher_url=publisher_base_url,
+         project_name=project.name,
+         project_normalized_name=project.normalized_name,
+     )
+     for check in [_verify_url_pypi, _verify_url_with_trusted_publisher]
+ ),
It would mean we need to pass the same parameters to both _verify_* functions, so some of them will be unused in each.
I think that's OK? I'm concerned that _verify_url has logic that short-circuits verification outside of the individual checks, and I think it will be more straightforward to write additional checks and add them to this list than to determine where they should fall in that function's logic.
(But also, want to ensure that we're still lazily evaluating these checks)
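To illustrate the lazy-evaluation point: a minimal sketch, assuming two hypothetical stand-in checks (`check_pypi` and `check_publisher` are not Warehouse functions). Because `any()` consumes a generator expression, each check is only called when the previous one returned a falsy value.

```python
calls = []  # records which checks actually ran, for demonstration only


def check_pypi(url: str) -> bool:
    # Hypothetical stand-in for a cheap check.
    calls.append("pypi")
    return url.startswith("https://pypi.org/")


def check_publisher(url: str) -> bool:
    # Hypothetical stand-in for a second, possibly more expensive check.
    calls.append("publisher")
    return url.startswith("https://github.com/example/")


def is_verified(url: str) -> bool:
    # The list holds function objects only; nothing is called until any()
    # pulls the next item from the generator, and any() stops at the first
    # truthy result.
    return any(check(url) for check in [check_pypi, check_publisher])
```

Wrapping the generator in a list comprehension (`any([check(url) for ...])`) would call every check up front and lose this short-circuiting.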
I see your point. Could you check trail-of-forks#3013? It's my idea for moving the TP-specific verification to the OIDCPublisherMixin class, since we'll need to specialize them depending on the specific TP provider.
If we go ahead with that, at least one check will be a method of that class, and the proposed change here would have to be:
any(
    check(...)
    for check in (
        [_verify_url_pypi, publisher.verify_url] if publisher else [_verify_url_pypi]
    )
)
(or similar), and we would still need to keep the same common list of parameters for all verify functions.
If that's fine I can accept the suggestion. My only concern is that as verification logic grows, having all current and future functions take the same (possibly growing) list of parameters might not be great.
> since we'll need to specialize them depending on the specific TP provider.
That's a good point. I think I'd be in favor of that refactor once we introduce more verifications. Then this could become something like:
- "verified": _verify_url(
-     url=url,
-     publisher_url=publisher_base_url,
-     project_name=project.name,
-     project_normalized_name=project.normalized_name,
- ),
+ "verified": any(
+     check(url)
+     for check in [
+         partial(
+             _verify_url_pypi,
+             project_name=project.name,
+             project_normalized_name=project.normalized_name,
+         ),
+         publisher.verify_url if publisher else lambda _: False,
+     ]
+ ),
I think I'm over-optimizing a bit here though, going to approve/merge this as-is for now and we can revisit when things change.
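For reference, a self-contained sketch of the `partial`-based shape discussed in this thread. Everything here is an assumption for illustration: `verify_url_pypi`, `FakePublisher`, and `is_verified` are hypothetical names, and the matching rules are deliberately simplified rather than the logic merged in this PR. The point is only that binding the check-specific arguments up front lets every check share the one-argument signature `check(url)`.

```python
from functools import partial
from typing import Optional


def verify_url_pypi(url: str, project_name: str, project_normalized_name: str) -> bool:
    # Hypothetical check: exact match against the project page URL.
    candidates = {
        f"https://pypi.org/project/{name}/"
        for name in (project_name, project_normalized_name)
    }
    return url in candidates


class FakePublisher:
    # Hypothetical stand-in for a publisher object exposing verify_url().
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def verify_url(self, url: str) -> bool:
        # Accept the base URL itself or anything below it as a path.
        return url == self.base_url or url.startswith(self.base_url + "/")


def is_verified(
    url: str,
    project_name: str,
    project_normalized_name: str,
    publisher: Optional[FakePublisher] = None,
) -> bool:
    # Bind the PyPI-specific arguments now; the call itself happens later.
    checks = [
        partial(
            verify_url_pypi,
            project_name=project_name,
            project_normalized_name=project_normalized_name,
        )
    ]
    if publisher is not None:
        checks.append(publisher.verify_url)
    # Generator expression keeps evaluation lazy: any() stops at the first True.
    return any(check(url) for check in checks)
```

Appending the publisher check conditionally avoids the `lambda _: False` placeholder while keeping the same behavior when no publisher is present.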
Fixes #16474
Verifies URLs that link to the project's page on PyPI. URLs are normalized using rfc3986, and trailing slashes (e.g. https://pypi.org/p/my_project/) can be present or not.
(Homepage links to https://pypi.org/project/$PROJECT_NAME)
cc @di @woodruffw
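
A rough sketch of the kind of comparison described above, for readers who want something runnable. It is not the code in this PR: it uses the standard library's `urllib.parse` as a stand-in for rfc3986 normalization, and `links_to_pypi_project` and its parameters are hypothetical. It assumes the project page lives under `/project/<name>` or `/p/<name>`, as in the example URLs above.

```python
from urllib.parse import urlsplit


def links_to_pypi_project(url: str, names: set[str]) -> bool:
    # `names` is assumed to hold the acceptable spellings of the project
    # name (for example the raw and the normalized name).
    parts = urlsplit(url)  # urlsplit lowercases the scheme for us
    if parts.scheme != "https" or parts.netloc.lower() != "pypi.org":
        return False
    # Treat ".../my_project" and ".../my_project/" as the same page.
    path = parts.path.removesuffix("/")
    return any(path in (f"/project/{name}", f"/p/{name}") for name in names)
```

For example, `links_to_pypi_project("https://pypi.org/p/my_project/", {"my_project"})` and the same URL without the trailing slash are both accepted, while a URL on another host is rejected. `str.removesuffix` requires Python 3.9 or later.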