Skip to content

Conversation

facutuesca
Copy link
Contributor

@facutuesca facutuesca commented Aug 14, 2024

Fixes #16474

Verifies URLs that link to the project's page on PyPI. URLs are normalized using rfc3986, and trailing slashes (e.g: https://pypi.org/p/my_project/) can be present or not.

image

(Homepage links to https://pypi.org/project/$PROJECT_NAME)

cc @di @woodruffw

@woodruffw woodruffw added the UX/UI design, user experience, user interface label Aug 14, 2024
Copy link
Member

@woodruffw woodruffw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @facutuesca!

Comment on lines +906 to +911
"verified": _verify_url(
url=url,
publisher_url=publisher_base_url,
project_name=project.name,
project_normalized_name=project.normalized_name,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer something like this instead unless there's a strong reason not to:

Suggested change
"verified": _verify_url(
url=url,
publisher_url=publisher_base_url,
project_name=project.name,
project_normalized_name=project.normalized_name,
),
"verified": any(
check(
url=url,
publisher_url=publisher_base_url,
project_name=project.name,
project_normalized_name=project.normalized_name,
)
for check in [_verify_url_pypi, _verify_url_with_trusted_publisher]
),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would mean we need to pass the same parameters to both _verify_* functions, so some of them will be unused in each.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's OK? I'm concerned that _verify_url has logic that short-circuits verification outside of the individual checks, and thinking it'll be more straightforward to just write additional checks and add them to this list than determine where they should fall in the logic for that function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(But also, want to ensure that we're still lazily evaluating these checks)

Copy link
Contributor Author

@facutuesca facutuesca Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. Could you check trail-of-forks#3013 ? It's my idea for moving the TP-specific verification to the OIDCPublisherMixin class, since we'll need to specialize them depending on the specific TP provider.

If we go ahead with that, at least one check will be a method of that class, and the proposed change here would have to be:

any(
    check(...)
    for check in [_verify_url_pypi, publisher.verify_url] if publisher else [_verify_url_pypi])

(or similar), and we would still need to keep the same common list of parameters for all verify functions.

If that's fine I can accept the suggestion. My only concern is that as verification logic grows, having all current and future functions take the same (possibly growing) list of parameters might not be great.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we'll need to specialize them depending on the specific TP provider.

That's a good point. I think I'd be in favor of that refactor once we introduce more verifications. Then this could become something like:

Suggested change
"verified": _verify_url(
url=url,
publisher_url=publisher_base_url,
project_name=project.name,
project_normalized_name=project.normalized_name,
),
"verified": any(
check(url)
for check in [
partial(
_verify_url_pypi(
project_name=project_name,
project_normalized_name=project_normalized_name,
)
),
partial(publisher.verify_url) if publisher else lambda _: False
),

I think I'm over-optimizing a bit here though, going to approve/merge this as-is for now and we can revisit when things change.

@di di enabled auto-merge (squash) August 15, 2024 18:13
@di di merged commit b4b0424 into pypi:main Aug 15, 2024
@facutuesca facutuesca deleted the verify-pypi-urls branch August 16, 2024 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
UX/UI design, user experience, user interface
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Project metadata: direct PyPI URL is not considered verified
3 participants