@MrFreezeex
Member

@MrFreezeex MrFreezeex commented Nov 25, 2025

  • One-line PR description: add more conflict conditions on asymmetrical traffic
  • Other comments:

Make ports raise a conflict when they are not an exact match, and add a note stating that implementations must not redirect traffic to endpoints from services that don't actually declare the port.

Also suggest doing the same for IPFamilies, which might have asymmetrical issues. It's merely a suggestion, as IPFamilies handling is implementation defined and some implementations may not have issues like that.
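
To make the port rule above concrete, here is a minimal sketch, assuming each cluster's exported ports can be modelled as a set of (name, protocol, port) records. The `Port` type and the `merge_ports` helper are hypothetical names used only for illustration; they are not part of the MCS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    # Hypothetical, simplified stand-in for a Service port entry.
    name: str
    protocol: str
    port: int


def merge_ports(exported):
    """Return (union_of_ports, conflict) for a dict of cluster -> set of Port.

    The clusterset service exposes the union; a conflict is flagged whenever
    at least one cluster's port set is not an exact match with the others.
    """
    port_sets = [frozenset(ports) for ports in exported.values()]
    union = frozenset().union(*port_sets)
    # If every set equals the union, all sets are identical: no conflict.
    conflict = any(ports != union for ports in port_sets)
    return union, conflict


old = {Port("http", "TCP", 80)}
new = {Port("http", "TCP", 80), Port("metrics", "TCP", 81)}
union, conflict = merge_ports({"cluster-a": old, "cluster-b": new})
```

With these inputs the union holds both ports and `conflict` is true, which is the situation in which the proposed text asks for a conflict condition to be raised.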

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 25, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MrFreezeex
Once this PR has been reviewed and has the lgtm label, please assign jeremyot for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 25, 2025
@mikemorris
Member

Suggested approach makes sense to me and feels less disruptive than changing the guidance from union to intersection.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 25, 2025
@zhiying-lin
Contributor

LGTM, thank you Arthur!

  set of exported services don’t match, the clusterset service will expose the
- union of service ports declared on its constituent services.
+ union of service ports declared on its constituent services and raise a `PortConflict`
+ conflict condition. In that case, network traffic must be directed only to endpoints
Contributor

In the sentence above on how IPFamilies should be handled, it's directed that the implementer "may" raise a conflict, while the one I'm commenting on here, which is for ports, says they "will". This line about ports is also stricter on what must be done for routing ("must be directed only") than the IPFamilies description above ("might result in network traffic reaching only a subset"). Is the difference in how these are treated on purpose? Based on what I saw in the notes from when we discussed this in SIG-MC (ref), I think both should mandate that raising the conflict is required, but how the implementation routes should be implementation defined.

Member Author

Ah yes indeed, I used "may" for IPFamilies because the exact handling is all implementation defined, but since there is a "when" in the sentence, which may not apply to some implementations, it seems fine to change the "may" to a "must", and some implementations won't need to care about that at all. We would most likely not be able to check that in the conformance tests though, but that's a separate concern from the KEP anyway!

@MrFreezeex MrFreezeex force-pushed the KEP1645-port-ipfamilies-more-conflict branch from 3442bf2 to 639a31b Compare December 3, 2025 18:17
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 3, 2025
@MrFreezeex MrFreezeex force-pushed the KEP1645-port-ipfamilies-more-conflict branch from 639a31b to 8f3993b Compare December 3, 2025 18:26
@MrFreezeex MrFreezeex force-pushed the KEP1645-port-ipfamilies-more-conflict branch from 8f3993b to baf4e0c Compare December 9, 2025 17:49
@lauralorenz
Contributor

Talked in SIG-MC 12/9 about the change and how this PR is addressing two categories of thing (IPFamilies, Ports) that themselves have two things to address (whether to raise a conflict condition, and how prescriptive the KEP is about how predictable the routing is thereafter). The wording at the time seemed to me to have IPFamilies be strict on raising condition, but loose on routing, while in the Ports case it was strict on both. I talked about how I wanted the two categories (IPFamilies and Ports) the same in how they approach both of those since to me they seemed the same problem. As of now the wording was updated so that they treat each the same, so

/lgtm

That being said I want to add some background on what we talked about because I think it's relevant for any future changes.

We did talk a little bit about a service where some of its endpoints have a new port (like old backends exposing port 80 but newer backends exposing port 80 and port 81), and how that's a little different in how the consumer chooses which one to contact (as they may be materially different applications?) vs what we think a consumer would expect to be different when contacting a service over IPv4 vs IPv6 (though Arthur said it could be possible that an intermediary gateway could still make those meaningfully different services in a predictable way if a user wanted to, lol). THEN I got stuck on how that situation was an invalid representation of the philosophical assumption in MCS that all backends are fungible as the same Service. THEN THEN Arthur brought up that the fact that there is a conflict at all, especially at the Port level, is already degrading that assumption in the first place. And in the end we got to a place where the routing is now not prescriptive for either of them, BUT I'm open to hardening that in the future, especially as we see what implementations do.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 9, 2025
- union of service ports declared on its constituent services.
+ union of service ports declared on its constituent services and raise a `PortConflict`
+ conflict condition. In that case, network traffic should be directed only to endpoints
+ from constituent services that actually expose the targeted port.
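
As an illustration of the routing recommendation in the proposed wording, here is a minimal sketch, assuming endpoints and declared port names are tracked per cluster. `eligible_endpoints` and both dictionaries are hypothetical; real implementations are free to select backends differently.

```python
def eligible_endpoints(endpoints_by_cluster, port_names_by_cluster, target_port):
    """Return endpoints only from clusters whose service declares target_port.

    endpoints_by_cluster: dict of cluster name -> list of endpoint addresses
    port_names_by_cluster: dict of cluster name -> set of declared port names
    """
    selected = []
    for cluster, addresses in endpoints_by_cluster.items():
        # Skip clusters whose constituent service does not expose the port.
        if target_port in port_names_by_cluster.get(cluster, set()):
            selected.extend(addresses)
    return selected


endpoints = {"cluster-a": ["10.0.0.1"], "cluster-b": ["10.1.0.1", "10.1.0.2"]}
declared = {"cluster-a": {"http"}, "cluster-b": {"http", "metrics"}}

metrics_backends = eligible_endpoints(endpoints, declared, "metrics")
http_backends = eligible_endpoints(endpoints, declared, "http")
```

Here traffic for `metrics` would only reach the two `cluster-b` endpoints, while traffic for `http` could reach all three.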
Contributor

@tpantelis tpantelis Dec 9, 2025

Can you clarify "targeted port", specifically in relation to the prior language that talks about "ports" (plural)? I assume by "targeted port" you mean a port that is in conflict, meaning one that is configured on one constituent service but not another. It sounds like you're saying such a port should still be exposed but only from the constituent cluster that has it. If so, is this now a strict requirement for implementations?

BTW, in Submariner, we only expose a port if it's configured for every constituent service.
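
For contrast with the union described in the KEP text, the behavior mentioned here (exposing a port only if every constituent service configures it) amounts to an intersection. The sketch below is only an illustration of that idea with a hypothetical `common_ports` helper and ports modelled as (name, protocol, number) tuples; it is not Submariner's actual code.

```python
def common_ports(exported):
    """Return the ports configured on every constituent service.

    exported: dict of cluster name -> set of (name, protocol, number) tuples
    """
    port_sets = [set(ports) for ports in exported.values()]
    if not port_sets:
        return set()
    return set.intersection(*port_sets)


exported = {
    "cluster-a": {("http", "TCP", 80)},
    "cluster-b": {("http", "TCP", 80), ("metrics", "TCP", 81)},
}
shared = common_ports(exported)
```

With these inputs only the `http` port is shared, so the `metrics` port would not be exposed under an intersection approach.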

Member Author

@MrFreezeex MrFreezeex Dec 10, 2025

> Can you clarify "targeted port", specifically in relation to the prior language that talks about "ports" (plural)? I assume by "targeted port" you mean a port that is in conflict, meaning one that is configured on one constituent service but not another.

"targeted port" refers to the network traffic (I will try to clarify the sentence), but yes, it matches your explanation.

> If so, is this now a strict requirement for implementations?

This PR started like that, but the "must" is now a "should", so in the current text it's more a recommendation for implementations than a strict requirement.

> BTW, in Submariner, we only expose a port if it's configured for every constituent service.

Ah! But how does that work with the fact that the ServiceImport is exposing a union somehow?

In Cilium we do the union on the ServiceImport, like the MCS-API KEP is enforcing, and pass it down directly to our derived Service, and we keep the port name/number from the EndpointSlice/backend that we get from all the clusters. And IIUC what happens in kube-proxy, and similarly in Cilium, is that the port is matched by its name between the EndpointSlice/backends and the Service. So if you add a new port, we will correctly route traffic only to the constituent clusters that actually have this port exposed; however, if there is a conflict on the port name, it might have some weird behavior (and it makes me think that I should probably at least document that edge case on our side 😅).
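
The name-based matching described above can be sketched roughly as follows. This is a simplified model with a hypothetical `EndpointSlice` record and `backends_for` helper, not the kube-proxy or Cilium code, and the third cluster is an assumed example of what a port-name conflict could look like.

```python
from dataclasses import dataclass


@dataclass
class EndpointSlice:
    # Hypothetical, simplified view of one cluster's EndpointSlice.
    cluster: str
    addresses: list
    ports: dict  # port name -> port number


def backends_for(service_port_name, slices):
    """Select (cluster, address, port) backends by matching on port name."""
    backends = []
    for s in slices:
        number = s.ports.get(service_port_name)
        if number is None:
            continue  # this cluster does not expose the named port
        backends.extend((s.cluster, addr, number) for addr in s.addresses)
    return backends


slices = [
    EndpointSlice("cluster-a", ["10.0.0.1"], {"http": 80}),
    EndpointSlice("cluster-b", ["10.1.0.1"], {"http": 80, "metrics": 81}),
    # Assumed conflict: same port name, different number.
    EndpointSlice("cluster-c", ["10.2.0.1"], {"http": 8080}),
]

metrics = backends_for("metrics", slices)
http = backends_for("http", slices)
```

In this model a newly added `metrics` port only selects `cluster-b`, while the shared name `http` selects all three clusters even though `cluster-c` uses a different number, which is one way the name-conflict edge case could show up.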

Contributor

> Ah! But how does that work with the fact that the ServiceImport is exposing a union somehow?

The ServiceImport union is just for conformance - we don't use it. The real action is with the EndpointSlices.

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/multicluster Categorizes an issue or PR as relevant to SIG Multicluster. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

6 participants