merge has capacity filter with sheddable filter. #809

nirrozenbaum · 2025-05-09T05:48:08Z

The has-capacity filter was only used for sheddable requests (it was a passthrough for critical ones), so this PR merges it into the sheddable filter.

Signed-off-by: Nir Rozenbaum <[email protected]>

netlify · 2025-05-09T05:48:14Z

✅ Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `17b1c81` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/gateway-api-inference-extension/deploys/681e5a0577a27100087ba26e |
| 😎 Deploy Preview | https://deploy-preview-809--gateway-api-inference-extension.netlify.app |

nirrozenbaum · 2025-05-09T05:48:31Z

cc @liu-cong @ahg-g

liu-cong · 2025-05-09T16:30:49Z

pkg/epp/scheduling/plugins/filter/filter_test.go

 		{
 			name:   "lowQueueAndLessThanKVCacheThresholdPredicate",
-			filter: &HasCapacityFilter{queueThreshold: 0, kvCacheThreshold: 0.8},
+			req:    &types.LLMRequest{Critical: false},


nit: Can you also add a test case for `Critical: true`?

This may not be worth the effort, considering #808 --> #805.
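A table-driven case covering both request classes could look roughly like the sketch below. The names `request`, `pod`, and `sheddableCapacityFilter` are hypothetical stand-ins rather than the project's actual types, and the `<=` threshold comparison is an assumption about how the predicate behaves.

```go
package main

import "fmt"

// request and pod are hypothetical stand-ins for the project's request and
// pod-metrics types; the field names are assumptions.
type request struct{ Critical bool }

type pod struct {
	Name         string
	WaitingQueue int
	KVCacheUsage float64
}

// sheddableCapacityFilter sketches the merged behaviour discussed in this PR:
// critical requests pass through untouched, sheddable requests only see pods
// that are under both thresholds. The "<=" comparisons are an assumption.
func sheddableCapacityFilter(req request, pods []pod, queueThreshold int, kvCacheThreshold float64) []pod {
	if req.Critical {
		return pods
	}
	var kept []pod
	for _, p := range pods {
		if p.WaitingQueue <= queueThreshold && p.KVCacheUsage <= kvCacheThreshold {
			kept = append(kept, p)
		}
	}
	return kept
}

// testCase follows the table-driven style used in filter_test.go.
type testCase struct {
	name string
	req  request
	want int // number of pods expected to survive the filter
}

// runCases evaluates every case and returns a description of each failure.
func runCases() []string {
	pods := []pod{
		{Name: "idle", WaitingQueue: 0, KVCacheUsage: 0.5},
		{Name: "busy", WaitingQueue: 3, KVCacheUsage: 0.9},
	}
	cases := []testCase{
		{name: "sheddable request drops saturated pods", req: request{Critical: false}, want: 1},
		{name: "critical request passes through", req: request{Critical: true}, want: 2},
	}
	var failures []string
	for _, tc := range cases {
		got := sheddableCapacityFilter(tc.req, pods, 0, 0.8)
		if len(got) != tc.want {
			failures = append(failures, fmt.Sprintf("%s: got %d pods, want %d", tc.name, len(got), tc.want))
		}
	}
	return failures
}

func main() {
	for _, f := range runCases() {
		fmt.Println("FAIL:", f)
	}
}
```

In the real test file the same two rows would sit in the existing test table, with `t.Run` and the project's own comparison helpers in place of `runCases`.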

kfswain · 2025-05-09T16:54:02Z

/approve
will leave lgtm to others

k8s-ci-robot · 2025-05-09T16:54:09Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [kfswain]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

LukeAVanDrie · 2025-05-09T19:21:52Z

Just a heads up: I delete these files anyway in #805. Capacity decisions should not be the scheduler's responsibility, per the architecture proposal. Admission control (and criticality-based service differentiation) should happen outside the scheduler (long term, in the flow controller). The scheduler should then pick the optimal pod for routing approved requests. No reason not to submit this, though.
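As a rough illustration of that split, the sketch below separates an admission step from a scheduling step. Everything here (`admit`, `schedule`, `errShed`, the queue-length threshold, and the shortest-queue policy) is a hypothetical simplification, not the design in the architecture proposal or in #805.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins; not the project's real types.
type request struct{ Critical bool }

type pod struct {
	Name  string
	Queue int
}

var errShed = errors.New("request shed: no capacity for sheddable traffic")

// admit is an assumed admission-control step that runs before scheduling.
// Critical requests are always admitted; sheddable requests are rejected
// when no pod has a queue at or below queueMax.
func admit(req request, pods []pod, queueMax int) error {
	if req.Critical {
		return nil
	}
	for _, p := range pods {
		if p.Queue <= queueMax {
			return nil
		}
	}
	return errShed
}

// schedule only routes requests that were already admitted. As a placeholder
// policy it picks the pod with the shortest queue.
func schedule(pods []pod) (pod, error) {
	if len(pods) == 0 {
		return pod{}, errors.New("no pods available")
	}
	best := pods[0]
	for _, p := range pods[1:] {
		if p.Queue < best.Queue {
			best = p
		}
	}
	return best, nil
}

func main() {
	pods := []pod{{Name: "a", Queue: 4}, {Name: "b", Queue: 2}}
	for _, req := range []request{{Critical: false}, {Critical: true}} {
		if err := admit(req, pods, 1); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		chosen, err := schedule(pods)
		if err != nil {
			fmt.Println("scheduling failed:", err)
			continue
		}
		fmt.Println("routed to", chosen.Name)
	}
}
```

With this shape the scheduler never needs to know about criticality; it only ranks pods for requests that were let through.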

nirrozenbaum · 2025-05-09T19:28:29Z

> Just a heads up, I delete these files anyways in #805. Capacity decisions should not be a responsibility of the scheduler per the architecture proposal. Admission control (and criticality based service differentiation) should happen outside the scheduler (long term in the flow controller). The scheduler should then decide the optimal pod to route approved requests to. No reason not to submit this though.

@LukeAVanDrie yeah, sounds good.
The current PR changes nothing except merging two filters that are effectively the same (one calls the other). Once criticality handling is done in flow control, this can be removed.
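To make "one calls the other" concrete, here is a minimal sketch of the before and after shapes. All names (`podFilter`, `hasCapacity`, `sheddableBefore`, `sheddableMerged`) are hypothetical and do not reproduce the project's code; the `<=` threshold semantics are assumed.

```go
package main

import "fmt"

// Hypothetical stand-ins; not the project's real types.
type request struct{ Critical bool }

type pod struct {
	Queue   int
	KVCache float64
}

type podFilter func(req request, pods []pod) []pod

// underThresholds reports whether a pod is below both limits (assumed "<=").
func underThresholds(p pod, queueMax int, kvMax float64) bool {
	return p.Queue <= queueMax && p.KVCache <= kvMax
}

// hasCapacity is the standalone capacity filter from the "before" shape.
func hasCapacity(queueMax int, kvMax float64) podFilter {
	return func(_ request, pods []pod) []pod {
		var kept []pod
		for _, p := range pods {
			if underThresholds(p, queueMax, kvMax) {
				kept = append(kept, p)
			}
		}
		return kept
	}
}

// Before: the sheddable filter only decides whether to delegate to inner.
func sheddableBefore(inner podFilter) podFilter {
	return func(req request, pods []pod) []pod {
		if req.Critical {
			return pods // passthrough for critical requests
		}
		return inner(req, pods)
	}
}

// After: a single filter owns both the criticality check and the thresholds.
func sheddableMerged(queueMax int, kvMax float64) podFilter {
	return func(req request, pods []pod) []pod {
		if req.Critical {
			return pods
		}
		var kept []pod
		for _, p := range pods {
			if underThresholds(p, queueMax, kvMax) {
				kept = append(kept, p)
			}
		}
		return kept
	}
}

func main() {
	pods := []pod{{Queue: 0, KVCache: 0.5}, {Queue: 3, KVCache: 0.9}}
	before := sheddableBefore(hasCapacity(0, 0.8))
	after := sheddableMerged(0, 0.8)
	for _, req := range []request{{Critical: false}, {Critical: true}} {
		fmt.Println(req.Critical, len(before(req, pods)), len(after(req, pods)))
	}
}
```

Both shapes return the same pods for every request, which is why the merge is behaviour-preserving under these assumptions.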

LukeAVanDrie · 2025-05-09T19:35:38Z

> once criticality handling is done in flow control this can be removed

This is unrelated to this PR, but I guess long term, we need to also decide if this separation of responsibilities (specifically, request shedding) is a hard rule or just for the reference implementation. I can see instances where implementers would have custom scheduling plugins that may want to drop requests still.

liu-cong · 2025-05-09T20:22:48Z

/lgtm

* merge has capacity filter with sheddable filter.

  has capacity only use was for sheddable requests (passthrough for critical ones).

  Signed-off-by: Nir Rozenbaum <[email protected]>

* Update pkg/epp/scheduling/plugins/filter/filter_test.go

  Co-authored-by: Cong Liu <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Cong Liu <[email protected]>

merge has capacity filter with sheddable filter.

cc69319

has capacity only use was for sheddable requests (passthrough for critical ones). Signed-off-by: Nir Rozenbaum <[email protected]>

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 9, 2025

k8s-ci-robot requested review from Jeffwan and robscott May 9, 2025 05:48

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 9, 2025

liu-cong reviewed May 9, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 9, 2025

Update pkg/epp/scheduling/plugins/filter/filter_test.go

17b1c81

Co-authored-by: Cong Liu <[email protected]>

k8s-ci-robot assigned liu-cong May 9, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 9, 2025

k8s-ci-robot merged commit 2dce3ea into kubernetes-sigs:main May 9, 2025
8 checks passed

nirrozenbaum deleted the sheddable-filter branch May 9, 2025 20:28

5 participants
