Skip to content

Conversation

nirrozenbaum
Copy link
Contributor

@nirrozenbaum nirrozenbaum commented May 27, 2025

as discussed in scheduler design latest discussions, the plan is to completely remove the Pre/Post Cycle (previously called Pre/Post Schedule) from Scheduler. atm the only plugin that uses both is Prefix.
This PR completely removes PreCycle plugin from the scheduler.

As a side effect, this PR also fixes the following bug:

before this PR, the logic that calculated hashes in Prefix plugin PreCycle could have been buggy, since pods that have the longest prefix may be filtered out in the filter phase.

This PR is fixing this problem by calculating longest prefix only for the pods that passed the filtering phase.

cc: @ahg-g @kfswain @liu-cong

@netlify
Copy link

netlify bot commented May 27, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 27548bf
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6836a20a4893d50008c51747
😎 Deploy Preview https://deploy-preview-876--gateway-api-inference-extension.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 27, 2025
@k8s-ci-robot k8s-ci-robot requested review from Jeffwan and robscott May 27, 2025 18:40
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 27, 2025
@liu-cong
Copy link
Contributor

/assign

@liu-cong
Copy link
Contributor

the plan is to completely remove the Pre/Post Cycle (previously called Pre/Post Schedule) from Scheduler

I am fine with removing the preCycle as there is no use case for it anyway.

Is the plan to replace PostCycle with PostResponse? IMO we need something like PreRequest. Consider this scenario, two requests longPrefix+suffix1 and longPrefix+suffix2 arrive around the same time. Suppose each request will take a long time to get the response back, with PostResponse you can schedule them onto 2 different pods (no prefix affinity) due to the latency of getting the response back. With PreRequest we can still achieve prefix cache affinity.

Copy link
Contributor

@liu-cong liu-cong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2025
@nirrozenbaum
Copy link
Contributor Author

@liu-cong applied the suggested changes, need lgtm again :)

@liu-cong
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2025
@nirrozenbaum
Copy link
Contributor Author

the plan is to completely remove the Pre/Post Cycle (previously called Pre/Post Schedule) from Scheduler

I am fine with removing the preCycle as there is no use case for it anyway.

Is the plan to replace PostCycle with PostResponse? IMO we need something like PreRequest. Consider this scenario, two requests longPrefix+suffix1 and longPrefix+suffix2 arrive around the same time. Suppose each request will take a long time to get the response back, with PostResponse you can schedule them onto 2 different pods (no prefix affinity) due to the latency of getting the response back. With PreRequest we can still achieve prefix cache affinity.

@liu-cong the discussion on scheduler extension points and interfaces is still ongoing (in PR #845).
as far as I understand, the below diagram is where we're currently at in terms of the discussions (or something along these lines, e.g., names may be different). things can still change as this is still WIP.
image

@ahg-g
Copy link
Contributor

ahg-g commented May 28, 2025

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 28, 2025
@k8s-ci-robot k8s-ci-robot merged commit 7c830cb into kubernetes-sigs:main May 28, 2025
8 checks passed
@nirrozenbaum nirrozenbaum deleted the rm-precycle branch May 29, 2025 03:32
irar2 pushed a commit to irar2/gateway-api-inference-extension that referenced this pull request Jun 3, 2025
* remove the PreCycle plugin from scheduler

Signed-off-by: Nir Rozenbaum <[email protected]>

* Apply suggestions from code review

Co-authored-by: Cong Liu <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
shmuelk pushed a commit to shmuelk/gateway-api-inference-extension that referenced this pull request Jun 3, 2025
* remove the PreCycle plugin from scheduler

Signed-off-by: Nir Rozenbaum <[email protected]>

* Apply suggestions from code review

Co-authored-by: Cong Liu <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
shmuelk pushed a commit to shmuelk/gateway-api-inference-extension that referenced this pull request Jun 9, 2025
* remove the PreCycle plugin from scheduler

Signed-off-by: Nir Rozenbaum <[email protected]>

* Apply suggestions from code review

Co-authored-by: Cong Liu <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 11, 2025
* remove the PreCycle plugin from scheduler

Signed-off-by: Nir Rozenbaum <[email protected]>

* Apply suggestions from code review

Co-authored-by: Cong Liu <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants