-
Notifications
You must be signed in to change notification settings - Fork 180
move PostResponse plugins to requestcontrol instead of scheduler #914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Nir Rozenbaum <[email protected]>
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Signed-off-by: Nir Rozenbaum <[email protected]>
func NewDirector(datastore datastore.Datastore, scheduler Scheduler, saturationDetector SaturationDetector) *Director { | ||
return &Director{datastore, scheduler, saturationDetector} | ||
// WithPostResponsePlugins sets the given plugins as the PostResponse plugins. | ||
// If the Director has PostResponse plugins already, this call replaces the existing plugins with the given ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding to the list might be more appropriate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally left it out of the list, since this is an optional field.
I would like to avoid creation of empty slice (or using nil) when caller doesn't need any PostResponsePlugins.
If I add it as arg the code will look like:
director := requestcontrol.NewDirector(datastore, scheduler, detector, []PostResponsePlugin{})
OR
director := requestcontrol.NewDirector(datastore, scheduler, detector, nil)
on the other hand, since this field is optional, it is possible to initialize detector with or without it like this -
without:
director := requestcontrol.NewDirector(datastore, scheduler, detector)
with:
director := requestcontrol.NewDirector(datastore, scheduler, detector).
WithPostResponsePlugins(plugin1, plugin2, ...)
the latter gets also the same feeling of the Scheduler plugins.
pkg/epp/requestcontrol/types.go
Outdated
// Headers is a map of the response headers. Nil during body processing | ||
Headers map[string]string | ||
// Body Is the body of the response or nil during header processing | ||
Body map[string]string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: why is body a map and not a string (or array of strings, when streaming)?
Is it the parsed JSON?
If the post response plugin needs to know the model server (target Pod) selected, how is it communicated? Should it be stored in a previous callback and used here (e.g., map of request-id to selected pod)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. this was originally a string. I tested something locally and forgot to put back as string.
fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the post response plugin needs to know the model server (target Pod) selected, how is it communicated? Should it be stored in a previous callback and used here (e.g., map of request-id to selected pod)?
This is stored in a struct called RequestContext that gets filled during the lifecycle of a request.
https://github.com/nirrozenbaum/gateway-api-inference-extension/blob/post-response/pkg/epp/requestcontrol/director.go#L206
scorers: []*WeightedScorer{}, | ||
postCyclePlugins: []PostCycle{}, | ||
PostResponsePlugins: []PostResponse{}, | ||
filters: []Filter{}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: realize this is a copy over, we should be consistent in naming (e.g., filter
vs filterPlugin
on the one hand and postCyclePlugins
on the other)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once PostResponse is out of scheduler, the next PR will be to remove PostCycle completely (there is no use case for that). we should be left with filters, scorers and picker here. so this is a temp state which should be resolved very soon.
package requestcontrol | ||
|
||
// Response contains information from the response received to be passed to PostResponse plugins | ||
type Response struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how different than that defined in pkg/epp/scheduling/types/types.go?
Should we converge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently it is not different.
I'm not sure if we should converge or keep them separate so each can evolve according to the package requirements.
in this PR the focus was solely on moving PostResponse out of Scheduler and into the requestcontrol layer. if we think it's better to converge Plugin interface I suggest to do it in a follow up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are mixising the schedulingtypes.LLMRequest
and requescontrol.Response
, I think we should at least be consistent.
I don't see a good reason why they need to differ. I think we can just consolidate them to a shared Request/Response objects, perhaps at the top level. If for any reason we need scheduler/director specific metadata they can be added to scheduler/director specific structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is the main point here:
once PostResponse is out of Scheduler (I think we all agree it should), scheduler shouldn't care about anything related to response, and thus response is in the scope of requestcontrol.
on the other hand, in order to do a schedule call, one needs to provide the scheduler representation of the request.
(this is the current scheduling.LLMRequest).
in PostResponse we need a representation of the request that was scheduled, ideally after Prompt and other request properties have been unmarshaled, and the response.
the unmarshalled object we have for the request is the Scheduler request. I can also switch to using requestcontrol.Request but that would introduce another duplication of fields, so I preferred avoiding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me now multiple packages need to access some common request/response
structs, and it makes sense to create a common types
package containing request/response
. In the case of a specific package needing to extend from the common types, e.g., in scheduling, you can do
package scheduling
type Request struct {
commontypes.Request
additionalFields
}
But I am OK with deferring this.
Signed-off-by: Nir Rozenbaum <[email protected]>
d289af7
to
9316b51
Compare
Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Nir Rozenbaum <[email protected]>
…hen recording plugin time Signed-off-by: Nir Rozenbaum <[email protected]>
Can you raise an issue to discuss this? I will comment on some use cases that PostCycle/PreRequest is still needed. |
@liu-cong sure. we do have a PR where we discuss requirements - #905. this PR handles only the moving of PostResponse out of scheduler and into the requestcontrol layer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a couple of nits, otherwise lgtm
|
||
saturationDetector := saturationdetector.NewDetector(sdConfig, datastore, ctrl.Log) | ||
|
||
director := requestcontrol.NewDirector(datastore, scheduler, saturationDetector) // can call "director.WithPostResponsePlugins" to add post response plugins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I prefer adding WithPostResponsePlugins as an "Option" object and add that as an optinal argument to the NewDirector(). This is more discoverable and remove the need of this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't really need this comment :)
this comment was added just until we add the WithPostResponse usage in main.go
.
I implemented it this way to keep it consistent with how scheduler plugins are defined.
In general, both patterns are commonly used in go and more specifically in Kubernetes, but personally I prefer using the With...
approach which reads clearer to me and also allows adding it only when used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option pattern is optional as well, you only use when you need it. But it's more discoverable in the function signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The option pattern is optional as well, you only use when you need it.
right. I was not trying to say otherwise :).
was just making the point that both are common patterns that are widely used in the community, and personally I prefer the With..
approach, which is also aligned with what was done in Scheduler. so it keeps the plugins setup consistent across the layers.
RequestRunning bool | ||
Request *Request | ||
|
||
SchedulingRequest *schedulingtypes.LLMRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many duplicated fields in LLMRequest and RequestContext. Initially the LLMRequest was scoped to the scheduling package only.
Can we move LLMRequest out of scheduling package now it has wider scope? And consolidate duplicated fields such as the ResolvedTargetModel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is in conflict with some of the comments on #845 where the discussion went to the direction that scheduler shouldn't rely on structs outside of the scheduling package.
yes, I agree there are duplicate fields.
we should probably converge such that those fields are kept in scheduling request only and removed from RequestContext.
if it's not a hard issue from you PoV, I suggest to defer it to a follow up PR since this hasn't change in this PR (this was the situation also before this PR).
I like to keep PRs tightly scoped (the scope of this PR is just the move of PostResponse out of scheduler).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add the scheduling pkg type as a parameter here so that we don't duplicate the parameters.
package requestcontrol | ||
|
||
// Response contains information from the response received to be passed to PostResponse plugins | ||
type Response struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are mixising the schedulingtypes.LLMRequest
and requescontrol.Response
, I think we should at least be consistent.
I don't see a good reason why they need to differ. I think we can just consolidate them to a shared Request/Response objects, perhaps at the top level. If for any reason we need scheduler/director specific metadata they can be added to scheduler/director specific structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
package requestcontrol | ||
|
||
// Response contains information from the response received to be passed to PostResponse plugins | ||
type Response struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me now multiple packages need to access some common request/response
structs, and it makes sense to create a common types
package containing request/response
. In the case of a specific package needing to extend from the common types, e.g., in scheduling, you can do
package scheduling
type Request struct {
commontypes.Request
additionalFields
}
But I am OK with deferring this.
Since PostResponse will likely be implemented by the scheduling plugins (like prefix aware scheduling), I think we should continue to have its definition and execution handled by the scheduling pkg.
PostResponse is part of the scheduling pkg if we think about it as a callback for when the request it scheduled successfully executes. |
log.FromContext(ctx).V(logutil.DEBUG).Info("Running post-response plugin", "plugin", plugin.Name()) | ||
before := time.Now() | ||
plugin.PostResponse(ctx, request, response, targetPod) | ||
metrics.RecordRequestControlPluginProcessingLatency(PostResponsePluginType, plugin.Name(), time.Since(before)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This metric is tracking post response only, why is it called request_control_plugin_duration_seconds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR moves PostResponse plugin from Scheduler to Requestcontrol layer. it is the first plugin in this layer out of the ones that appear in the northstar doc.
more plugins in request control are expected to be added.
// WORKAROUND until PostResponse is out of Scheduler | ||
profileExecutionResults := map[string]*types.Result{} | ||
profiles := s.profilePicker.Pick(nil, s.profiles, profileExecutionResults) // all profiles | ||
for _, profile := range profiles { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't be running profiles on PostResponse, the profiles are defined for the plugins that get to run in the schedule call only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code was removed (and as comment states was a workaround until we get PostResponse out of scheduler)
@ahg-g hard disagree on the above. I think there are two valid approaches:
|
I am not sure how one can assert that it is completely out of scope when there is a concrete use case: prefix-aware scheduling. In this case, I am not thinking about PostResponse in the general sense of "handling response", but as a callback on the successful execution of the scheduled request to handle scheduling state maintained by the scheduling layer. Callbacks are a well established design pattern, and in cases like this, it make sense to have the definition of the callback in the layer that expects it.
Those two bullets seem to be one approach, which is what we should be doing in all cases; meaning all extension points that a plugin implements should be executable irrespective of which layer executes them. The main thing I was trying to address is ensuring that scheduling plugins remain encapsulated within the scheduling package. The question now is this actually important to maintain? |
@ahg-g I think the above statement is mixing things.
This is an implementation detail. One option to implement is using callback. the other option is to put extension point in the requestcontrol layer. I strongly prefer PostResponse in requestcontrol layer due to the reasons I specified above - response handling has nothing to do with scheduling package. this is also aligned with the northstar document and also with the previous discussions on #845 where we wrote (more than once) that PostResponse should NOT be part of the scheduler.
I'm not sure I get the point here. as of today plugins are under |
I am not looking at extension points in isolation, I am looking at it from the perspective of an end-to-end scheduling feature. So the use case has to do with scheduling state management that some scheduling features (like prefix-aware scheduling) require.
It makes sense to have PostResponse in the request control pkg, I am not disagreeing with that, but what I am trying to propose is that it also make sense to think of that extension point as a callback on a successful scheduling decision made by the scheduling layer. My thinking is that the latter interpretation of this extension point may be more relevant to us if most plugins that implement this extension point are primarily scheduling plugins. Another factor driving this other interpretation in my mind is to ensure the scheduling pkg and its plugins continue to be self contained to the extent that some other project could use it as is. I understand that this is not a priority shared by others though.
Placing the plugins in a separate pkg is indeed another option I was also thinking about, I held back from proposing it because I was hoping we could maintain the scheduling pkg as a self contained "library" as mentioned above. I don't want to hold this PR, so I am fine with this change as long as we have a principled approach to hosting plugins that implement extension points across layers. |
pkg/epp/requestcontrol/plugins.go
Outdated
type Plugin interface { | ||
// Name returns the name of the plugin. | ||
Name() string | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating a new base Plugin interface, the one in pkg/epp/scheduling/framework/plugin.go should be moved to perhaps pkg/epp/common/plugin.go.
This will help future configuration efforts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this, I am now thinking that we should have a separate plugins interface "pkg".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing. converged to a single Plugin interface under plugins pkg.
@ahg-g I think this is the best option. scheduling remains a self contained ”library”. having plugins implemented in a different package (e.g., in pkg/plugins) doesn’t break it. it’s kinda similar to having out of tree plugins. I’m also happy to converge on a single plugin interface that all layers will use, but strongly prefer to defer it to a follow up PR since I’d like to keep this one tightly scoped to moving PostSchedule outside of scheduler, while this change would include moving all plugins to a different package. The above change will also answer @shmuelk concern, and @elevran also mentioned it in one of his review comments. |
I was only talking about the base interface, not all of the other scheduler plugins. |
Referencing other networking systems with extension points (e.g., web proxies such as nginx or httpd) - they define a set of hook points and a module can implement/register for any that they need. There is no attempt to group them into a scoped collection by "phase". |
Right, but let me iterate one point that influenced this line of thinking: our mental model was to try and have a self contained scheduling library that can be used by other projects. This was my attempt to ensure that we continue to have this optionality. |
I am on board with this, but we need to find a reasonable structure to ensure it is easy to navigate the plugins, I would still consider grouping them somehow (for example: scheduling, flow control and data). Also, if we are going this direction, one thing to consider is to pull all definitions of the extensions interfaces, including the input/out types, into a common "plugins interface pkg". With this, the plugins implementation should have a single dependency, the plugins interface pkg only, wdyt? Perhaps we can have this discussion in the proposal PR. |
understood. I do wonder if that line of thought won't start pulling "pre-scheduling" (e.g., routing, flow control) into the scheduling the lib as well (same arguments can be made). |
This could be solved via directories ( |
this is an optional direction indeed. as @elevran mentioned, in this point in time all plugins are scheduling related, but thinking forward that will not be the case and we need to think from now how do we organize the plugins.
of course any structuring that we agree on is fine as long as it is documented. I also agree with @elevran following statement:
I think the scope of responsibility of each layer should be very well defined. and as I stated in one of the previous comments, when I'm trying to define what is the scope of the scheduler my thinking is something like - “Scheduler is responsible for selecting the best endpoint(s) for serving, given a request and the endpoints state.” I'm happy to converge on this topic in the proposal PR (although not exactly pure scheduler but more of general pluggability structuring). sounds to me like we're all are in alignment that having a |
Signed-off-by: Nir Rozenbaum <[email protected]>
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, nirrozenbaum The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ernetes-sigs#914) * move PostResponse plugins to requestcontrol instead of scheduler Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * fixed typo raised by elevran in code review Signed-off-by: Nir Rozenbaum <[email protected]> * added general Plugin interface in requestcontrol layer Signed-off-by: Nir Rozenbaum <[email protected]> * removed LLMResponse from scheduler Signed-off-by: Nir Rozenbaum <[email protected]> * added metric for request-control plugin and fixed a copy paste typo when recording plugin time Signed-off-by: Nir Rozenbaum <[email protected]> * converged to a single plugins interface based on code review Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]>
…ernetes-sigs#914) * move PostResponse plugins to requestcontrol instead of scheduler Signed-off-by: Nir Rozenbaum <[email protected]> * typo Signed-off-by: Nir Rozenbaum <[email protected]> * fixed typo raised by elevran in code review Signed-off-by: Nir Rozenbaum <[email protected]> * added general Plugin interface in requestcontrol layer Signed-off-by: Nir Rozenbaum <[email protected]> * removed LLMResponse from scheduler Signed-off-by: Nir Rozenbaum <[email protected]> * added metric for request-control plugin and fixed a copy paste typo when recording plugin time Signed-off-by: Nir Rozenbaum <[email protected]> * converged to a single plugins interface based on code review Signed-off-by: Nir Rozenbaum <[email protected]> --------- Signed-off-by: Nir Rozenbaum <[email protected]>
This PR moves PostResponse plugins out of scheduler and into requestcontrol layer (more specifically into the director).
several code changes were done as part of this change:
main.go
in order to allow easy setup of plugins as we have with scheduler. a plugin should be able to register to both scheduler plugin and post response plugin in main.WithPostResponsePlugins
function to allow easy setup of the PostResponsePlugins.gateway-api-inference-extension/pkg/epp/requestcontrol/director_test.go
Line 525 in ce5bae2
in a follow up PR, we should update the prefix plugin to use
PostResponse
instead ofPostCycle
extension point and then remove completely the PostCycle.