-
Notifications
You must be signed in to change notification settings - Fork 82
Migrate the llm-d-inference-scheduler's configuration to the new text based configuration #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Shmuel Kallner <[email protected]>
…profile names Signed-off-by: Shmuel Kallner <[email protected]>
…ode profile name Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
deploy/config/epp-config.yaml
Outdated
hashBlockSize: 5 | ||
maxPrefixBlocksToMatch: 256 | ||
lruCapacityPerServer: 31250 | ||
- name: decodeFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, remove the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
deploy/config/epp-config.yaml
Outdated
apiVersion: inference.networking.x-k8s.io/v1alpha1 | ||
kind: EndpointPickerConfig | ||
plugins: | ||
- name: prefixScorer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the name and rely on the default value just like the other plugins at the end of this config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mountPath: /etc/epp | ||
volumes: | ||
- name: epp-config | ||
configMap: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we place common canned configurations in the container image just so we don't have to create a configmap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As these are for development scripts, I think it's more important to make it easy to use other configurations as the developer desires.
kind: EndpointPickerConfig | ||
plugins: | ||
- type: single-profile | ||
- name: decodeFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, this looks redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
docs/architecture.md
Outdated
``` | ||
The fields in a plugin entry are: | ||
- *name* which is optional, provides a name by which the plugin instance can be referenced. If this field is | ||
omitted, the plugin's type wil be used as its name.<br> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
omitted, the plugin's type wil be used as its name.<br> | |
omitted, the plugin's type will be used as its name.<br> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
docs/architecture.md
Outdated
- name: prefixScorer | ||
type: prefix-cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we converge on one naming convention, I like using pascal case (e.g., PrefixScorer
).
Also, I think the cases where a name needs to be set will be limited, so I recommend to not set it unless required to not give the impression that this is required (which looks strange and redundant at first glance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created a discussion issue in GIE: kubernetes-sigs/gateway-api-inference-extension#1086
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While there should be a naming convention for plugin types, I don't think that applies as much for plugin names, when used. In particular I made the plugin names, which have been since removed, camel case to show that they are different from the plugin types and the defaulted names.
decode = "decode" | ||
prefill = "prefill" | ||
defaultDecodeProfile = "decode" | ||
defaultPßrefillProfile = "prefill" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
docs/architecture.md
Outdated
specified in-line as a parameter. The configuration defines the set of plugins to be instantiated along with their parameters. Each plugin is also given a name, enabling the same plugin type to be instantiated | ||
multiple times, if needed. Also defined is a set of SchedulingProfiles, which determine the set of | ||
plugins to be used when scheduling a request. The set of plugins instantiated must also include a | ||
Profile Handler, which determines which SchedulingProfiles will be used for a particular request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ and how their results will be processed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
field is omitted, the plugin's type will be used as its name.<br> | ||
- *type* specifies the type of the plugin to be instantiated.<br> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need <br>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't format without all of the <br> in there
Using unordered lists or blank lines make the sections much longer and harder to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you tried double space at the end of the line?
I think this should give you the same behavior, just without html tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like trailing spaces, which are unseen and easily get lost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall PR looks very good.
I left few nits, non blocking.
verifying - was this tested e2e?
/lgtm
/approve
Which end to end tests? I have used |
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: Shmuel Kallner <[email protected]>
Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com> Co-authored-by: konflux-internal-p02[bot] <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
This PR:
This PR completes issue #201