Changes to InferenceModel Should Trigger EndpointSlice Reconciliation #151

@danehans

The data store is not populated with the required Pod details when the InferenceModel and InferencePool CRs are created after the EPP has started. With neither CR present, the EPP logs:

Skipping reconciling EndpointSlice because the InferencePool is not available yet: InferencePool hasn't been initialized yet
...
===DEBUG: Current Pods and metrics: []

Add the example InferenceModel and InferencePool CRs:

reconciling InferencePooldefault/vllm-llama2-7b-pool
reconciling InferenceModeldefault/inferencemodel-sample
Incoming pool ref {inference.networking.x-k8s.io InferencePool vllm-llama2-7b-pool}, server pool name: vllm-llama2-7b-pool
Adding/Updating inference model: tweet-summary
===DEBUG: Current Pods and metrics: []

Recreate the EndpointSlice for the example Service, and only then does the data store reflect the required Pod details:

I0106 17:05:36.413276       1 endpointslice_reconciler.go:34] Reconciling EndpointSlice default/vllm-llama2-7b-pool-xvkcm
I0106 17:05:36.545209       1 provider.go:92] ===DEBUG: Current Pods and metrics: [Pod: vllm-llama2-7b-pool-59d86b6c85-ktsw7:10.244.0.16:8000; Metrics: {ActiveModels:map[] MaxActiveModels:0 RunningQueueSize:0 WaitingQueueSize:0 KVCacheUsagePercent:0 KvCacheMaxTokenCapacity:0}]
...

EndpointSlice reconciliation should be triggered whenever an InferencePool CRUD operation occurs, because the EndpointSlice reconciler manages the internal Pod state, and that state depends on InferencePool details such as targetPortNumber.
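
To illustrate the requested behavior, here is a minimal, standard-library-only sketch. It is not the EPP's actual code: `Pool`, `Slice`, `Datastore`, and their methods are hypothetical stand-ins, and the `slices` map only approximates an informer cache. The point is that a pool event re-runs the Pod rebuild for every known slice, so the order in which the resources arrive no longer matters.

```go
package main

import (
	"fmt"
	"sort"
)

// Pool holds the InferencePool fields that Pod state depends on (hypothetical type).
type Pool struct {
	Name             string
	TargetPortNumber int
}

// Slice is a simplified stand-in for an EndpointSlice: a name plus ready Pod IPs.
type Slice struct {
	Name   string
	PodIPs []string
}

// Datastore remembers the slices it has seen so that Pods can be rebuilt
// when the pool appears or changes (hypothetical type).
type Datastore struct {
	pool   *Pool
	slices map[string]Slice
	pods   map[string]bool // keyed by "ip:port"
}

func NewDatastore() *Datastore {
	return &Datastore{slices: map[string]Slice{}, pods: map[string]bool{}}
}

// ReconcileSlice records the slice, then rebuilds Pod addresses.
func (d *Datastore) ReconcileSlice(s Slice) {
	d.slices[s.Name] = s
	d.rebuildPods()
}

// ReconcilePool stores the pool and re-triggers reconciliation of every known slice.
func (d *Datastore) ReconcilePool(p Pool) {
	d.pool = &p
	d.rebuildPods()
}

// rebuildPods derives "ip:port" entries from the known slices and the pool's target port.
func (d *Datastore) rebuildPods() {
	d.pods = map[string]bool{}
	if d.pool == nil {
		return // pool not initialized yet: nothing can be derived
	}
	for _, s := range d.slices {
		for _, ip := range s.PodIPs {
			d.pods[fmt.Sprintf("%s:%d", ip, d.pool.TargetPortNumber)] = true
		}
	}
}

// Pods returns the current Pod addresses in sorted order.
func (d *Datastore) Pods() []string {
	out := make([]string, 0, len(d.pods))
	for addr := range d.pods {
		out = append(out, addr)
	}
	sort.Strings(out)
	return out
}

func main() {
	ds := NewDatastore()
	// The EndpointSlice arrives first, as in the report above.
	ds.ReconcileSlice(Slice{Name: "vllm-llama2-7b-pool-xvkcm", PodIPs: []string{"10.244.0.16"}})
	fmt.Println(ds.Pods()) // [] because no pool is known yet
	// The pool arrives later and triggers the rebuild.
	ds.ReconcilePool(Pool{Name: "vllm-llama2-7b-pool", TargetPortNumber: 8000})
	fmt.Println(ds.Pods()) // [10.244.0.16:8000]
}
```

In a controller-runtime based controller, a comparable effect could likely be achieved by watching InferencePool objects and mapping each event to reconcile requests for the matching EndpointSlices (for example with `handler.EnqueueRequestsFromMapFunc`), rather than caching slices in the data store. Which approach fits the EPP better is left open here.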
