Closed
Description
The data store is not populated with the required Pod details when the InferenceModel and InferencePool CRs are added after EPP is started:
Skipping reconciling EndpointSlice because the InferencePool is not available yet: InferencePool hasn't been initialized yet
...
===DEBUG: Current Pods and metrics: []
Add the example InferenceModel and InferencePool CRs:
reconciling InferencePooldefault/vllm-llama2-7b-pool
reconciling InferenceModeldefault/inferencemodel-sample
Incoming pool ref {inference.networking.x-k8s.io InferencePool vllm-llama2-7b-pool}, server pool name: vllm-llama2-7b-pool
Adding/Updating inference model: tweet-summary
===DEBUG: Current Pods and metrics: []
Recreate the EndpointSlice for the example service and the data store reflects the required Pod details:
I0106 17:05:36.413276 1 endpointslice_reconciler.go:34] Reconciling EndpointSlice default/vllm-llama2-7b-pool-xvkcm
I0106 17:05:36.545209 1 provider.go:92] ===DEBUG: Current Pods and metrics: [Pod: vllm-llama2-7b-pool-59d86b6c85-ktsw7:10.244.0.16:8000; Metrics: {ActiveModels:map[] MaxActiveModels:0 RunningQueueSize:0 WaitingQueueSize:0 KVCacheUsagePercent:0 KvCacheMaxTokenCapacity:0}]
...
EndpointSlice reconciliation should be triggered whenever an InferencePool CRUD operation occurs, since it manages the internal Pod state, which depends on InferencePool details, e.g. targetPortNumber.
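To illustrate the ordering problem and the proposed behavior, here is a minimal, standard-library-only Go model. It is a sketch under stated assumptions, not the project's code: the `Pool`, `Slice`, and `Datastore` types and their methods are hypothetical stand-ins for the InferencePool, the EndpointSlice, and the EPP data store. The point it shows is that a pool change re-runs the EndpointSlice processing, so Pods seen before the pool existed are picked up once the pool arrives.

```go
package main

import (
	"fmt"
	"sort"
)

// Pool is a hypothetical stand-in for the InferencePool fields the data store needs.
type Pool struct {
	Name             string
	TargetPortNumber int
}

// Slice is a hypothetical stand-in for an EndpointSlice: Pod name -> Pod IP.
type Slice struct {
	Name      string
	Endpoints map[string]string
}

// Datastore models the EPP data store: the pool plus the Pod addresses derived from it.
type Datastore struct {
	pool   *Pool
	slices map[string]Slice  // last seen slices, remembered even while no pool exists
	pods   map[string]string // Pod name -> "ip:port"
}

func NewDatastore() *Datastore {
	return &Datastore{slices: map[string]Slice{}, pods: map[string]string{}}
}

// ReconcileSlice records the slice and refreshes the Pod list if a pool is known.
func (d *Datastore) ReconcileSlice(s Slice) {
	d.slices[s.Name] = s
	d.rebuildPods()
}

// ReconcilePool stores the pool and re-triggers slice processing, which is
// the behavior requested in this issue.
func (d *Datastore) ReconcilePool(p Pool) {
	d.pool = &p
	d.rebuildPods()
}

// rebuildPods derives Pod addresses from the remembered slices and the pool's port.
func (d *Datastore) rebuildPods() {
	d.pods = map[string]string{}
	if d.pool == nil {
		return // mirrors "InferencePool hasn't been initialized yet"
	}
	for _, s := range d.slices {
		for name, ip := range s.Endpoints {
			d.pods[name] = fmt.Sprintf("%s:%d", ip, d.pool.TargetPortNumber)
		}
	}
}

// Pods returns "name:ip:port" entries in sorted order.
func (d *Datastore) Pods() []string {
	out := make([]string, 0, len(d.pods))
	for name, addr := range d.pods {
		out = append(out, name+":"+addr)
	}
	sort.Strings(out)
	return out
}

func main() {
	ds := NewDatastore()
	// The EndpointSlice event arrives first, before any InferencePool exists.
	ds.ReconcileSlice(Slice{
		Name:      "vllm-llama2-7b-pool-xvkcm",
		Endpoints: map[string]string{"vllm-llama2-7b-pool-59d86b6c85-ktsw7": "10.244.0.16"},
	})
	fmt.Println(ds.Pods()) // prints []

	// Adding the pool re-runs slice processing, so the Pod appears without
	// recreating the EndpointSlice.
	ds.ReconcilePool(Pool{Name: "vllm-llama2-7b-pool", TargetPortNumber: 8000})
	fmt.Println(ds.Pods()) // prints [vllm-llama2-7b-pool-59d86b6c85-ktsw7:10.244.0.16:8000]
}
```

In a controller-runtime based controller, one way to get a comparable effect might be to have the EndpointSlice controller also watch InferencePool objects and map each pool event to reconcile requests for the related EndpointSlices (for example via `handler.EnqueueRequestsFromMapFunc`). Whether that fits this project's reconciler layout is an assumption here.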
MaYuan-02