test chat completions api in e2e case #868

delavet · 2025-05-23T02:34:43Z

Adds /chat/completions API e2e test case.

Fixes: #814

netlify · 2025-05-23T02:34:48Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`451fea1`
🔍 Latest deploy log	https://app.netlify.com/projects/gateway-api-inference-extension/deploys/683e664a2113f9000855449b
😎 Deploy Preview	https://deploy-preview-868--gateway-api-inference-extension.netlify.app

To edit notification comments on pull requests, go to your Netlify project configuration.

k8s-ci-robot · 2025-05-23T02:34:53Z

Hi @delavet. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

nirrozenbaum · 2025-05-23T08:36:30Z

/ok-to-test

nirrozenbaum

@delavet does it make sense to split the different tests into separate ginkgo.It statements?
I'd expect some parts to move to the BeforeEach section, e.g., this part:

ginkgo.By("Creating an InferenceModel resource")
infModel := newInferenceModel(nsName)
gomega.Expect(cli.Create(ctx, infModel)).To(gomega.Succeed())

ginkgo.By("Ensuring the InferenceModel resource exists in the namespace")
gomega.Eventually(func() error {
	return cli.Get(ctx, types.NamespacedName{Namespace: infModel.Namespace, Name: infModel.Name}, infModel)
}, existsTimeout, interval).Should(gomega.Succeed())

I'd also expect some helper functions to extract the common code.
I think the e2e tests should include the following, each in its own ginkgo.It statement:

completions with a prompt
chat completions with a single [role, message] entry
chat completions with multiple [role, message] entries
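
To make the three suggested cases concrete, here is a minimal Go sketch of the request bodies each one might send. The type names, the buildBody helper, and the model name "my-model" are hypothetical and not taken from the PR; the field names are assumed to follow the common OpenAI-style request shape for completions and chat completions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is one chat turn (hypothetical type for illustration).
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// completionBody is an assumed /completions request body.
type completionBody struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

// chatBody is an assumed /chat/completions request body.
type chatBody struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// buildBody returns a completions body when msgs is empty,
// and a chat completions body otherwise.
func buildBody(model, prompt string, msgs []message) (string, error) {
	var v interface{}
	if len(msgs) == 0 {
		v = completionBody{Model: model, Prompt: prompt}
	} else {
		v = chatBody{Model: model, Messages: msgs}
	}
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	prompt := "Write as if you were a critic: San Francisco"
	single := []message{{Role: "user", Content: prompt}}
	multi := append(append([]message{}, single...),
		message{Role: "assistant", Content: "Okay, let's see..."},
		message{Role: "user", Content: "Now summarize your thoughts."},
	)
	// One body per suggested test case: prompt, single message, multiple messages.
	for _, msgs := range [][]message{nil, single, multi} {
		body, err := buildBody("my-model", prompt, msgs)
		if err != nil {
			panic(err)
		}
		fmt.Println(body)
	}
}
```

A shared helper like this is one way to keep the per-case ginkgo.It bodies short; the real tests may structure it differently.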

delavet · 2025-05-28T11:28:32Z

This does make more sense! I will try refactoring this ASAP.

danehans · 2025-05-30T23:08:12Z

test/e2e/epp/e2e_test.go


 			ginkgo.By("Verifying connectivity through the inference extension")
-			curlCmd := getCurlCommand(envoyName, nsName, envoyPort, modelName, curlTimeout)
+			for _, testApi := range []string{"/completions", "/chat/completions"} {


@delavet does it make sense to separate different tests into different ginkgo.It statements?

Another option is to move the ginkgo.By() statement inside the range loop and include the name of the API endpoint, so that a user can see each endpoint being tested.
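
As a rough illustration of this suggestion, the sketch below builds one step description per endpoint inside the loop. stepDescriptions is a hypothetical helper that uses only the standard library; in the actual test each string would be passed to ginkgo.By() rather than collected into a slice.

```go
package main

import "fmt"

// stepDescriptions returns one By-style description per API endpoint,
// so the test log names every endpoint that is exercised.
func stepDescriptions(apis []string) []string {
	steps := make([]string, 0, len(apis))
	for _, api := range apis {
		steps = append(steps, fmt.Sprintf(
			"Verifying connectivity through the inference extension with %s api", api))
	}
	return steps
}

func main() {
	for _, s := range stepDescriptions([]string{"/completions", "/chat/completions"}) {
		fmt.Println(s)
	}
}
```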

delavet · 2025-05-31T10:15:46Z

@nirrozenbaum @danehans I have refined this PR. The e2e tests now cover both /completions and /chat/completions. The /chat/completions tests cover two scenarios: single-turn chat and multi-turn chat. The single-turn chat reuses the prompt from /completions, "Write as if you were a critic: San Francisco". As for organizing the tests, I decided to use a single It block that runs multiple By steps in a loop, which seems more semantically coherent.
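
The loop described above could look roughly like the following table-driven sketch. The testCase type, the cases and curlArgs helpers, the host name, the port, the "/v1" path prefix, and the model name are all hypothetical; the PR's real helper is getCurlCommand, whose updated signature is not shown in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// testCase pairs an API endpoint with the JSON body to send to it.
type testCase struct {
	api  string // endpoint path, e.g. "/completions"
	body string // JSON request body
}

// cases returns the three scenarios: a prompt, a single-turn chat,
// and a multi-turn chat. %q quoting is adequate here only because
// the strings are plain ASCII without special characters.
func cases(model string) []testCase {
	prompt := "Write as if you were a critic: San Francisco"
	return []testCase{
		{"/completions", fmt.Sprintf(`{"model": %q, "prompt": %q}`, model, prompt)},
		{"/chat/completions", fmt.Sprintf(
			`{"model": %q, "messages": [{"role": "user", "content": %q}]}`, model, prompt)},
		{"/chat/completions", fmt.Sprintf(
			`{"model": %q, "messages": [{"role": "user", "content": %q},`+
				`{"role": "assistant", "content": "Okay, let's see..."},`+
				`{"role": "user", "content": "Now summarize your thoughts."}]}`, model, prompt)},
	}
}

// curlArgs builds a curl argument list for one case (assumed URL layout).
func curlArgs(host string, port int, api, body string, timeoutSec int) []string {
	return []string{
		"curl", "-i",
		"--max-time", strconv.Itoa(timeoutSec),
		fmt.Sprintf("%s:%d/v1%s", host, port, api),
		"-H", "Content-Type: application/json",
		"-d", body,
	}
}

func main() {
	// In the real test, each iteration would be wrapped in a ginkgo.By step.
	for _, tc := range cases("my-model") {
		fmt.Println(strings.Join(curlArgs("envoy.example", 8081, tc.api, tc.body, 30), " "))
	}
}
```

Keeping the cases in one slice means a new endpoint or conversation shape only needs one more table entry.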

danehans · 2025-06-02T16:10:39Z

@delavet CI is failing due to #901. Please rebase to include #902.

nirrozenbaum · 2025-06-02T16:26:14Z

/retest

danehans · 2025-06-03T16:10:15Z

Changes verified:

$ make test-e2e
...
  STEP: Creating an InferenceModel resource @ 06/03/25 09:04:06.069
  STEP: Ensuring the InferenceModel resource exists in the namespace @ 06/03/25 09:04:06.074
  STEP: Verifying connectivity through the inference extension with /completions api and prompt/messages: Write as if you were a critic: San Francisco @ 06/03/25 09:04:06.082
  STEP: Verifying connectivity through the inference extension with /chat/completions api and prompt/messages: [{"role": "user", "content": "Write as if you were a critic: San Francisco"}] @ 06/03/25 09:04:06.275
  STEP: Verifying connectivity through the inference extension with /chat/completions api and prompt/messages: [{"role": "user", "content": "Write as if you were a critic: San Francisco"},{"role": "assistant", "content": "Okay, let's see..."},{"role": "user", "content": "Now summarize your thoughts."}] @ 06/03/25 09:04:06.369
  STEP: Deleting the InferenceModel test resource. @ 06/03/25 09:04:06.454
• [0.394 seconds]
------------------------------
[AfterSuite]
/Users/solo-system-dhansen/go/src/sigs.k8s.io/gateway-api-inference-extension/test/e2e/epp/e2e_suite_test.go:140
  STEP: Performing global cleanup @ 06/03/25 09:04:06.46
[AfterSuite] PASSED [0.203 seconds]
------------------------------

Ran 1 of 1 Specs in 79.008 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestAPIs (79.01s)
PASS
ok  	sigs.k8s.io/gateway-api-inference-extension/test/e2e/epp	79.036s

/lgtm
/approve

Thanks @delavet for your help!

k8s-ci-robot · 2025-06-03T16:10:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, delavet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [danehans]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 23, 2025

k8s-ci-robot requested review from liu-cong and robscott May 23, 2025 02:34

k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 23, 2025

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 23, 2025

k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 23, 2025

nirrozenbaum reviewed May 27, 2025

danehans reviewed May 30, 2025

delavet force-pushed the add-chat-e2e branch from 963d9a5 to 27cfb35 Compare May 31, 2025 09:11

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 31, 2025

delavet force-pushed the add-chat-e2e branch from 27cfb35 to f88626f Compare May 31, 2025 10:08

delavet requested review from danehans and nirrozenbaum May 31, 2025 10:16

add e2e test case for chat completions

451fea1

Signed-off-by: Hang Yin <[email protected]>

delavet force-pushed the add-chat-e2e branch from f88626f to 451fea1 Compare June 3, 2025 03:04

k8s-ci-robot assigned danehans Jun 3, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 3, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 3, 2025

k8s-ci-robot merged commit ede62e6 into kubernetes-sigs:main Jun 3, 2025
8 checks passed

rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Jun 11, 2025

add e2e test case for chat completions (kubernetes-sigs#868)

ce315a9

Signed-off-by: Hang Yin <[email protected]>
