Update Envoy config with longer timeouts #24

Closed
liu-cong opened this issue Oct 21, 2024 · 3 comments · Fixed by #39
Comments

@liu-cong
Contributor

liu-cong commented Oct 21, 2024

During benchmarking, I noticed that requests take longer as we increase QPS, and I started to see 500 errors from Envoy. @kaushikmitr had some timeout settings in the PoC that reduce such errors significantly. This task is to update the Envoy config with proper timeout settings and document them.
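
For reference, a minimal sketch of where the timeout knobs live in an Envoy config. This is not the PoC config or the change that landed in #39; the listener name, cluster name, port, and every timeout value are placeholders chosen for illustration. It assumes a plain HTTP connection manager with a single catch-all route, and it omits the cluster definition.

```yaml
# Illustrative fragment only: names and values are hypothetical,
# not taken from the PoC or from #39.
static_resources:
  listeners:
  - name: inference_listener            # hypothetical listener name
    address:
      socket_address: { address: 0.0.0.0, port_value: 8081 }  # placeholder port
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          # Maximum time a stream may sit with no activity (Envoy default: 5 minutes).
          stream_idle_timeout: 600s
          route_config:
            virtual_hosts:
            - name: model_servers        # hypothetical virtual host
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route:
                  cluster: model_servers # hypothetical cluster, defined elsewhere
                  # Time allowed for the full upstream response (Envoy default: 15s).
                  timeout: 600s
                  # Per-route override of the stream idle timeout.
                  idle_timeout: 600s
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Which of these settings actually removes the 500s under load would need to be confirmed against the benchmark; the fragment only shows where each timeout is set.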

### Tasks
@liu-cong
Contributor Author

@kaushikmitr maybe you can take this?

@kaushikmitr
Contributor

Sure, I can take this up.

@liu-cong
Contributor Author

Closed by #39.

shaneutt added a commit to shaneutt/gateway-api-inference-extension that referenced this issue Apr 18, 2025
Add inference model and pool yamls