The client-side tokenization in guidellm fails to account for the extra tokens added by the server's chat prompt template. There are two possible workarounds (see the sketch after this list):

1. Enable usage metrics in each request and let the server report how many prompt tokens it actually processed.
2. Use the `/completions` endpoint rather than `/chat/completions`, as the chat template is not applied on the `/completions` endpoint.
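As a rough illustration of both workarounds (not guidellm's actual client code), the sketch below sends requests directly to an OpenAI-compatible server such as vLLM; the base URL and model name are placeholders. Non-streaming responses from both endpoints include a `usage` object whose `prompt_tokens` field counts the tokens the server actually saw, including any chat-template tokens; for streaming requests the usage block generally has to be requested explicitly (e.g. via `stream_options: {"include_usage": true}`).

```python
# Sketch only: assumes an OpenAI-compatible server (e.g. vLLM) is running
# at BASE_URL and serving MODEL. Both values are placeholders.
import requests

BASE_URL = "http://localhost:8000/v1"  # assumed server address
MODEL = "my-model"                     # placeholder model name

# Workaround 1: ask /chat/completions and trust the server's usage metrics.
# usage.prompt_tokens already includes the tokens added by the chat template,
# so no client-side correction is needed.
chat_resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 16,
    },
).json()
print("server-reported prompt tokens:", chat_resp["usage"]["prompt_tokens"])

# Workaround 2: use /completions with a raw prompt. No chat template is
# applied, so a client-side token count of the prompt string should match
# what the server reports.
comp_resp = requests.post(
    f"{BASE_URL}/completions",
    json={
        "model": MODEL,
        "prompt": "Hello",
        "max_tokens": 16,
    },
).json()
print("server-reported prompt tokens:", comp_resp["usage"]["prompt_tokens"])
```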
#91 contains fixes for both. There will be some follow-up work after that to give the user further configuration over the token calculations and over which endpoint requests are sent to.