-
Notifications
You must be signed in to change notification settings - Fork 37
add support to multimodal in chat completions #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Juanma Barea <[email protected]>
Signed-off-by: Juanma Barea <[email protected]>
|
Hi @JuanmaBM , thanks for you PR. Code: In the code the image_url field is a string instead of an object with the 'url' field. |
|
You're right, @mayabar. However, I don't believe it's necessary to add it as an object since the simulator only provides random/echo responses and doesn't utilize the image_url data. I'm happy to modify it if you feel it's essential to meet the specification :) |
|
Thanks for your comment, @JuanmaBM , in the simulator we want to be compatible with OpenAI API specification, at this stage we don't support all fields in requests data structures, currently, our goal is to support a sub-set of the OpenAI API, but we don't want to add data structure's fields that are not defined in OpenAI API or overload existing fields. This will allow the simulator clients to be able easily to switch to real vLLM |
|
Okay, @mayabar, I've been pretty busy lately, but I'll try to update the PR tomorrow. |
Signed-off-by: Juanma Barea <[email protected]>
|
@mayabar Done. I was thinking that a schema validator might be useful for the simulator. It could be a desirable feature to verify whether an agent or AI client supports the OpenAI specification. I was considering adding a --schema-validate flag to the simulator parameters. If you think this feature would be useful, I can implement it. |
|
Hi @JuanmaBM , thank you for the update. Regarding the schema validation - the request body structure is validated by marshaling to an appropriate structure. |
| var sb strings.Builder | ||
| for _, block := range mc.Structured { | ||
| if block.Type == "text" { | ||
| sb.WriteString(block.Text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add here separator (space), this will allow better tokens number calculation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mayabar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix lint failure
Signed-off-by: Juanma Barea <[email protected]>
Signed-off-by: Juanma Barea <[email protected]>
shmuelk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
This PR adds support for OpenAI-style multimodal request formats, allowing the simulator to correctly parse messages where the content field is an array of typed blocks (text, image_url, etc.) instead of a plain string.
Key Changes:
This enables compatibility with frameworks like Llama Stack and other OpenAI v2 clients using structured content.
Fixes: Multimodal Requests Not Supported