Support image output #1130
@Kludex Are you planning to work on this or are we better off closing it for now? |
|
This is still on my radar, so I'd prefer to keep it open. |
|
hi, would love access to this feature, is there an ETA? |
|
I think something like this will be necessary sooner rather than later. Many models can/will generate interleaved multimodal content. Slightly philosophical question, but why are the output types of an LLM different from those of ToolCall? |
@lshamis Because the types of data LLMs support as input (whether that's via the user prompt or as a tool call result) are not the same as the types of data they can output. For example, all models support text input and text output, and many support image, video, audio, and document input, but only a handful support image output, and as far as I know none can output e.g. PDF files. So there's necessarily a difference between the types of things we allow tools to output (which is anything that can be sent back to the model as input) and what models themselves can output. |
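To make that distinction concrete, here is a minimal sketch, assuming made-up type names (`Text`, `Image`, `Document`, `ModelInput`, `ModelOutput` are all hypothetical and not the library's actual classes). It only illustrates that the set of things a tool may return (anything valid as model input) can be wider than the set of things a model may emit.

```python
from dataclasses import dataclass
from typing import Union, get_args


# Hypothetical content types, for illustration only.
@dataclass
class Text:
    content: str


@dataclass
class Image:
    data: bytes
    media_type: str = "image/png"


@dataclass
class Document:
    data: bytes
    media_type: str = "application/pdf"


# A tool result is sent back to the model as input, so it may use any input type.
ModelInput = Union[Text, Image, Document]

# Assumption for this sketch: models emit text and (for a few models) images,
# but never documents such as PDFs.
ModelOutput = Union[Text, Image]


def is_valid_tool_result(part: object) -> bool:
    """A tool may return anything that is acceptable as model input."""
    return isinstance(part, get_args(ModelInput))


def is_valid_model_output(part: object) -> bool:
    """A model response may only contain the narrower output types."""
    return isinstance(part, get_args(ModelOutput))


pdf = Document(data=b"%PDF-1.7")
print(is_valid_tool_result(pdf))   # a tool can hand a PDF back to the model
print(is_valid_model_output(pdf))  # but a model response cannot contain one
```

Keeping the two unions separate means a type checker can reject a model response that claims to contain a `Document`, while still letting tools return one.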
|
@dorukgezici Much appreciated! However, having just spent some time looking into https://platform.openai.com/docs/guides/tools-image-generation and https://ai.google.dev/gemini-api/docs/image-generation, I think this would take me a few hours to implement in a clean way, and someone less familiar with our architecture a lot longer than that to get it to work and then get it through the review cycle 😅 So I'm gonna have a crack at this tomorrow or on my flight to our team offsite on Sunday :) |
|
Closing in favor of #2970 |
Still a lot to do, and decide... It's still not type safe, and can't use message_history properly. The main.py in the files already works tho.