Status: Closed
Labels: bug (Something isn't working)
Description
Sorry for not submitting a PR, just wanted to document this!
First issue:
- Issue #680 (BedrockEmbeddings support for Cohere Embed v4) describes that the langchain-aws batch embedding functions for Cohere models didn't support the new output schema of Cohere Embed v4.
- This was fixed in PR #681 (Cohere Embed v4 schema), but that change did not apply the same fix to the non-batch embedding function for Cohere models. See: https://github.com/langchain-ai/langchain-aws/blob/main/libs/aws/langchain_aws/embeddings/bedrock.py#L192
- Right now users can't use Cohere Embed v4 for normal non-batch inputs.
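A minimal sketch of what the non-batch fix could look like, mirroring how #681 handled the batch path. The `extract_embedding` helper name and the exact v4 response shape shown here are assumptions for illustration, not code from langchain-aws:

```python
def extract_embedding(response_body: dict) -> list[float]:
    """Return the first embedding from a Cohere Bedrock response.

    Embed v3 returns a flat list: {"embeddings": [[...]]}.
    Embed v4 nests the vectors under an output-type key,
    e.g. {"embeddings": {"float": [[...]]}} (assumed shape).
    """
    embeddings = response_body["embeddings"]
    if isinstance(embeddings, dict):  # Embed v4-style schema
        embeddings = embeddings["float"]
    return embeddings[0]
```

Branching on the shape of `embeddings` keeps a single code path working for both model versions, which is presumably why the batch fix in #681 took the same approach.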
Second issue:
- In https://github.com/langchain-ai/langchain-aws/blob/main/libs/aws/langchain_aws/embeddings/bedrock.py#L357, Cohere embedding models have their input sizes limited to no more than 2048 characters, as expected by Cohere Embed v3.
- Cohere Embed v4 seems to support a maximum context length of 128K tokens for each text input. That's a significant increase from v3, and allows for the embedding of entire documents without requiring custom chunking logic. See: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-embed-v4.html
- Right now users can't benefit from that extended input length, because the old v3 limit still applies.
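One hedged way to fix this would be to make the truncation limit model-dependent instead of hard-coding the v3 cap. The function name and the v4 character limit below are illustrative assumptions (128K tokens doesn't map cleanly to a character count), not values from langchain-aws:

```python
# Existing hard-coded cap in bedrock.py, written for Embed v3.
V3_MAX_CHARS = 2048
# Placeholder for a much larger v4 cap; the real value would need to be
# derived from the 128K-token limit documented for Embed v4.
V4_MAX_CHARS = 512_000


def truncate_for_model(text: str, model_id: str) -> str:
    """Truncate input text to the limit appropriate for the Cohere model."""
    limit = V4_MAX_CHARS if "embed-v4" in model_id else V3_MAX_CHARS
    return text[:limit]
```

Dispatching on the model id keeps the existing v3 behavior intact while letting v4 users pass in whole documents.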