Fix Issue #211 : Improved Embedding Performance by Handling Base64 Encoding #303
Overview
This commit includes the fix described in Issue #211.
Detail
This pull request introduces several changes to the Embedding class and related components in the openai-java-core package. The primary goal is to improve the handling of embedding vectors by supporting both float lists and Base64-encoded strings. The most important changes are the introduction of the EmbeddingValue class, modifications to the Embedding class to use EmbeddingValue, and updates to the deserialization logic.

Enhancements to embedding handling:
openai-java-core/src/main/kotlin/com/openai/models/embeddings/Embedding.kt: Modified the Embedding class to use EmbeddingValue instead of List<Double> for embedding vectors. This includes changes to the constructor, builder, and relevant methods.

Introduction of EmbeddingValue class:
openai-java-core/src/main/kotlin/com/openai/models/embeddings/EmbeddingValue.kt: Added a new class EmbeddingValue to represent embedding vectors, which can be either a list of floats or a Base64-encoded string. This class includes methods for converting between these representations.

Deserialization improvements:
openai-java-core/src/main/kotlin/com/openai/models/embeddings/EmbeddingValueDeserializer.kt: Introduced a custom deserializer EmbeddingValueDeserializer to handle the deserialization of EmbeddingValue objects from JSON, supporting both float arrays and Base64 strings.

Default encoding format:
openai-java-core/src/main/kotlin/com/openai/models/embeddings/EmbeddingCreateParams.kt: Set the default EncodingFormat to BASE64 for performance improvements.

Test updates:
openai-java-core/src/test/kotlin/com/openai/models/embeddings/CreateEmbeddingResponseTest.kt and openai-java-core/src/test/kotlin/com/openai/models/embeddings/EmbeddingTest.kt: Updated test cases to accommodate the changes in the Embedding class and the introduction of EmbeddingValue.

With this change, the library can be called from Java code like the following.
Code written against this PR follows this style.
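
To illustrate the conversion that EmbeddingValue and its deserializer are described as performing, here is a minimal, self-contained Java sketch. It is not the PR's code and does not use the openai-java API: the class EmbeddingCodecSketch and its methods decode, encode, and parseField are hypothetical names chosen for this example. It also assumes that the Base64 payload is a packed sequence of little-endian 32-bit floats, and it parses the JSON array form by hand where a real deserializer would rely on a JSON library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical illustration only; this is NOT the EmbeddingValue class from the PR.
final class EmbeddingCodecSketch {

    // Decodes a Base64 string into floats.
    // Assumption: the payload is packed little-endian 32-bit floats.
    static List<Float> decode(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        if (bytes.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("Payload length is not a multiple of 4 bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        List<Float> values = new ArrayList<>(bytes.length / Float.BYTES);
        while (buffer.hasRemaining()) {
            values.add(buffer.getFloat());
        }
        return values;
    }

    // Encodes floats into a Base64 string using the same assumed layout.
    static String encode(List<Float> values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.size() * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float value : values) {
            buffer.putFloat(value);
        }
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    // Accepts the raw JSON text of an "embedding" field and handles both shapes:
    // a quoted Base64 string or an array of numbers.
    static List<Float> parseField(String rawJson) {
        String text = rawJson.trim();
        if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            return decode(text.substring(1, text.length() - 1));
        }
        if (text.startsWith("[") && text.endsWith("]")) {
            List<Float> values = new ArrayList<>();
            String body = text.substring(1, text.length() - 1).trim();
            if (!body.isEmpty()) {
                for (String part : body.split(",")) {
                    values.add(Float.parseFloat(part.trim()));
                }
            }
            return values;
        }
        throw new IllegalArgumentException("Expected a Base64 string or a number array");
    }

    public static void main(String[] args) {
        List<Float> original = List.of(0.5f, -1.25f, 3.0f);
        String encoded = encode(original);
        System.out.println("Base64 form: " + encoded);
        System.out.println("Decoded:     " + decode(encoded));
        System.out.println("From array:  " + parseField("[0.5, -1.25, 3.0]"));
    }
}
```

The Base64 form carries four bytes per value, whereas the JSON array form spells every number out as decimal text, which is the likely source of the performance gain that motivates making BASE64 the default encoding format.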