
Conversation

@afoucret
Contributor

This PR introduces a new TEXT_EMBEDDING function to ES|QL, enabling users to generate dense vector embeddings for text directly within their queries.

Examples

Example 1: Generating embeddings

FROM my_index
| EVAL embedding = TEXT_EMBEDDING("my text to embed", "my_embedding_model")

Example 2: Semantic search with KNN

FROM my_index
| WHERE KNN(my_vector_field, TEXT_EMBEDDING("my search query", "my_embedding_model"))

This work is part of #131079

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label (Meta label for the Search Relevance team in Elasticsearch) Sep 19, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@afoucret afoucret requested review from a team and ioanatia September 19, 2025 07:40
@github-actions
Contributor

github-actions bot commented Sep 19, 2025

🔍 Preview links for changed docs

@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.


When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level


Member

@carlosdelest carlosdelest left a comment

LGTM!

Some minor nits about a common interface and some missing capability checks.

@Param(
    name = InferenceFunction.INFERENCE_ID_PARAMETER_NAME,
    type = { "keyword" },
    description = "Identifier of the inference endpoint"
Contributor

We need to add a function example and then regenerate the docs; otherwise Kibana CI will fail when we try to bring the JSON specification of this function into Kibana, even if it's under snapshot.
This happened recently for decay too (#134705) - I opened a separate PR (#135094) to address it since I promised to look into it.

Contributor Author

@afoucret afoucret Sep 19, 2025

I think this should be fixed in Kibana CI so it does not break in such a case.

I mean, it is expected for a function that has appliesTo set to FunctionAppliesToLifecycle.DEVELOPMENT to be incomplete. Examples in particular come at the very end, when writing the CSV tests.

Anyway, I added a placeholder example so it will not break anything. It will be replaced when adding more realistic CSV tests.

plan.forEachExpressionUp(UnresolvedFunction.class, f -> {
    String functionName = snapshotRegistry.resolveAlias(f.name());
    if (snapshotRegistry.functionExists(functionName)) {
        FunctionDefinition def = snapshotRegistry.resolveFunction(functionName);
Contributor

Usually this is done at the analyzer level - it's a bit odd to do the function resolution here, but I don't see a simpler option for now 🤷‍♀️

Contributor Author

In fact, there are other cases where this kind of analysis is done outside of the analyzer (enrich policy and index resolution).

Because the analyzer is purely synchronous and inference resolution has to be done asynchronously, it is necessary to do it this way.
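
To make that concrete, the overall shape is roughly: walk the parsed plan to collect the inference endpoint ids, resolve them with an asynchronous call, and only then run the synchronous analyzer. A rough sketch of that flow (collectInferenceIds and inferenceResolver are hypothetical stand-ins, not the actual ES|QL classes):

// Hypothetical sketch: gather endpoint ids from the parsed plan, resolve them
// asynchronously, and run the synchronous analyzer once the resolution is available.
void preAnalyze(LogicalPlan parsed, ActionListener<LogicalPlan> listener) {
    Set<String> inferenceIds = collectInferenceIds(parsed); // walks UnresolvedFunction nodes
    inferenceResolver.resolve(inferenceIds, listener.map(resolution -> analyze(parsed, resolution)));
}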

return BytesRefs.toString(e.fold(FoldContext.small()));
}

public String inferenceId(UnresolvedFunction f, FunctionDefinition def) {
Contributor

if this method is actually needed, should this be private? maybe also add a javadoc to explain what it does?

Contributor Author

Added some comments and improved the code readability.

}

@Override
public void writeTo(StreamOutput out) throws IOException {
Contributor

I don't think we need to serialize this function, since we will always resolve it on the coordinator and replace it with its result.
We have other instances in ES|QL where we don't serialize - FORK, ROW, RENAME come to mind.
But this would probably be the first function that does not require serialization.

Contributor Author

I would prefer to keep the serialization even if the expression is not supposed to be moved between nodes.

The first reason is that functions are supposed to implement it. If for some reason the execution infrastructure needs to move them between nodes in the future, I would not like to be the black sheep that makes things more difficult.

Also, TextEmbeddingTests extends AbstractFunctionTestCase, which expects the function to be serializable.

Serialization/deserialization is simple to implement, and it is probably not beneficial to leave it unimplemented.

Contributor

> The first reason is that functions are supposed to implement it.

That does not mean it should. If the text_embedding function is actually serialized and sent between nodes, that's an execution path that should never happen.

Anything that extends LogicalPlan is also supposed to implement serialization, but as I said before there are many examples of plans that we don't serialize.
We could have made the same argument for logical plans - that the serialization is simple to implement, or that it's not beneficial to leave it unimplemented.
However, raising an exception when we attempt to serialize these plans is intentional, and it captures very clearly that this should never happen.

I get that making this unserializable is more work, especially for AbstractFunctionTestCase, but at least then the behaviour would be correct.

Contributor Author

It feels to me that we are removing some code that we will ultimately reintroduce when we support non-constant input for text embeddings. Anyway, I pushed a version without serialization so we can move forward and merge this PR.
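
For reference, the non-serializable pattern discussed above typically boils down to something like the sketch below (the exception messages are illustrative; getWriteableName and writeTo come from the NamedWriteable contract):

@Override
public String getWriteableName() {
    throw new UnsupportedOperationException("TEXT_EMBEDDING is resolved on the coordinator and is never serialized");
}

@Override
public void writeTo(StreamOutput out) throws IOException {
    // The function is replaced with its result before the plan leaves the
    // coordinator, so this code path should never be reached.
    throw new UnsupportedOperationException("TEXT_EMBEDDING is resolved on the coordinator and is never serialized");
}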

if (snapshotRegistry.functionExists(functionName)) {
    FunctionDefinition def = snapshotRegistry.resolveFunction(functionName);
    if (InferenceFunction.class.isAssignableFrom(def.clazz())) {
        String inferenceId = inferenceId(f, def);
Contributor

InferenceFunction has an inferenceId method already - I wonder if we can just use that, instead of trying to see which function param is called INFERENCE_ID_PARAMETER_NAME.

Contributor Author

My problem is that f is not an InferenceFunction but an UnresolvedFunction.
The only way to access the inference id is to read the argument that has the right name.
By convention, I assumed it will always be called inference_id for all the InferenceFunction implementations we add in the future.
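
Concretely, the convention-based lookup amounts to something like this sketch (parameterIndexOf is a hypothetical helper standing in for however the parameter position is actually resolved from the definition):

private String inferenceId(UnresolvedFunction f, FunctionDefinition def) {
    // By convention every InferenceFunction declares a parameter named "inference_id";
    // find its position in the definition and fold the matching constant argument to a string.
    int position = parameterIndexOf(def, InferenceFunction.INFERENCE_ID_PARAMETER_NAME); // hypothetical helper
    Expression e = f.children().get(position);
    return BytesRefs.toString(e.fold(FoldContext.small()));
}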

Contributor

To me, the fact that we are looking for the argument that is called inference_id leaks an internal implementation detail of the function.
We could also try and resolve the function - the same way the analyzer does:

FunctionDefinition def = functionRegistry.resolveFunction(functionName);
f = uf.buildResolved(configuration, def);

this would still not be ideal though - I am okay to leave this as it is - but we should reopen the discussion of making the analyzer support async rules.

 public LogicalPlan apply(LogicalPlan plan, AnalyzerContext context) {
-    return plan.transformDown(InferencePlan.class, p -> resolveInferencePlan(p, context));
+    return plan.transformDown(InferencePlan.class, p -> resolveInferencePlan(p, context))
+        .transformExpressionsOnly(InferenceFunction.class, f -> resolveInferenceFunction(f, context));
Member

Non-blocking question for my ES|QL education: can you help me understand why this change was needed?

Contributor Author

plan.transformDown(InferencePlan.class, p -> resolveInferencePlan(p, context))

replaces every node in the plan that is an InferencePlan with the result of the resolveInferencePlan(p, context) call. Inference plans are typically commands that use inference: RERANK and COMPLETION.

.transformExpressionsOnly(InferenceFunction.class, f -> resolveInferenceFunction(f, context));

does the same, but for InferenceFunction expressions instead of plans. Because text embedding is our first inference function, this was not done yet, so I added it.
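
Putting the two together, the updated rule reads roughly as follows (same code as the diff above, with comments added; the enclosing class is omitted):

public LogicalPlan apply(LogicalPlan plan, AnalyzerContext context) {
    return plan
        // Resolve inference-backed commands (RERANK, COMPLETION, ...).
        .transformDown(InferencePlan.class, p -> resolveInferencePlan(p, context))
        // Resolve inference-backed functions; TEXT_EMBEDDING is the first one.
        .transformExpressionsOnly(InferenceFunction.class, f -> resolveInferenceFunction(f, context));
}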

Contributor

@ioanatia ioanatia left a comment

I think this looks really close - I would like us to look a bit more into the serialization aspect before merging


@afoucret afoucret enabled auto-merge (squash) September 24, 2025 09:36
@afoucret afoucret merged commit 59c3601 into elastic:main Sep 24, 2025
33 of 34 checks passed

Labels

>non-issue
:Search Relevance/ES|QL (Search functionality in ES|QL)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v9.2.0
