
Conversation

@varshaprasad96
Contributor

What does this PR do?

With milvus-io/milvus-lite#294, Milvus Lite supports keyword search using BM25. When keyword search was first introduced, it was explicitly disabled for inline Milvus. This PR removes that check and enables `inline::milvus` for the tests.
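For context, keyword search in Milvus is built on a BM25 function that derives sparse vectors from raw text at write time, and milvus-io/milvus-lite#294 brings that path to the embedded engine. The sketch below shows roughly what this looks like against Milvus Lite with pymilvus >= 2.5; the collection, field, and file names are illustrative and not taken from this PR or the provider code:

```
# Rough sketch of BM25 keyword search against Milvus Lite (pymilvus >= 2.5).
# Names here are made up for illustration; inline::milvus wires this up internally.
from pymilvus import DataType, Function, FunctionType, MilvusClient

client = MilvusClient("./demo.db")  # a local file path selects Milvus Lite

schema = client.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=True)
schema.add_field("text", DataType.VARCHAR, max_length=65535, enable_analyzer=True)
schema.add_field("sparse", DataType.SPARSE_FLOAT_VECTOR)
# BM25 function: Milvus builds the sparse vectors from the text field at insert time.
schema.add_function(Function(
    name="text_bm25",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
))

index_params = client.prepare_index_params()
index_params.add_index(field_name="sparse", index_type="SPARSE_INVERTED_INDEX", metric_type="BM25")
client.create_collection("docs", schema=schema, index_params=index_params)

client.insert("docs", [{"text": "Milvus Lite now supports BM25 keyword search"}])
# Keyword search: pass the raw query string against the sparse field.
hits = client.search("docs", data=["keyword search"], anns_field="sparse", limit=3, output_fields=["text"])
```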

Test Plan

Run llama stack with `inline::milvus` enabled:

```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```

```
INFO     2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED                                                   [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED                                                  [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED                                                   [100%]

============================================================================================ 3 passed in 4.75s =============================================================================================
```
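The three parametrizations above correspond to the `search_mode` values the OpenAI-compatible vector-store search accepts. A minimal client-side sketch is below; the store id is a placeholder and the exact argument names are assumptions based on the test parametrization, not copied from the test:

```
# Hedged sketch of exercising the search modes via the OpenAI-compatible
# vector stores API; "vs_..." is a placeholder for an existing store with
# files already ingested, and argument names are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

store_id = "vs_..."  # placeholder id

for mode in ("vector", "keyword", "hybrid"):
    resp = client.vector_stores.search(
        vector_store_id=store_id,
        query="how do I enable keyword search?",
        search_mode=mode,  # "keyword" exercises the BM25 path in inline::milvus
        max_num_results=3,
    )
    print(mode, [(r.file_id, r.score) for r in resp.data])
```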

@meta-cla bot added the CLA Signed label (This label is managed by the Meta Open Source bot.) Aug 8, 2025
@varshaprasad96
Contributor Author

@franciscojavierarceo a quick one. With this we will have feature parity between inline and remote milvus for search modes.

@franciscojavierarceo
Collaborator

Nice! I approved the workflow; let's wait for the tests to pass.

@franciscojavierarceo
Collaborator

Shouldn't you update pyproject.toml for the latest release? Or do we already have the latest version that supports BM25?

@franciscojavierarceo
Collaborator

Looks like we must have the latest version. Thanks for the quick turnaround on this!!!

@leseb previously requested changes Aug 8, 2025
Collaborator

We should really have a constraint on the minimal required version for milvus-lite. +1 on #3073 (comment)

@varshaprasad96
Contributor Author

@leseb @franciscojavierarceo The milvus-lite version was updated because we lock pymilvus>=2.5 (https://github.com/meta-llama/llama-stack/blob/1677d6bffdf0a002abe3b827c460df4930ee83c8/uv.lock#L1746) and milvus-lite is a transitive dependency. I've also pinned the milvus-lite version to be sure we import the one that has this change implemented.
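If we ever want the provider to fail fast instead of relying on the pin, a runtime guard along these lines could do it. This is a hypothetical sketch, not code from this PR, and the version floor is an assumption:

```
# Hypothetical guard (not from this PR): refuse keyword search if the installed
# milvus-lite predates BM25 support. The floor version is an assumption.
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version

MIN_MILVUS_LITE = Version("2.5.0")  # assumed minimum; the real floor may differ


def milvus_lite_supports_bm25() -> bool:
    try:
        return Version(version("milvus-lite")) >= MIN_MILVUS_LITE
    except PackageNotFoundError:
        return False
```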

@ChristianZaccaria
Contributor

/lgtm nice!!

@varshaprasad96
Contributor Author

@leseb could we get this merged, please? We need it for the upcoming release.

@bbrowning
Collaborator

The changes look reasonable, self-contained, tested, and feedback looks to be addressed. Thanks!

@franciscojavierarceo dismissed leseb’s stale review August 14, 2025 13:45

Dismissing since Seb's request was addressed. 👍

@franciscojavierarceo added the re-record-tests label (Spin up ollama, inference and record responses for later use) Aug 14, 2025
@franciscojavierarceo merged commit 8cc4925 into llamastack:main Aug 19, 2025
44 checks passed
franciscojavierarceo added a commit to franciscojavierarceo/llama-stack that referenced this pull request Aug 21, 2025
chore: Enable keyword search for Milvus inline (llamastack#3073)

chore: Fixup main pre commit (llamastack#3204)
build: Bump version to 0.2.18
chore: Faster npm pre-commit (llamastack#3206)
fix: disable ui-prettier & ui-eslint (llamastack#3207)
chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (llamastack#3061)
fix: fix `openai_embeddings` for asymmetric embedding NIMs (llamastack#3205)

Labels

CLA Signed (This label is managed by the Meta Open Source bot.) · re-record-tests (Spin up ollama, inference and record responses for later use)


5 participants