
Conversation

@varshaprasad96
Contributor

What does this PR do?

With milvus-io/milvus-lite#294, Milvus Lite supports keyword search using BM25. When keyword search was first introduced, it was explicitly disabled for inline Milvus. This PR removes that check and enables `inline::milvus` for the tests.
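For context, keyword search in Milvus is built on a BM25 function that derives sparse vectors from raw text at write time, and milvus-io/milvus-lite#294 brings that path to the embedded engine. The sketch below shows roughly what this looks like against Milvus Lite with pymilvus >= 2.5; the collection, field, and file names are illustrative and not taken from this PR or the provider code:

```
# Rough sketch of BM25 keyword search against Milvus Lite (pymilvus >= 2.5).
# Names here are made up for illustration; inline::milvus wires this up internally.
from pymilvus import DataType, Function, FunctionType, MilvusClient

client = MilvusClient("./demo.db")  # a local file path selects Milvus Lite

schema = client.create_schema()
schema.add_field("id", DataType.INT64, is_primary=True, auto_id=True)
schema.add_field("text", DataType.VARCHAR, max_length=65535, enable_analyzer=True)
schema.add_field("sparse", DataType.SPARSE_FLOAT_VECTOR)
# BM25 function: Milvus builds the sparse vectors from the text field at insert time.
schema.add_function(Function(
    name="text_bm25",
    function_type=FunctionType.BM25,
    input_field_names=["text"],
    output_field_names=["sparse"],
))

index_params = client.prepare_index_params()
index_params.add_index(field_name="sparse", index_type="SPARSE_INVERTED_INDEX", metric_type="BM25")
client.create_collection("docs", schema=schema, index_params=index_params)

client.insert("docs", [{"text": "Milvus Lite now supports BM25 keyword search"}])
# Keyword search: pass the raw query string against the sparse field.
hits = client.search("docs", data=["keyword search"], anns_field="sparse", limit=3, output_fields=["text"])
```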

Test Plan

Run llama stack with `inline::milvus` enabled:

```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```

```
INFO     2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
=========================================================================================== test session starts ============================================================================================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED                                                   [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED                                                  [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED                                                   [100%]

============================================================================================ 3 passed in 4.75s =============================================================================================
```
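The three parametrizations above correspond to the `search_mode` values the OpenAI-compatible vector-store search accepts. A minimal client-side sketch is below; the store id is a placeholder and the exact argument names are assumptions based on the test parametrization, not copied from the test:

```
# Hedged sketch of exercising the search modes via the OpenAI-compatible
# vector stores API; "vs_..." is a placeholder for an existing store with
# files already ingested, and argument names are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

store_id = "vs_..."  # placeholder id

for mode in ("vector", "keyword", "hybrid"):
    resp = client.vector_stores.search(
        vector_store_id=store_id,
        query="how do I enable keyword search?",
        search_mode=mode,  # "keyword" exercises the BM25 path in inline::milvus
        max_num_results=3,
    )
    print(mode, [(r.file_id, r.score) for r in resp.data])
```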

@meta-cla bot added the CLA Signed label (This label is managed by the Meta Open Source bot.) Aug 8, 2025
@varshaprasad96
Contributor Author

@franciscojavierarceo a quick one. With this we will have feature parity between inline and remote milvus for search modes.

@franciscojavierarceo
Collaborator

Nice! I approved the workflow; let's wait for the tests to pass.

@franciscojavierarceo
Collaborator

Shouldn't you update pyproject.toml for the latest release? Or do we already have the latest version that supports BM25?

@franciscojavierarceo
Collaborator

Looks like we must have the latest version. Thanks for the quick turnaround on this!!!

@leseb previously requested changes Aug 8, 2025
Collaborator

We should really have a constraint on the minimal required version for milvus-lite. +1 on #3073 (comment)

@varshaprasad96
Contributor Author

@leseb @franciscojavierarceo The milvus-lite version was updated because we lock pymilvus>=2.5 (https://github.com/meta-llama/llama-stack/blob/1677d6bffdf0a002abe3b827c460df4930ee83c8/uv.lock#L1746) and milvus-lite is a transitive dependency. I've also pinned the milvus-lite version to be sure we import the one that has this change implemented.
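If we ever want the provider to fail fast instead of relying on the pin, a runtime guard along these lines could do it. This is a hypothetical sketch, not code from this PR, and the version floor is an assumption:

```
# Hypothetical guard (not from this PR): refuse keyword search if the installed
# milvus-lite predates BM25 support. The floor version is an assumption.
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version

MIN_MILVUS_LITE = Version("2.5.0")  # assumed minimum; the real floor may differ


def milvus_lite_supports_bm25() -> bool:
    try:
        return Version(version("milvus-lite")) >= MIN_MILVUS_LITE
    except PackageNotFoundError:
        return False
```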

@ChristianZaccaria
Contributor

/lgtm nice!!

@varshaprasad96
Contributor Author

@leseb could we get this merged, please? We need it for the upcoming release.

@bbrowning
Collaborator

The changes look reasonable, self-contained, tested, and feedback looks to be addressed. Thanks!

@franciscojavierarceo dismissed leseb’s stale review August 14, 2025 13:45

Dismissing since Seb's request was addressed. 👍

@franciscojavierarceo added the re-record-tests label (Spin up ollama, inference and record responses for later use) Aug 14, 2025
@franciscojavierarceo merged commit 8cc4925 into llamastack:main Aug 19, 2025
44 checks passed
franciscojavierarceo added a commit to franciscojavierarceo/llama-stack that referenced this pull request Aug 21, 2025
chore: Enable keyword search for Milvus inline (llamastack#3073)

chore: Fixup main pre commit (llamastack#3204)
build: Bump version to 0.2.18
chore: Faster npm pre-commit (llamastack#3206)
fix: disable ui-prettier & ui-eslint (llamastack#3207)
chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (llamastack#3061)
fix: fix `openai_embeddings` for asymmetric embedding NIMs (llamastack#3205)

Labels

CLA Signed (This label is managed by the Meta Open Source bot.) · re-record-tests (Spin up ollama, inference and record responses for later use)


5 participants