feat(UI): Adding Conversation History #3203
Conversation
Force-pushed 92f2944 to df0710a
    name: string;
    messages: Message[];
    selectedModel: string;
    selectedVectorDb: string;
forthcoming
Force-pushed 502615d to 83aa78f
Signed-off-by: Francisco Javier Arceo <[email protected]>
Force-pushed 842f272 to 77ed07f
Our agents API offers sessions, right? To me this was the reason why we had them. Could you explain why that doesn't or won't work?

I was uncertain about the state of the agents API given some discussions during office hours, but if we intend to maintain the agents API I'll move it there.
Signed-off-by: Francisco Javier Arceo <[email protected]>

Rebased to pick up recent main commits:

**chore: Enable keyword search for Milvus inline (llamastack#3073).** With milvus-io/milvus-lite#294, Milvus Lite supports keyword search using BM25. When keyword search was introduced, it was explicitly disabled for inline Milvus; this change removes that check and enables `inline::milvus` for tests. Run llama stack with `inline::milvus` enabled:

```
pytest tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes --stack-config=http://localhost:8321 --embedding-model=all-MiniLM-L6-v2 -v
```

```
INFO 2025-08-07 17:06:20,932 tests.integration.conftest:64 tests: Setting DISABLE_CODE_SANDBOX=1 for macOS
=========================== test session starts ===========================
platform darwin -- Python 3.12.11, pytest-7.4.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.12.11', 'Platform': 'macOS-14.7.6-arm64-arm-64bit', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.23.8', 'cov': '6.0.0', 'timeout': '2.2.0', 'socket': '0.7.0', 'html': '3.1.1', 'langsmith': '0.3.39', 'anyio': '4.8.0', 'metadata': '3.0.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.23.8, cov-6.0.0, timeout-2.2.0, socket-0.7.0, html-3.1.1, langsmith-0.3.39, anyio-4.8.0, metadata-3.0.0
asyncio: mode=Mode.AUTO
collected 3 items

tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-vector] PASSED [ 33%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-keyword] PASSED [ 66%]
tests/integration/vector_io/test_openai_vector_stores.py::test_openai_vector_store_search_modes[None-None-all-MiniLM-L6-v2-None-384-hybrid] PASSED [100%]
=========================== 3 passed in 4.75s =============================
```

Signed-off-by: Varsha Prasad Narsing <[email protected]>
Co-authored-by: Francisco Arceo <[email protected]>

**chore: Fixup main pre commit (llamastack#3204)**

**build: Bump version to 0.2.18**

**chore: Faster npm pre-commit (llamastack#3206).** Adds npm installation to pre-commit.yml and caches the ui; removes node installation during pre-commit.
Signed-off-by: Francisco Javier Arceo <[email protected]>

checking in for tonight, wip moving to agents api
Signed-off-by: Francisco Javier Arceo <[email protected]>

remove log
Signed-off-by: Francisco Javier Arceo <[email protected]>

updated
Signed-off-by: Francisco Javier Arceo <[email protected]>

**fix: disable ui-prettier & ui-eslint (llamastack#3207)**

**chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (llamastack#3061).** This adds a pre-commit step to enforce use of the `llama_stack` logger. Various parts of the codebase currently use different loggers; since a custom `llama_stack` logger already exists and is used in the codebase, it is better to standardize on it.
Signed-off-by: Mustafa Elbehery <[email protected]>
Co-authored-by: Matthew Farrellee <[email protected]>

**fix: fix `openai_embeddings` for asymmetric embedding NIMs (llamastack#3205).** NVIDIA asymmetric embedding models (e.g., `nvidia/llama-3.2-nv-embedqa-1b-v2`) require an `input_type` parameter that is not present in the standard OpenAI embeddings API. This adds `input_type="query"` as the default and updates the documentation to suggest the `embedding` API for passage embeddings. Resolves llamastack#2892.

```
pytest -s -v tests/integration/inference/test_openai_embeddings.py --stack-config="inference=nvidia" --embedding-model="nvidia/llama-3.2-nv-embedqa-1b-v2" --env NVIDIA_API_KEY={nvidia_api_key} --env NVIDIA_BASE_URL="https://integrate.api.nvidia.com"
```

cleaning up
Signed-off-by: Francisco Javier Arceo <[email protected]>

updating session manager to cache messages locally
Signed-off-by: Francisco Javier Arceo <[email protected]>

fix linter
Signed-off-by: Francisco Javier Arceo <[email protected]>

more cleanup
Signed-off-by: Francisco Javier Arceo <[email protected]>
Force-pushed 44b044c to 6620b62
    });
    };

    const response = await client.agents.turn.create(
now using agents api
    }

    console.log("📡 Fetching agent config from API...");
    const agentDetails = await client.agents.retrieve(agentId);
getting agents
    const createDefaultSession = useCallback(
      async (agentId: string) => {
        try {
          const response = await client.agents.session.create(agentId, {
creating agent session
    )
    ) {
      try {
        await client.agents.delete(agentId);
deletes from DB
    await client.agents.delete(agentId);

    // clear cached data for agent
    SessionUtils.clearAgentCache(agentId);
deletes from cache.
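`SessionUtils` itself isn't shown in this excerpt; an in-memory sketch of the cache-then-clear pattern described above is given below. The method names are inferred from the call sites in the diff, and the real implementation reportedly backs this with localStorage rather than a `Map`:

```typescript
// In-memory sketch of the SessionUtils cache referenced in the diff.
// agentId -> cached session ids for that agent.
const agentCache = new Map<string, string[]>();

const SessionUtils = {
  cacheSession(agentId: string, sessionId: string): void {
    const sessions = agentCache.get(agentId) ?? [];
    sessions.push(sessionId);
    agentCache.set(agentId, sessions);
  },
  getCachedSessions(agentId: string): string[] {
    return agentCache.get(agentId) ?? [];
  },
  // Mirrors the cleanup after client.agents.delete(agentId): drop everything
  // cached for that agent so the UI doesn't list stale sessions.
  clearAgentCache(agentId: string): void {
    agentCache.delete(agentId);
  },
};

SessionUtils.cacheSession("agent-1", "s-1");
SessionUtils.cacheSession("agent-1", "s-2");
SessionUtils.clearAgentCache("agent-1"); // deleting the agent empties its cache
```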
    const fetchToolgroups = async () => {
      try {
        console.log("Fetching toolgroups...");
        const toolgroups = await client.toolgroups.list();
I'm calling toolgroups as I believe this maps to the preferred field in the AgentConfig.
I think we are keeping it for now for exactly this use case. With Responses, you don't need "turns" etc., and each session can be arranged more flexibly, but independent sessions still need to be kept somewhere, so Responses doesn't take care of that from the client's perspective.
    @@ -0,0 +1,345 @@
    import React from "react";
good tests!
    // ensures this only happens client side
    const safeLocalStorage = {
      getItem: (key: string): string | null => {
        if (typeof window === "undefined") return null;
why doesn't this (and other methods) throw when window is undefined?
Next.js does server-side rendering, so there is no window when the page is rendered on the server; returning null ensures nothing breaks during that render, and once the client takes over, the content is hydrated.
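The guard being discussed can be completed roughly as below. This is a sketch, not the exact code from the PR; note that plain Node behaves like the Next.js server here (`window` is undefined), so every method degrades to a safe no-op instead of throwing a ReferenceError:

```typescript
// SSR-safe localStorage wrapper: during server rendering there is no
// `window`, so each method falls back to a no-op/null rather than throwing.
const safeLocalStorage = {
  getItem: (key: string): string | null => {
    if (typeof window === "undefined") return null;
    return window.localStorage.getItem(key);
  },
  setItem: (key: string, value: string): void => {
    if (typeof window === "undefined") return;
    window.localStorage.setItem(key, value);
  },
  removeItem: (key: string): void => {
    if (typeof window === "undefined") return;
    window.localStorage.removeItem(key);
  },
};

// On the server this is null; after hydration it reads real browser storage.
const cached = safeLocalStorage.getItem("chat-sessions");
```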
      return globalThis.crypto.randomUUID();
    };

    export function SessionManager({
this is a React component -- the suffix Manager feels wrong for it. It must have some UI state? How will this be used in the UI?
The state is retrieved from client.agents.list() and cached locally. I originally called it SessionManager because my first implementation stored sessions locally, and this was the component managing that local storage. I still use much of that pattern for faster refreshes. We can definitely rename it.
I renamed this to Conversations. It's not using the API for now, but I'm happy to swap it to those APIs in the future, and I think that should mostly be straightforward.
@franciscojavierarceo see this https://x.com/OpenAIDevs/status/1958660207745409120 -- specifically the Conversations API. { Conversations + Responses } is an alternative to the Agents API and is likely where the ecosystem will shift. So while we will now likely get rid of the Agents API, I think we are still not going to make session state persist on the client. Clients need to be lightweight.
Signed-off-by: Francisco Javier Arceo <[email protected]>
# What does this PR do?
- Introduces Agent Session creation for the Playground and allows users to set tools (note: tools are not actually usable yet, and this is marked explicitly)
- Caches sessions locally for faster loading in the UI and deletes them appropriately
- Allows users to easily create new sessions
- Moves the Model Configuration settings and the "System Message" / prompt to the left component
- Adds a new logo and favicon
- Adds a new typing animation while the LLM is generating

### Create New Session
<img width="1916" height="1393" alt="Screenshot 2025-08-21 at 4 18 08 PM" src="https://github.com/user-attachments/assets/52c70ae3-a33e-4338-8522-8184c692c320" />

### List of Sessions
<img width="1920" height="1391" alt="Screenshot 2025-08-21 at 4 18 56 PM" src="https://github.com/user-attachments/assets/ed78c3c6-08ec-486c-8bad-9b7382c11360" />

## Test Plan
Unit tests added

---------
Signed-off-by: Francisco Javier Arceo <[email protected]>