
Conversation

Contributor

@Anxhela21 Anxhela21 commented Sep 16, 2025

Description

Add RAG chunks in the query response

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Query responses now include retrieved RAG chunks, referenced documents, and tool-call details for richer, more traceable results.
  • Configuration

    • Added sentence-transformers embedding provider, a RAG runtime provider, a configurable vector DB entry (my_knowledge_base) and embedding model; bundled builtin RAG toolgroup and default RAG tool.
  • Chores

    • Fixed dependency file formatting and added a static-analysis ignore entry.
  • Tests

    • Updated tests to validate RAG-chunk model and propagation in streaming and response tests.

@coderabbitai
Contributor

coderabbitai bot commented Sep 16, 2025

Walkthrough

Adds RAG-aware models and plumbing: new RAGChunk/ToolCall/ReferencedDocument types, extraction of RAG chunks from tool outputs into TurnSummary, persisted serialized rag_chunks in transcripts, and surfacing rag_chunks/tool_calls/referenced_documents in QueryResponse; also adds embedding/vector runtime config entries.

Changes

Cohort / File(s) Summary
Config: runtime
run.yaml
Adds inference provider sentence-transformers and tool_runtime rag-runtime; adds vector_dbs entry my_knowledge_base (embedding model sentence-transformers/all-mpnet-base-v2, dim 768, provider faiss); adds model metadata and tool_group builtin::rag.
Dependency file
pyproject.toml
Minor TOML syntax fix (removed trailing comma in dependencies).
API Endpoint: Query shaping
src/app/endpoints/query.py
Serializes TurnSummary.rag_chunks to dicts for persistence; maps summary.tool_calls → list of ToolCall; builds referenced_documents from rag chunk sources; logs new steps; persists transcript with serialized rag_chunks; populates QueryResponse with rag_chunks, tool_calls, and referenced_documents.
Response models
src/models/responses.py
Adds RAGChunk, ToolCall, and ReferencedDocument (doc_url now optional); extends QueryResponse with rag_chunks, tool_calls, and referenced_documents; updates examples/schema.
Types / TurnSummary
src/utils/types.py
Adds rag_chunks: list[RAGChunk] = [] to TurnSummary; consolidates tool response handling into response_content; introduces _extract_rag_chunks_from_response to parse JSON/list-of-chunks with fallbacks.
Transcripts storage
src/utils/transcripts.py
Changes store_transcript(..., rag_chunks: list[str], ...) → rag_chunks: list[dict] (serialized RAGChunk dicts) and updates the docstring.
Constants
src/constants.py
Adds DEFAULT_RAG_TOOL = "knowledge_search".
Tests: responses & RAG
tests/unit/models/responses/test_query_response.py, tests/unit/models/responses/test_rag_chunk.py
Adds tests for RAGChunk, ToolCall, ReferencedDocument and expands QueryResponse tests to cover rag_chunks/tool_calls/referenced_documents; new test module for RAGChunk validations.
Tests: endpoint
tests/unit/app/endpoints/test_streaming_query.py
Updates test data to include a RAGChunk in TurnSummary and assert rag_chunks propagate into stored transcript; imports adjusted.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant API as /query Endpoint
  participant ORCH as Orchestrator/LLM
  participant TOOL as RAG Tool (knowledge_search)
  participant EMB as Embedder (sentence-transformers)
  participant VDB as Vector DB (faiss)

  C->>API: POST /query
  API->>ORCH: run turn (prompt, tools)
  alt RAG tool invoked
    ORCH->>TOOL: knowledge_search(query)
    TOOL->>EMB: embed(query)
    EMB-->>TOOL: embedding
    TOOL->>VDB: search(embedding)
    VDB-->>TOOL: chunks + scores
    TOOL-->>ORCH: tool response (includes chunks)
  end
  ORCH-->>API: TurnSummary (messages, tool_calls, rag_chunks)
  API->>API: normalize rag_chunks → list[dict]
  API->>API: build tool_calls and referenced_documents
  API->>API: store_transcript(..., rag_chunks=dicts, ...)
  API-->>C: QueryResponse { rag_chunks?, tool_calls?, referenced_documents? }

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • tisnik
  • manstis

Poem

I nibble on chunks of knowledge bright,
I map each call and tuck facts tight,
Vectors hum and embeddings take flight,
I hop through logs and store the light,
Then share the crumbs — a rabbit's delight. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "LCORE-601: Add RAG chunks in query response" directly and concisely summarizes the primary change in the PR—adding RAG chunks to the query response—and references the related ticket for context. It is specific, free of noise, and clearly communicates the main functional change reviewers should expect. The phrasing is short and appropriate for scannable commit/PR history.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (8)
run.yaml (1)

145-145: Add trailing newline to satisfy linters.

YAMLlint flags missing newline at EOF.

src/utils/types.py (2)

91-94: Guard on tool name with constant/enum (optional).

Literal "knowledge_search" string can drift from runtime config. Consider a module-level constant.


95-142: RAG chunk extraction: broaden JSON tolerance and cap payload (optional).

Some providers return keys like text/content/snippet and score as strings. Consider normalizing and adding a soft cap to avoid oversized responses.

@@
-                if isinstance(data, dict) and "chunks" in data:
+                if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
-                        self.rag_chunks.append(
+                        text = chunk.get("content") or chunk.get("text") or ""
+                        score = chunk.get("score")
+                        try:
+                            score = float(score) if score is not None else None
+                        except (TypeError, ValueError):
+                            score = None
+                        self.rag_chunks.append(
                             RAGChunkData(
-                                content=chunk.get("content", ""),
+                                content=text,
                                 source=chunk.get("source"),
-                                score=chunk.get("score")
+                                score=score
                             )
                         )
@@
-                    for chunk in data:
+                    for chunk in data:
                         if isinstance(chunk, dict):
-                            self.rag_chunks.append(
+                            text = chunk.get("content") or chunk.get("text") or str(chunk)
+                            score = chunk.get("score")
+                            try:
+                                score = float(score) if score is not None else None
+                            except (TypeError, ValueError):
+                                score = None
+                            self.rag_chunks.append(
                                 RAGChunkData(
-                                    content=chunk.get("content", str(chunk)),
+                                    content=text,
                                     source=chunk.get("source"),
-                                    score=chunk.get("score")
+                                    score=score
                                 )
                             )
+            # Optional: bound number of chunks
+            if len(self.rag_chunks) > 50:
+                self.rag_chunks = self.rag_chunks[:50]
src/models/responses.py (1)

53-59: API naming vs. internal naming mismatch.

Internal ToolCallSummary uses “response” (str) while the API uses “result” (dict). It’s fine, but consider aligning the field name to “response” for consistency, or document the mapping explicitly.

src/app/endpoints/query.py (4)

254-263: Normalize tool call result to JSON when possible.

Try JSON-decoding tool responses to provide structured results; fall back to string.

 tool_calls = [
     ToolCall(
         tool_name=tc.name,
         arguments=tc.args if isinstance(tc.args, dict) else {"query": str(tc.args)},
-        result={"response": tc.response} if tc.response else None
+        result=(
+            (json.loads(tc.response) if isinstance(tc.response, str) else tc.response)
+            if tc.response
+            else None
+        )
     )
     for tc in summary.tool_calls
 ]

276-290: O(n^2) chunk_count; pre-aggregate for large results (optional).

For many chunks, the per-source sum is quadratic. Pre-count once.

-        referenced_docs = []
-        doc_sources = set()
-        for chunk in summary.rag_chunks:
-            if chunk.source and chunk.source not in doc_sources:
-                doc_sources.add(chunk.source)
-                referenced_docs.append(
-                    ReferencedDocument(
-                        url=chunk.source if chunk.source.startswith("http") else None,
-                        title=chunk.source,
-                        chunk_count=sum(1 for c in summary.rag_chunks if c.source == chunk.source)
-                    )
-                )
+        from collections import Counter
+        src_counts = Counter(c.source for c in summary.rag_chunks if c.source)
+        referenced_docs = [
+            ReferencedDocument(
+                url=src if src.startswith("http") else None,
+                title=src,
+                chunk_count=cnt,
+            )
+            for src, cnt in src_counts.items()
+        ]

229-244: Persist transcripts with actual rag_chunks.

You already compute summary.rag_chunks; pass them to store_transcript.

-                rag_chunks=[],  # TODO(lucasagomes): implement rag_chunks
+                rag_chunks=summary.rag_chunks,

246-256: Reduce log verbosity to debug.

These info logs are per-request and noisy. Downgrade to debug to keep prod logs clean.

-        logger.info("Persisting conversation details...")
+        logger.debug("Persisting conversation details...")
@@
-        logger.info("Processing tool calls...")
+        logger.debug("Processing tool calls...")
@@
-        logger.info("Processing RAG chunks...")
+        logger.debug("Processing RAG chunks...")
@@
-        logger.info("Extracting referenced documents...")
+        logger.debug("Extracting referenced documents...")
@@
-        logger.info("Building final response...")
+        logger.debug("Building final response...")
@@
-        logger.info("Query processing completed successfully!")
+        logger.debug("Query processing completed successfully!")

Also applies to: 267-275, 276-290, 291-301

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 146e8eb and ef94176.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • run.yaml (3 hunks)
  • src/app/endpoints/query.py (2 hunks)
  • src/models/responses.py (3 hunks)
  • src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (4)
  • QueryResponse (61-136)
  • RAGChunk (37-42)
  • ReferencedDocument (45-50)
  • ToolCall (53-58)
🪛 YAMLlint (1.37.1)
run.yaml

[error] 145-145: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: e2e_tests
🔇 Additional comments (6)
src/models/responses.py (2)

37-43: RAGChunk model — LGTM.


91-105: Optional fields on QueryResponse — good choice.

Omitting empty lists keeps payloads small and stable for clients.

Also applies to: 112-137

src/app/endpoints/query.py (1)

267-275: RAG chunks mapping — LGTM.

Straight-through mapping from TurnSummary is correct.

run.yaml (1)

63-66: Toolgroup wiring validated — no changes required.
run.yaml defines toolgroup_id: builtin::rag → provider_id: rag-runtime (run.yaml lines 143–145) and src/app/endpoints/query.py references "builtin::rag/knowledge_search" (src/app/endpoints/query.py lines 636–638). inline::sentence-transformers and inline::rag-runtime entries are fine.

pyproject.toml (2)

53-68: Move heavy ML deps out of core runtime into an optional 'rag' dependency-group — verify imports first

torch/trl/peft/datasets/faiss-cpu/autoevals/sentence-transformers will bloat prod images and cold-starts; gate them behind an extras/dependency-group and only install for RAG-enabled deployments.

Apply this minimal removal from default deps:

@@
-    "datasets>=3.6.0",
-    "faiss-cpu>=1.11.0",
-    "autoevals>=0.0.129",
-    "torch>=2.7.1",
-    "peft>=0.15.2",
-    "trl>=0.18.2",
-    "sentence-transformers>=5.1.0",

Then add a dedicated group (example):

[dependency-groups]
rag = [
    "sentence-transformers>=5.1.0",
    "faiss-cpu>=1.11.0",
    "datasets>=3.6.0",
    "torch>=2.7.1",
    "peft>=0.15.2",
    "trl>=0.18.2",
    "autoevals>=0.0.129",
]
  • Align litellm version constraints to a single lower bound across groups to avoid resolver surprises.
  • Verification: I could not search the repo (rg returned no files). Re-run the import check locally to confirm these packages are not required by core runtime, e.g.:
rg -nP -C2 -uu --glob '!**/tests/**' \
  -e '^\s*(from|import)\s+torch\b' \
  -e '^\s*(from|import)\s+trl\b' \
  -e '^\s*(from|import)\s+peft\b' \
  -e '^\s*(from|import)\s+datasets\b' \
  -e '^\s*(from|import)\s+sentence_transformers\b'

48-48: Confirm SQLAlchemy floor change and verify compatibility with 2.0.41

  • Evidence: pyproject.toml:48 => "sqlalchemy>=2.0.41"; uv.lock currently resolves to sqlalchemy 2.0.43 (uv.lock entries ~3303–3321, upload-time 2025-08-11). Repo search returned no 'alembic' references.
  • Action: run CI/local tests with sqlalchemy==2.0.41 and exercise migrations and DB adapters (psycopg2/aiosqlite). If widening the floor was unintentional or causes failures, tighten the minimum to >=2.0.43 or pin the required patch.

Signed-off-by: Anxhela Coba <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
src/utils/types.py (2)

72-72: Good: mutable default avoided for rag_chunks

Switch to Field(default_factory=list) resolves shared-state risk.
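The shared-state risk being resolved is the classic mutable-default pitfall; a minimal stdlib illustration (Pydantic's Field(default_factory=list) plays the analogous role for model fields):

```python
from dataclasses import dataclass, field


@dataclass
class TurnSummaryLike:  # illustrative stand-in, not the real TurnSummary
    # A bare mutable default (rag_chunks: list = []) raises ValueError at
    # class-definition time for dataclasses; default_factory gives every
    # instance its own fresh list instead of one shared object.
    rag_chunks: list = field(default_factory=list)


a, b = TurnSummaryLike(), TurnSummaryLike()
a.rag_chunks.append("chunk")
# b keeps its own, still-empty list.
```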


10-10: Fix Pydantic import to unblock CI

Field must be imported from pydantic, not pydantic.main (see failing E2E log).

-from pydantic.main import BaseModel, Field
+from pydantic import BaseModel, Field
run.yaml (1)

111-111: Parameterize vector DB path via env var

Avoid hard-coded path; support overrides with a sane default.

-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database
🧹 Nitpick comments (1)
src/utils/types.py (1)

95-142: Make RAG parsing more robust (dict-without-chunks, list of scalars)

Current logic drops data when JSON is a dict without "chunks" or a list of non-dicts. Add safe fallbacks.

         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score")
                             )
                         )
-                elif isinstance(data, list):
+                elif isinstance(data, dict):
+                    # Fallback: treat whole dict as a single chunk
+                    self.rag_chunks.append(
+                        RAGChunkData(
+                            content=json.dumps(data, ensure_ascii=False),
+                            source="knowledge_search",
+                            score=None,
+                        )
+                    )
+                elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
-                        if isinstance(chunk, dict):
+                        if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score")
                                 )
                             )
+                        else:
+                            # Fallback: stringify scalar/other types
+                            self.rag_chunks.append(
+                                RAGChunkData(
+                                    content=str(chunk),
+                                    source="knowledge_search",
+                                    score=None,
+                                )
+                            )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef94176 and 2d798ec.

📒 Files selected for processing (2)
  • run.yaml (3 hunks)
  • src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
  • ToolCall (53-58)
🪛 GitHub Actions: E2E Tests
src/utils/types.py

[error] 10-10: ImportError: cannot import name 'Field' from 'pydantic.main'. Did you mean: '_fields'?

🔇 Additional comments (5)
src/utils/types.py (1)

80-94: Nice consolidation of response content + targeted extraction

Computing response_content once and gating RAG extraction on knowledge_search looks solid.

run.yaml (4)

63-66: Embedding provider addition looks good

Inline sentence-transformers provider is correctly declared and referenced later in models.


105-107: RAG runtime provider wiring looks correct

Provider registered and later exposed via tool_group.


132-137: Model entry aligns with embedding config

Model/provider IDs and embedding dimension are consistent.


143-145: Tool group exposes RAG runtime correctly

builtin::rag bound to rag-runtime is consistent with providers.

Signed-off-by: Anxhela Coba <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
src/utils/types.py (1)

70-73: Default-initialize tool_calls to an empty list.

Prevents None/missing list issues and keeps parity with rag_chunks. This was raised previously and is still applicable.

Apply:

-    tool_calls: list[ToolCallSummary]
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
🧹 Nitpick comments (4)
tests/unit/app/endpoints/test_streaming_query.py (1)

346-356: Validate transcript API shape to avoid duplicated rag chunk plumbing.

You pass rag chunks inside summary.rag_chunks while also passing a separate rag_chunks=[] arg to store_transcript. If the latter is unused or redundant, consider removing it from the API or populating it consistently to avoid confusion.

If keeping both, add an assertion that the service ultimately persists summary.rag_chunks so we don’t silently drop them.

src/utils/types.py (3)

80-94: Guard against unexpected content shapes from tool responses.

interleaved_content_as_str(resp.content) can raise if resp.content is absent or malformed; a small try/except avoids derailing the whole summary build.

Apply:

-            response_content = interleaved_content_as_str(resp.content) if resp else None
+            try:
+                response_content = interleaved_content_as_str(resp.content) if resp else None
+            except Exception:
+                response_content = None

95-142: Add fallback when JSON parses but yields an unknown shape.

If json.loads succeeds but returns a non-list/dict or a dict without "chunks", no chunk is appended. Fall back to a single chunk in that case.

Apply:

     def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
-            # Parse the response to get chunks
+            # Parse the response to get chunks
+            before_count = len(self.rag_chunks)
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
-                                score=chunk.get("score")
+                                score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
-                                    score=chunk.get("score")
+                                    score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunkData(
                             content=response_content,
                             source="knowledge_search",
                             score=None
                         )
                     )
+            # If JSON parsed but nothing was added, fall back to a single chunk
+            if response_content.strip() and before_count == len(self.rag_chunks):
+                self.rag_chunks.append(
+                    RAGChunkData(
+                        content=response_content,
+                        source="knowledge_search",
+                        score=None,
+                    )
+                )
         except Exception:
             # Treat response as single chunk
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunkData(
                         content=response_content,
                         source="knowledge_search",
                         score=None
                     )
                 )

3-4: Nit: unify typing style (builtins vs typing aliases).

You mix List[...] and list[...]. Prefer a single style (builtins in 3.9+) for consistency.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d798ec and 927754c.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • pyproject.toml (1 hunks)
  • src/utils/types.py (2 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
src/utils/types.py (3)
  • ToolCallSummary (43-56)
  • TurnSummary (67-141)
  • RAGChunkData (59-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: e2e_tests
🔇 Additional comments (2)
tests/unit/app/endpoints/test_streaming_query.py (1)

48-48: LGTM: importing RAGChunkData for test fixtures is appropriate.

The import aligns tests with the new TurnSummary contract.

src/utils/types.py (1)

59-65: RAGChunkData model looks good.

Clear fields, optionality is sensible. No issues.

@asamal4
Copy link
Contributor

asamal4 commented Sep 17, 2025

cc: @tisnik @lpiwowar

Contributor

@tisnik tisnik left a comment


It looks OK overall, thank you. I put some nits into comments.

@tisnik tisnik changed the title [LCORE-601] Add RAG chunks in query response LCORE-601: Add RAG chunks in query response Sep 17, 2025
Contributor

@are-ces are-ces left a comment


Added a couple of nits, otherwise lgtm!

Signed-off-by: Anxhela Coba <[email protected]>
Signed-off-by: Anxhela Coba <[email protected]>
Contributor

@VladimirKadlec VladimirKadlec left a comment


Nice work 💪

Contributor

@are-ces are-ces left a comment


I have added one comment. Otherwise LGTM, nice work!

Signed-off-by: Anxhela Coba <[email protected]>
Contributor

@VladimirKadlec VladimirKadlec left a comment


lgtm

Signed-off-by: Anxhela Coba <[email protected]>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (7)
src/utils/types.py (2)

91-136: Add lightweight observability and broader schema fallback when parsing chunks.

  • Add debug logs for parse paths/failures.
  • Accept common keys ("text", "page_content") when "content" missing.

Apply:

@@
-    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
+    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunk(
-                                content=chunk.get("content", ""),
+                                content=chunk.get("content")
+                                or chunk.get("text")
+                                or chunk.get("page_content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunk(
-                                    content=chunk.get("content", str(chunk)),
+                                    content=chunk.get("content")
+                                    or chunk.get("text")
+                                    or chunk.get("page_content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunk(
                             content=response_content,
                             source=DEFAULT_RAG_TOOL,
                             score=None,
                         )
                     )
         except (KeyError, AttributeError, TypeError, ValueError):
             # Treat response as single chunk on data access/structure errors
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunk(
                         content=response_content, source=DEFAULT_RAG_TOOL, score=None
                     )
                 )

Additionally (outside this hunk), add module logging:

# at top of file
import logging
logger = logging.getLogger(__name__)

…and emit optional debug logs in each branch if desired (kept out of diff for brevity).


87-90: Tool-name match may miss namespaced tools; make the check robust.

Granite/MCP providers often report namespaced tool names (e.g., "builtin::rag.knowledge_search"). Consider tolerant matching.

-            if tc.tool_name == DEFAULT_RAG_TOOL and resp and response_content:
+            if resp and response_content and (
+                tc.tool_name == DEFAULT_RAG_TOOL
+                or tc.tool_name.endswith(f".{DEFAULT_RAG_TOOL}")
+            ):
                 self._extract_rag_chunks_from_response(response_content)

Found single strict comparison at src/utils/types.py:88. Log tc.tool_name once at runtime to confirm actual values.

src/models/responses.py (2)

37-43: Constrain RAGChunk.score to a sane range.

If scores are normalized, bound them to [0.0, 1.0] to prevent bad data.

-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        None, ge=0.0, le=1.0, description="Relevance score (0..1)"
+    )

55-61: Class name “ToolCall” risks confusion with llama_stack_client.ToolCall.

To avoid import shadowing/mistakes, consider renaming to APIToolCall or ToolCallInfo.

src/utils/transcripts.py (2)

18-18: Use module-qualified logger name.

Prefer __name__ to a hardcoded string for flexibility.

-logger = logging.getLogger("utils.transcripts")
+logger = logging.getLogger(__name__)

42-56: Tighten rag_chunks typing to dict[str, Any] and reflect in signature.

Improves clarity and static checks.

+from typing import Any
@@
-    rag_chunks: list[dict],
+    rag_chunks: list[dict[str, Any]],
@@
-        rag_chunks: The list of serialized `RAGChunk` dictionaries.
+        rag_chunks: The list of serialized `RAGChunk` dictionaries.
run.yaml (1)

63-66: Config looks coherent; make paths/envs overridable and keep IDs consistent.

  • Vector DB path: make it env-overridable for deployments.
  • Embedding model/dimension: consistent with 768; good.
  • Provider IDs (“faiss”, “sentence-transformers”, “rag-runtime”): consistent; good.
-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database

Optional: consider env-driving the embedding model ID to avoid drift between models:embedding and vector_dbs entries.

Also applies to: 105-107, 111-116, 126-130, 132-137, 143-145
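The ${env.NAME:default} form shown above is resolved by llama-stack at startup. As an illustration of the semantics only (the regex below is an approximation, not llama-stack's actual parser):

```python
import os
import re

# Approximates the ${env.NAME:default} substitution used in run.yaml.
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):([^}]*)\}")


def resolve_env(value: str) -> str:
    """Replace each ${env.NAME:default} with os.environ[NAME] or the default."""
    return _ENV_PATTERN.sub(
        lambda m: os.environ.get(m.group(1), m.group(2)), value
    )


path = "${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db}"
print(resolve_env(path))  # default path unless VECTOR_STORE_DB_PATH is set
```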

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 927754c and 99080a8.

📒 Files selected for processing (8)
  • pyproject.toml (1 hunks)
  • run.yaml (3 hunks)
  • src/app/endpoints/query.py (3 hunks)
  • src/constants.py (1 hunks)
  • src/models/responses.py (2 hunks)
  • src/utils/transcripts.py (2 hunks)
  • src/utils/types.py (2 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/app/endpoints/query.py
  • tests/unit/app/endpoints/test_streaming_query.py
  • pyproject.toml
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/utils/transcripts.py
  • src/constants.py
  • src/utils/types.py
  • src/models/responses.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/utils/transcripts.py
  • src/constants.py
  • src/utils/types.py
  • src/models/responses.py
src/constants.py

📄 CodeRabbit inference engine (CLAUDE.md)

Keep shared constants in a central src/constants.py with descriptive comments

Files:

  • src/constants.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/responses.py
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
  • RAGChunk (37-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (1)
src/utils/types.py (1)

65-66: Fix mutable default and missing default for collections (Pydantic v2).

  • rag_chunks uses [] which is shared across instances.
  • tool_calls has no default but is appended to later.

Switch both to Field(default_factory=list) and import Field.

Apply:

-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class TurnSummary(BaseModel):
@@
-    tool_calls: list[ToolCallSummary]
-    rag_chunks: list[RAGChunk] = []
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
+    rag_chunks: list[RAGChunk] = Field(default_factory=list)
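For context on why default_factory is preferred: a bare mutable default on a plain Python class is a single shared object (Pydantic v2 does deep-copy field defaults, so here the switch is as much about explicit intent as correctness). A plain-class illustration:

```python
# Plain-class illustration of the shared-mutable-default pitfall that
# Field(default_factory=list) is designed to avoid.
class PlainSummary:
    rag_chunks: list = []  # one list object, shared by every instance


a = PlainSummary()
b = PlainSummary()
a.rag_chunks.append("chunk from a")
print(b.rag_chunks)  # ['chunk from a'] -- b observes a's mutation
```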

Comment on lines +55 to +56
# default RAG tool value
DEFAULT_RAG_TOOL = "knowledge_search"

🛠️ Refactor suggestion

🧩 Analysis chain

Good centralization; verify no hardcoded tool names remain.

Constant name/readability look good. Please confirm there are no remaining string literals "knowledge_search" outside constants.


🏁 Script executed:

#!/usr/bin/env bash
# Find stray literals (exclude config/examples)
rg -nP -S "(^|[^A-Za-z0-9_])knowledge_search([^A-Za-z0-9_]|$)" -g '!**/run.yaml' -g '!**/README*' -g '!**/docs/**'

Length of output: 148


🏁 Script executed:

#!/usr/bin/env bash
set -euo pipefail
echo "Searching repository for literal 'knowledge_search' using multiple fallbacks..."

# 1) ripgrep (robust flags)
if command -v rg >/dev/null 2>&1; then
  echo ":: rg --hidden --no-ignore -uu -F -S"
  rg --hidden --no-ignore -uu -F -n "knowledge_search" -S || echo "rg: no matches or search completed"
else
  echo "rg not available"
fi

# 2) git grep fallback
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo ":: git grep -n"
  git grep -n "knowledge_search" || echo "git grep: no matches"
else
  echo "Not a git repo (git grep skipped)"
fi

# 3) find + grep fallback (exclude common heavy dirs)
echo ":: find + grep (excluding .git and node_modules)"
find . -type f -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./dist/*" -print0 \
  | xargs -0 grep -nH --line-number -I "knowledge_search" || echo "find+grep: no matches"

Length of output: 6544


Replace remaining 'knowledge_search' string literals with DEFAULT_RAG_TOOL

Search found literal occurrences in source and tests; update usages to import and use src.constants.DEFAULT_RAG_TOOL (use f"builtin::rag/{DEFAULT_RAG_TOOL}" where the namespaced form is required).

Locations:

  • src/models/responses.py:129
  • src/utils/types.py:87 (comment)
  • src/app/endpoints/query.py:642
  • src/app/endpoints/streaming_query.py:444
  • tests/unit/app/endpoints/test_query.py:1144
  • tests/unit/app/endpoints/test_streaming_query.py: multiple locations (71–85, 233, 242, 342, 349–350, 995–1023)

Comment on lines +93 to +94
rag_chunks: list[RAGChunk] = []


⚠️ Potential issue

Fix mutable default list in API model.

rag_chunks uses [] which is shared across responses; switch to default_factory.

-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
rag_chunks: list[RAGChunk] = []
rag_chunks: list[RAGChunk] = Field(
default_factory=list, description="List of RAG chunks used to generate the response"
)
🤖 Prompt for AI Agents
In src/models/responses.py around lines 93-94, the RAGChunk list is initialized
with a mutable default (rag_chunks: list[RAGChunk] = []) which is shared across
instances; replace it with a non-shared default_factory by importing Field from
pydantic (if not already imported) and change the declaration to use
Field(default_factory=list), i.e. rag_chunks: list[RAGChunk] =
Field(default_factory=list).

Signed-off-by: Anxhela Coba <[email protected]>
@Anxhela21 Anxhela21 requested a review from tisnik September 22, 2025 15:17
@asamal4 asamal4 left a comment

LGTM..
I am okay if we add test cases with a follow up PR..
But as soon as possible, because this repo/functionality is changed frequently by several people.

Signed-off-by: Anxhela Coba <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (3)
src/app/endpoints/query.py (2)

274-275: Tune log levels: prefer debug for noisy internal steps.

These are verbose, per-request messages better suited for debug to reduce log noise in production.

Suggested changes:

  • logger.info(...) -> logger.debug(...) for processing/persisting steps.

Also applies to: 283-284, 299-300, 316-317, 324-325


57-57: Use module logger name per guidelines.

Prefer logging.getLogger(__name__) to follow project convention.

-logger = logging.getLogger("app.endpoints.handlers")
+logger = logging.getLogger(__name__)
src/models/responses.py (1)

148-148: Remove trailing whitespace in src/models/responses.py:148

Trailing whitespace on line 148 is flagged by pylint — remove the trailing space to fix formatting.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0dd1f1d and 0d774ce.

📒 Files selected for processing (4)
  • pyproject.toml (2 hunks)
  • src/app/endpoints/query.py (4 hunks)
  • src/constants.py (1 hunks)
  • src/models/responses.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/constants.py
  • pyproject.toml
🧰 Additional context used
📓 Path-based instructions (7)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/models/responses.py
  • src/app/endpoints/query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/models/responses.py
  • src/app/endpoints/query.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (3)
  • ToolCall (44-49)
  • ReferencedDocument (52-64)
  • QueryResponse (66-147)
🪛 GitHub Actions: Black
src/models/responses.py

[error] 1-1: Black formatting check failed: 1 file would be reformatted by 'black --check .'. Run 'black --write .' to fix code style issues.

🪛 GitHub Actions: Pyright
src/models/responses.py

[error] 60-60: pyright: Type 'None' is not assignable to declared type 'AnyUrl'.

src/app/endpoints/query.py

[error] 307-307: pyright: Argument of type 'AnyUrl | None' cannot be assigned to parameter 'doc_url' of type 'AnyUrl' in function 'init'. None is not assignable to 'AnyUrl'.

🪛 GitHub Actions: Python linter
src/models/responses.py

[error] 101-101: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)


[error] 148-148: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)

🪛 GitHub Actions: Type checks
src/models/responses.py

[error] 60-60: mypy: Incompatible types in assignment (expression has type "None", variable has type "AnyUrl") [assignment]. Command: 'uv run mypy --explicit-package-bases --disallow-untyped-calls --disallow-untyped-defs --disallow-incomplete-defs --ignore-missing-imports --disable-error-code attr-defined src/'

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (5)
src/models/responses.py (4)

97-100: Prefer non-optional list for tool_calls for consistency.

Lists are better modeled as empty lists instead of None for API consumers; aligns with prior feedback on non-optional lists.

-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )

123-136: Examples look good.

Example payload for rag_chunks and tool_calls is coherent and helpful for API docs.


95-96: Avoid mutable default for rag_chunks.

Use default_factory to prevent shared mutable defaults across instances. This was already flagged earlier.

-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )

101-101: Remove trailing whitespace in src/models/responses.py (line 101). Black/pylint failing; black is unavailable in the verification environment — run locally: black --check src/models/responses.py || black src/models/responses.py and commit the formatted file.

src/app/endpoints/query.py (1)

253-256: LGTM: single conversion of rag chunks for persistence.

Converting once to dicts for transcripts is clean and avoids repeated serialization.

Signed-off-by: Anxhela Coba <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/models/responses.py (3)

99-102: Return a stable, non-null list for tool_calls.

For a consistent API shape (and per prior reviewer consensus), default to an empty list instead of None.

Apply:

-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )

37-43: Constrain score to [0, 1] for predictable semantics.

If scores are normalized, enforce via validation.

Apply:

-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        default=None, description="Relevance score (0..1)", ge=0.0, le=1.0
+    )

If scores are not normalized, ignore.


45-51: Use a JSON type alias for arguments/result for clarity.

Improves readability across models using JSON blobs.

Add at module top (outside this hunk):

# Type aliases
JSONObj = dict[str, Any]

Then update:

-    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
-    result: Optional[dict[str, Any]] = Field(None, description="Result from the tool")
+    arguments: JSONObj = Field(description="Arguments passed to the tool")
+    result: Optional[JSONObj] = Field(None, description="Result from the tool")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d774ce and 0871751.

📒 Files selected for processing (1)
  • src/models/responses.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/models/responses.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/models/responses.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/responses.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (3)
src/models/responses.py (3)

61-63: Good fix: doc_url is now optional (resolves type-checker errors).

This aligns with earlier feedback and usage where None is passed.

Please confirm pyright/mypy are green on this change in CI.


125-138: Examples look consistent with the schema.

The example RAG chunk and tool call structures align with the new models.


97-98: Fix mutable default for rag_chunks and add field metadata.

Use default_factory to avoid shared list across instances and surface description in OpenAPI.

Apply:

-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list,
+        description="List of RAG chunks used to generate the response",
+    )

Signed-off-by: Anxhela Coba <[email protected]>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/unit/models/responses/test_query_response.py (2)

3-3: Use absolute imports for consistency.

The import uses a relative import path that may not align with the project structure. Based on the relevant code snippets showing src/models/responses.py, this should use an absolute import.

-from models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument
+from src.models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument

92-98: ToolCall model doesn't validate arguments/result — confirm or add validation

src/models/responses.py: ToolCall declares arguments: dict[str, Any] and result: Optional[dict[str, Any]] with no validators, so the test's dicts are accepted as-is. If you expect specific shapes, replace these with TypedDict/Pydantic submodels or add field validators; otherwise confirm flexible typing is intentional.
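If stricter shapes are wanted later, a field validator can be layered on without changing the public schema. An illustrative sketch only (`StrictToolCall` is hypothetical, not the repo's model):

```python
from typing import Any, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class StrictToolCall(BaseModel):
    """Tool call that rejects empty argument names."""

    tool_name: str
    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
    result: Optional[dict[str, Any]] = Field(None, description="Result from the tool")

    @field_validator("arguments")
    @classmethod
    def check_argument_names(cls, value: dict[str, Any]) -> dict[str, Any]:
        """Reject empty-string keys, which indicate malformed tool input."""
        if any(not name for name in value):
            raise ValueError("argument names must be non-empty")
        return value


call = StrictToolCall(tool_name="knowledge_search", arguments={"query": "docs"})
print(call.arguments)  # {'query': 'docs'}
```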

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0871751 and 930cd74.

📒 Files selected for processing (2)
  • tests/unit/models/responses/test_query_response.py (2 hunks)
  • tests/unit/models/responses/test_rag_chunk.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/unit/models/responses/test_rag_chunk.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • tests/unit/models/responses/test_query_response.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/models/responses/test_query_response.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/models/responses/test_query_response.py
🧬 Code graph analysis (1)
tests/unit/models/responses/test_query_response.py (1)
src/models/responses.py (4)
  • QueryResponse (68-149)
  • RAGChunk (37-42)
  • ToolCall (45-50)
  • ReferencedDocument (53-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests
🔇 Additional comments (5)
tests/unit/models/responses/test_query_response.py (5)

24-28: LGTM! Good test coverage for default behavior.

This test correctly validates that rag_chunks is empty by default, ensuring backward compatibility for existing functionality.


29-63: Excellent comprehensive test for RAG chunks functionality.

This test thoroughly validates RAG chunks with all fields populated, including proper access to content, source, and score attributes. The test data is realistic and well-structured.


64-81: Great coverage of optional field scenarios.

This test effectively validates the optional nature of source and score fields in RAG chunks, covering all combinations of present/absent optional fields.


82-129: Comprehensive integration test with all new features.

This test provides excellent end-to-end validation of the complete QueryResponse with RAG chunks, tool calls, and referenced documents. The assertions thoroughly verify all nested attributes and relationships.


100-107: No action required — ReferencedDocument.doc_url already uses pydantic.AnyUrl. The model in src/models/responses.py declares doc_url: Optional[AnyUrl] = Field(...), so URL validation is enforced.

@tisnik tisnik left a comment

LGTM

@tisnik tisnik merged commit 7173f72 into lightspeed-core:main Sep 25, 2025
18 of 19 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Nov 19, 2025
18 tasks