LCORE-601: Add RAG chunks in query response #550
Conversation
Signed-off-by: Anxhela Coba <[email protected]>
Walkthrough

Adds RAG-aware models and plumbing: new RAGChunk/ToolCall/ReferencedDocument types, extraction of RAG chunks from tool outputs into TurnSummary, persisted serialized rag_chunks in transcripts, and surfacing rag_chunks/tool_calls/referenced_documents in QueryResponse; also adds embedding/vector runtime config entries.
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant C as Client
  participant API as /query Endpoint
  participant ORCH as Orchestrator/LLM
  participant TOOL as RAG Tool (knowledge_search)
  participant EMB as Embedder (sentence-transformers)
  participant VDB as Vector DB (faiss)
  C->>API: POST /query
  API->>ORCH: run turn (prompt, tools)
  alt RAG tool invoked
    ORCH->>TOOL: knowledge_search(query)
    TOOL->>EMB: embed(query)
    EMB-->>TOOL: embedding
    TOOL->>VDB: search(embedding)
    VDB-->>TOOL: chunks + scores
    TOOL-->>ORCH: tool response (includes chunks)
  end
  ORCH-->>API: TurnSummary (messages, tool_calls, rag_chunks)
  API->>API: normalize rag_chunks → list[dict]
  API->>API: build tool_calls and referenced_documents
  API->>API: store_transcript(..., rag_chunks=dicts, ...)
  API-->>C: QueryResponse { rag_chunks?, tool_calls?, referenced_documents? }
```
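The "normalize rag_chunks → list[dict]" step in the flow above can be sketched in isolation; `RAGChunk` here is a hypothetical dataclass stand-in for the PR's Pydantic model, not the actual implementation:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class RAGChunk:
    """Stand-in for the PR's RAG chunk model (illustrative only)."""
    content: str
    source: Optional[str] = None
    score: Optional[float] = None

def serialize_chunks(chunks: list[RAGChunk]) -> list[dict]:
    # Turn model instances into plain dicts suitable for transcript storage.
    return [asdict(c) for c in chunks]

serialized = serialize_chunks([RAGChunk("some text", "docs/guide.md", 0.87)])
```

With the actual Pydantic models, `model_dump()` would play the role of `asdict` here.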
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 3
🧹 Nitpick comments (8)
run.yaml (1)
145-145: Add trailing newline to satisfy linters. YAMLlint flags a missing newline at EOF.
src/utils/types.py (2)
91-94: Guard on tool name with a constant/enum (optional). The literal "knowledge_search" string can drift from runtime config. Consider a module-level constant.
95-142: RAG chunk extraction: broaden JSON tolerance and cap payload (optional). Some providers return keys like text/content/snippet, and score as a string. Consider normalizing these and adding a soft cap to avoid oversized responses.
```diff
@@
-        if isinstance(data, dict) and "chunks" in data:
+        if isinstance(data, dict) and "chunks" in data:
             for chunk in data["chunks"]:
-                self.rag_chunks.append(
+                text = chunk.get("content") or chunk.get("text") or ""
+                score = chunk.get("score")
+                try:
+                    score = float(score) if score is not None else None
+                except (TypeError, ValueError):
+                    score = None
+                self.rag_chunks.append(
                     RAGChunkData(
-                        content=chunk.get("content", ""),
+                        content=text,
                         source=chunk.get("source"),
-                        score=chunk.get("score")
+                        score=score
                     )
                 )
@@
-            for chunk in data:
+            for chunk in data:
                 if isinstance(chunk, dict):
-                    self.rag_chunks.append(
+                    text = chunk.get("content") or chunk.get("text") or str(chunk)
+                    score = chunk.get("score")
+                    try:
+                        score = float(score) if score is not None else None
+                    except (TypeError, ValueError):
+                        score = None
+                    self.rag_chunks.append(
                         RAGChunkData(
-                            content=chunk.get("content", str(chunk)),
+                            content=text,
                             source=chunk.get("source"),
-                            score=chunk.get("score")
+                            score=score
                         )
                     )
+        # Optional: bound number of chunks
+        if len(self.rag_chunks) > 50:
+            self.rag_chunks = self.rag_chunks[:50]
```

src/models/responses.py (1)
53-59: API naming vs. internal naming mismatch. The internal ToolCallSummary uses "response" (str) while the API uses "result" (dict). It's fine, but consider aligning the field name to "response" for consistency, or document the mapping decisively.
src/app/endpoints/query.py (4)
254-263: Normalize the tool call result to JSON when possible. Try JSON-decoding tool responses to provide structured results; fall back to the raw string.
```diff
 tool_calls = [
     ToolCall(
         tool_name=tc.name,
         arguments=tc.args if isinstance(tc.args, dict) else {"query": str(tc.args)},
-        result={"response": tc.response} if tc.response else None
+        result=(
+            (json.loads(tc.response) if isinstance(tc.response, str) else tc.response)
+            if tc.response
+            else None
+        )
     )
     for tc in summary.tool_calls
 ]
```
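A standalone sketch of this JSON-or-fallback normalization; the helper name is made up for illustration and is not the PR's API:

```python
import json
from typing import Any, Optional

def normalize_tool_result(response: Optional[str]) -> Optional[dict[str, Any]]:
    """Return a structured dict when the tool response is JSON; wrap raw text otherwise."""
    if not response:
        return None
    try:
        decoded = json.loads(response)
    except (json.JSONDecodeError, TypeError):
        return {"response": response}
    # Non-dict JSON (e.g. a list or number) is wrapped so callers always get a dict.
    return decoded if isinstance(decoded, dict) else {"response": decoded}
```

The wrapping step matters because a schema that declares `result: dict` would otherwise reject valid JSON arrays returned by some tools.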
276-290: O(n^2) chunk_count; pre-aggregate for large results (optional). For many chunks, the per-source sum is quadratic. Pre-count once.
```diff
-    referenced_docs = []
-    doc_sources = set()
-    for chunk in summary.rag_chunks:
-        if chunk.source and chunk.source not in doc_sources:
-            doc_sources.add(chunk.source)
-            referenced_docs.append(
-                ReferencedDocument(
-                    url=chunk.source if chunk.source.startswith("http") else None,
-                    title=chunk.source,
-                    chunk_count=sum(1 for c in summary.rag_chunks if c.source == chunk.source)
-                )
-            )
+    from collections import Counter
+    src_counts = Counter(c.source for c in summary.rag_chunks if c.source)
+    referenced_docs = [
+        ReferencedDocument(
+            url=src if src.startswith("http") else None,
+            title=src,
+            chunk_count=cnt,
+        )
+        for src, cnt in src_counts.items()
+    ]
```
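The linear-time aggregation can be demonstrated in isolation; `Chunk` and `RefDoc` below are simplified stand-ins for the PR's models:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    content: str
    source: Optional[str] = None

@dataclass
class RefDoc:
    url: Optional[str]
    title: str
    chunk_count: int

def build_referenced_docs(chunks: list[Chunk]) -> list[RefDoc]:
    # One O(n) counting pass per source, instead of an O(n^2) per-source sum.
    src_counts = Counter(c.source for c in chunks if c.source)
    return [
        RefDoc(url=src if src.startswith("http") else None, title=src, chunk_count=cnt)
        for src, cnt in src_counts.items()
    ]

docs = build_referenced_docs(
    [Chunk("a", "http://x"), Chunk("b", "http://x"), Chunk("c", "notes.md"), Chunk("d")]
)
```

`Counter` preserves first-seen insertion order on Python 3.7+, so the deduplicated document order matches the original loop's behavior.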
229-244: Persist transcripts with the actual rag_chunks. You already compute summary.rag_chunks; pass them to store_transcript.
```diff
-        rag_chunks=[],  # TODO(lucasagomes): implement rag_chunks
+        rag_chunks=summary.rag_chunks,
```
246-256: Reduce log verbosity to debug. These info logs fire on every request and are noisy. Downgrade them to debug to keep production logs clean.
```diff
-    logger.info("Persisting conversation details...")
+    logger.debug("Persisting conversation details...")
@@
-    logger.info("Processing tool calls...")
+    logger.debug("Processing tool calls...")
@@
-    logger.info("Processing RAG chunks...")
+    logger.debug("Processing RAG chunks...")
@@
-    logger.info("Extracting referenced documents...")
+    logger.debug("Extracting referenced documents...")
@@
-    logger.info("Building final response...")
+    logger.debug("Building final response...")
@@
-    logger.info("Query processing completed successfully!")
+    logger.debug("Query processing completed successfully!")
```

Also applies to: 267-275, 276-290, 291-301
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- src/app/endpoints/query.py (2 hunks)
- src/models/responses.py (3 hunks)
- src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (4)
QueryResponse (61-136), RAGChunk (37-42), ReferencedDocument (45-50), ToolCall (53-58)
🪛 YAMLlint (1.37.1)
run.yaml
[error] 145-145: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: e2e_tests
🔇 Additional comments (6)
src/models/responses.py (2)
37-43: RAGChunk model — LGTM.
91-105: Optional fields on QueryResponse — good choice. Omitting empty lists keeps payloads small and stable for clients.
Also applies to: 112-137
src/app/endpoints/query.py (1)
267-275: RAG chunks mapping — LGTM. Straight-through mapping from TurnSummary is correct.
run.yaml (1)
63-66: Toolgroup wiring validated — no changes required.
run.yaml defines toolgroup_id: builtin::rag → provider_id: rag-runtime (run.yaml lines 143–145) and src/app/endpoints/query.py references "builtin::rag/knowledge_search" (src/app/endpoints/query.py lines 636–638). The inline::sentence-transformers and inline::rag-runtime entries are fine.

pyproject.toml (2)
53-68: Move heavy ML deps out of core runtime into an optional 'rag' dependency group — verify imports first. torch/trl/peft/datasets/faiss-cpu/autoevals/sentence-transformers will bloat prod images and cold starts; gate them behind an extras/dependency group and install them only for RAG-enabled deployments.
Apply this minimal removal from default deps:
```diff
@@
-    "datasets>=3.6.0",
-    "faiss-cpu>=1.11.0",
-    "autoevals>=0.0.129",
-    "torch>=2.7.1",
-    "peft>=0.15.2",
-    "trl>=0.18.2",
-    "sentence-transformers>=5.1.0",
```

Then add a dedicated group (example):
```toml
[dependency-groups]
rag = [
    "sentence-transformers>=5.1.0",
    "faiss-cpu>=1.11.0",
    "datasets>=3.6.0",
    "torch>=2.7.1",
    "peft>=0.15.2",
    "trl>=0.18.2",
    "autoevals>=0.0.129",
]
```
- Align litellm version constraints to a single lower bound across groups to avoid resolver surprises.
- Verification: I could not search the repo (rg returned no files). Re-run the import check locally to confirm these packages are not required by core runtime, e.g.:
```shell
rg -nP -C2 -uu --glob '!**/tests/**' \
  -e '^\s*(from|import)\s+torch\b' \
  -e '^\s*(from|import)\s+trl\b' \
  -e '^\s*(from|import)\s+peft\b' \
  -e '^\s*(from|import)\s+datasets\b' \
  -e '^\s*(from|import)\s+sentence_transformers\b'
```
48-48: Confirm SQLAlchemy floor change and verify compatibility with 2.0.41
- Evidence: pyproject.toml:48 => "sqlalchemy>=2.0.41"; uv.lock currently resolves to sqlalchemy 2.0.43 (uv.lock entries ~3303–3321, upload-time 2025-08-11). Repo search returned no 'alembic' references.
- Action: run CI/local tests with sqlalchemy==2.0.41 and exercise migrations and DB adapters (psycopg2/aiosqlite). If widening the floor was unintentional or causes failures, tighten the minimum to >=2.0.43 or pin the required patch.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/utils/types.py (2)
72-72: Good: mutable default avoided for rag_chunks. Switching to `Field(default_factory=list)` resolves the shared-state risk.
10-10: Fix the Pydantic import to unblock CI. `Field` must be imported from `pydantic`, not `pydantic.main` (see the failing E2E log).

```diff
-from pydantic.main import BaseModel, Field
+from pydantic import BaseModel, Field
```

run.yaml (1)
111-111: Parameterize the vector DB path via an env var. Avoid the hard-coded path; support overrides with a sane default.
```diff
-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database
```
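The `${env.NAME:default}` placeholder convention used in run.yaml can be mimicked in plain Python; this resolver is a sketch of the idea, not llama-stack's actual implementation:

```python
import os
import re

# Matches ${env.NAME:default} — NAME is an identifier, default is anything up to "}".
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):([^}]*)\}")

def resolve_env_refs(value: str) -> str:
    """Replace ${env.NAME:default} placeholders with the env value or the default."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), m.group(2)), value)

path = resolve_env_refs(
    "${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db}"
)
```

Strings without placeholders pass through unchanged, so the resolver can be applied uniformly to every config value.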
🧹 Nitpick comments (1)
src/utils/types.py (1)
95-142: Make RAG parsing more robust (dict without "chunks", list of scalars). The current logic drops data when the JSON is a dict without "chunks" or a list of non-dicts. Add safe fallbacks.
```diff
         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score")
                             )
                         )
-                elif isinstance(data, list):
+                elif isinstance(data, dict):
+                    # Fallback: treat whole dict as a single chunk
+                    self.rag_chunks.append(
+                        RAGChunkData(
+                            content=json.dumps(data, ensure_ascii=False),
+                            source="knowledge_search",
+                            score=None,
+                        )
+                    )
+                elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
-                        if isinstance(chunk, dict):
+                        if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score")
                                 )
                             )
+                        else:
+                            # Fallback: stringify scalar/other types
+                            self.rag_chunks.append(
+                                RAGChunkData(
+                                    content=str(chunk),
+                                    source="knowledge_search",
+                                    score=None,
+                                )
+                            )
```
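Put together, a standalone version of such a tolerant parser might look like this; the function name and plain-dict chunk shape are illustrative, not the PR's API:

```python
import json

def extract_chunks(response_content: str) -> list[dict]:
    """Tolerantly parse a RAG tool response into chunk dicts (sketch)."""
    chunks: list[dict] = []
    try:
        data = json.loads(response_content)
    except json.JSONDecodeError:
        # Not JSON: treat the whole response as one chunk.
        if response_content.strip():
            chunks.append(
                {"content": response_content, "source": "knowledge_search", "score": None}
            )
        return chunks
    if isinstance(data, dict) and "chunks" in data:
        items = data["chunks"]
    elif isinstance(data, list):
        items = data
    else:
        # Unknown JSON shape: fall back to a single stringified chunk.
        return [{"content": json.dumps(data), "source": "knowledge_search", "score": None}]
    for item in items:
        if isinstance(item, dict):
            chunks.append({
                "content": item.get("content") or item.get("text") or str(item),
                "source": item.get("source"),
                "score": item.get("score"),
            })
        else:
            # Scalars in a list become stringified chunks rather than being dropped.
            chunks.append({"content": str(item), "source": "knowledge_search", "score": None})
    return chunks
```

Every input shape yields at least one chunk (or an empty list for blank input), so downstream code never silently loses tool output.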
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- run.yaml (3 hunks)
- src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
ToolCall (53-58)
🪛 GitHub Actions: E2E Tests
src/utils/types.py
[error] 10-10: ImportError: cannot import name 'Field' from 'pydantic.main'. Did you mean: '_fields'?
🔇 Additional comments (5)
src/utils/types.py (1)
80-94: Nice consolidation of response content + targeted extraction. Computing `response_content` once and gating RAG extraction on `knowledge_search` looks solid.

run.yaml (4)
63-66: Embedding provider addition looks good. The inline sentence-transformers provider is correctly declared and referenced later in models.

105-107: RAG runtime provider wiring looks correct. The provider is registered and later exposed via tool_group.

132-137: Model entry aligns with the embedding config. Model/provider IDs and the embedding dimension are consistent.
143-145: Tool group exposes the RAG runtime correctly. `builtin::rag` bound to `rag-runtime` is consistent with the providers.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/utils/types.py (1)
70-73: Default-initialize tool_calls to an empty list. This prevents None/missing-list issues and keeps parity with `rag_chunks`. This was raised previously and is still applicable.

Apply:

```diff
-    tool_calls: list[ToolCallSummary]
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
```
🧹 Nitpick comments (4)
tests/unit/app/endpoints/test_streaming_query.py (1)
346-356: Validate the transcript API shape to avoid duplicated RAG chunk plumbing. You pass RAG chunks inside `summary.rag_chunks` while also passing a separate `rag_chunks=[]` arg to `store_transcript`. If the latter is unused or redundant, consider removing it from the API or populating it consistently to avoid confusion.

If keeping both, add an assertion that the service ultimately persists `summary.rag_chunks` so we don't silently drop them.
80-94: Guard against unexpected content shapes from tool responses. `interleaved_content_as_str(resp.content)` can raise if `resp.content` is absent or malformed; a small try/except avoids derailing the whole summary build.

Apply:

```diff
-        response_content = interleaved_content_as_str(resp.content) if resp else None
+        try:
+            response_content = interleaved_content_as_str(resp.content) if resp else None
+        except Exception:
+            response_content = None
```
95-142: Add a fallback when JSON parses but yields an unknown shape. If `json.loads` succeeds but returns a non-list/dict, or a dict without "chunks", no chunk is appended. Fall back to a single chunk in that case.

Apply:
```diff
     def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
-            # Parse the response to get chunks
+            # Parse the response to get chunks
+            before_count = len(self.rag_chunks)
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
-                                score=chunk.get("score")
+                                score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
-                                    score=chunk.get("score")
+                                    score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunkData(
                             content=response_content, source="knowledge_search", score=None
                         )
                     )
+            # If JSON parsed but nothing was added, fall back to a single chunk
+            if response_content.strip() and before_count == len(self.rag_chunks):
+                self.rag_chunks.append(
+                    RAGChunkData(
+                        content=response_content,
+                        source="knowledge_search",
+                        score=None,
+                    )
+                )
         except Exception:
             # Treat response as single chunk
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunkData(
                         content=response_content, source="knowledge_search", score=None
                     )
                 )
```
3-4: Nit: unify typing style (builtins vs typing aliases). You mix `List[...]` and `list[...]`. Prefer a single style (builtins on 3.9+) for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- pyproject.toml (1 hunks)
- src/utils/types.py (2 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
src/utils/types.py (3)
ToolCallSummary (43-56), TurnSummary (67-141), RAGChunkData (59-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: e2e_tests
🔇 Additional comments (2)
tests/unit/app/endpoints/test_streaming_query.py (1)
48-48: LGTM: importing RAGChunkData for test fixtures is appropriate. The import aligns the tests with the new TurnSummary contract.
src/utils/types.py (1)
59-65: RAGChunkData model looks good. Clear fields; the optionality is sensible. No issues.
tisnik
left a comment
It looks OK overall, thank you. I put some nits into comments.
are-ces
left a comment
Added a couple of nits, otherwise lgtm!
Signed-off-by: Anxhela Coba <[email protected]>
Signed-off-by: Anxhela Coba <[email protected]>
VladimirKadlec
left a comment
Nice work 💪
are-ces
left a comment
I have added one comment. Otherwise LGTM nice work!
Signed-off-by: Anxhela Coba <[email protected]>
VladimirKadlec
left a comment
lgtm
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 2
🧹 Nitpick comments (7)
src/utils/types.py (2)
91-136: Add lightweight observability and broader schema fallback when parsing chunks.
- Add debug logs for parse paths/failures.
- Accept common keys ("text", "page_content") when "content" missing.
Apply:
```diff
@@
-    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
+    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunk(
-                                content=chunk.get("content", ""),
+                                content=chunk.get("content")
+                                or chunk.get("text")
+                                or chunk.get("page_content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunk(
-                                    content=chunk.get("content", str(chunk)),
+                                    content=chunk.get("content")
+                                    or chunk.get("text")
+                                    or chunk.get("page_content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunk(
                             content=response_content,
                             source=DEFAULT_RAG_TOOL,
                             score=None,
                         )
                     )
         except (KeyError, AttributeError, TypeError, ValueError):
             # Treat response as single chunk on data access/structure errors
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunk(
                         content=response_content, source=DEFAULT_RAG_TOOL, score=None
                     )
                 )
```
Additionally (outside this hunk), add module logging:

```python
# at top of file
import logging
logger = logging.getLogger(__name__)
```

…and emit optional debug logs in each branch if desired (kept out of the diff for brevity).
87-90: Tool-name match may miss namespaced tools; make the check robust. Granite/MCP providers often report namespaced tool names (e.g., "builtin::rag.knowledge_search"). Consider tolerant matching.
```diff
-        if tc.tool_name == DEFAULT_RAG_TOOL and resp and response_content:
+        if resp and response_content and (
+            tc.tool_name == DEFAULT_RAG_TOOL
+            or tc.tool_name.endswith(f".{DEFAULT_RAG_TOOL}")
+        ):
             self._extract_rag_chunks_from_response(response_content)
```

Found a single strict comparison at src/utils/types.py:88. Log tc.tool_name once at runtime to confirm the actual values.
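A tolerant matcher along those lines can be expressed as a small helper; the "/" separator variant is an assumption based on the "builtin::rag/knowledge_search" form seen elsewhere in this PR:

```python
DEFAULT_RAG_TOOL = "knowledge_search"

def is_rag_tool(tool_name: str) -> bool:
    """Match both the bare tool name and namespaced forms reported by some providers."""
    # str.endswith accepts a tuple, so both separators are checked in one call.
    return tool_name == DEFAULT_RAG_TOOL or tool_name.endswith(
        (f".{DEFAULT_RAG_TOOL}", f"/{DEFAULT_RAG_TOOL}")
    )
```

Requiring the separator before the suffix avoids false positives such as `my_knowledge_search`.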
src/models/responses.py (2)
37-43: Constrain RAGChunk.score to a sane range. If scores are normalized, bound them to [0.0, 1.0] to prevent bad data.
```diff
-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        None, ge=0.0, le=1.0, description="Relevance score (0..1)"
+    )
```
55-61: The class name "ToolCall" risks confusion with llama_stack_client.ToolCall. To avoid import shadowing/mistakes, consider renaming it to APIToolCall or ToolCallInfo.
src/utils/transcripts.py (2)
18-18: Use a module-qualified logger name. Prefer `__name__` to a hardcoded string for flexibility.

```diff
-logger = logging.getLogger("utils.transcripts")
+logger = logging.getLogger(__name__)
```
42-56: Tighten rag_chunks typing to dict[str, Any] and reflect it in the signature. This improves clarity and static checks.

```diff
+from typing import Any
@@
-    rag_chunks: list[dict],
+    rag_chunks: list[dict[str, Any]],
@@
-        rag_chunks: The list of serialized `RAGChunk` dictionaries.
+        rag_chunks: The list of serialized `RAGChunk` dictionaries.
```
63-66: Config looks coherent; make paths/envs overridable and keep IDs consistent.
- Vector DB path: make it env-overridable for deployments.
- Embedding model/dimension: consistent with 768; good.
- Provider IDs (“faiss”, “sentence-transformers”, “rag-runtime”): consistent; good.
```diff
-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database
```

Optional: consider env-driving the embedding model ID to avoid drift between the models:embedding and vector_dbs entries.
Also applies to: 105-107, 111-116, 126-130, 132-137, 143-145
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- src/app/endpoints/query.py (3 hunks)
- src/constants.py (1 hunks)
- src/models/responses.py (2 hunks)
- src/utils/transcripts.py (2 hunks)
- src/utils/types.py (2 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/app/endpoints/query.py
- tests/unit/app/endpoints/test_streaming_query.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/utils/transcripts.py, src/constants.py, src/utils/types.py, src/models/responses.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use `logger = logging.getLogger(__name__)` for module logging after `import logging`
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/utils/transcripts.py, src/constants.py, src/utils/types.py, src/models/responses.py
src/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared constants in a central src/constants.py with descriptive comments
Files:
src/constants.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
RAGChunk (37-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (1)
src/utils/types.py (1)
65-66: Fix mutable default and missing default for collections (Pydantic v2).
- rag_chunks uses [] which is shared across instances.
- tool_calls has no default but is appended to later.
Switch both to Field(default_factory=list) and import Field.
Apply:
```diff
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class TurnSummary(BaseModel):
@@
-    tool_calls: list[ToolCallSummary]
-    rag_chunks: list[RAGChunk] = []
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
+    rag_chunks: list[RAGChunk] = Field(default_factory=list)
```
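The underlying Python pitfall that `default_factory` avoids can be demonstrated with a plain dataclass analogue — a sketch, not the actual Pydantic model (Pydantic has its own default-copying behavior, but `Field(default_factory=list)` makes the intent explicit):

```python
from dataclasses import dataclass, field

@dataclass
class TurnSummaryLike:
    """Dataclass analogue of TurnSummary, for illustration only."""
    llm_response: str = ""
    # default_factory gives every instance its own fresh list, so appending to
    # one instance never leaks state into another.
    tool_calls: list = field(default_factory=list)
    rag_chunks: list = field(default_factory=list)

a = TurnSummaryLike()
b = TurnSummaryLike()
a.rag_chunks.append("chunk-1")
```

Note that dataclasses reject a bare `= []` default outright (raising ValueError at class creation), which is exactly the shared-mutable-default hazard this review comment flags.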
```python
# default RAG tool value
DEFAULT_RAG_TOOL = "knowledge_search"
```
🛠️ Refactor suggestion
🧩 Analysis chain
Good centralization; verify no hardcoded tool names remain.
Constant name/readability look good. Please confirm there are no remaining string literals "knowledge_search" outside constants.
🏁 Script executed:
```shell
#!/usr/bin/env bash
# Find stray literals (exclude config/examples)
rg -nP -S "(^|[^A-Za-z0-9_])knowledge_search([^A-Za-z0-9_]|$)" -g '!**/run.yaml' -g '!**/README*' -g '!**/docs/**'
```

Length of output: 148
🏁 Script executed:
```shell
#!/usr/bin/env bash
set -euo pipefail
echo "Searching repository for literal 'knowledge_search' using multiple fallbacks..."
# 1) ripgrep (robust flags)
if command -v rg >/dev/null 2>&1; then
  echo ":: rg --hidden --no-ignore -uu -F -S"
  rg --hidden --no-ignore -uu -F -n "knowledge_search" -S || echo "rg: no matches or search completed"
else
  echo "rg not available"
fi
# 2) git grep fallback
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo ":: git grep -n"
  git grep -n "knowledge_search" || echo "git grep: no matches"
else
  echo "Not a git repo (git grep skipped)"
fi
# 3) find + grep fallback (exclude common heavy dirs)
echo ":: find + grep (excluding .git and node_modules)"
find . -type f -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./dist/*" -print0 \
  | xargs -0 grep -nH --line-number -I "knowledge_search" || echo "find+grep: no matches"
```

Length of output: 6544
Replace remaining 'knowledge_search' string literals with DEFAULT_RAG_TOOL
Search found literal occurrences in source and tests; update usages to import and use src.constants.DEFAULT_RAG_TOOL (use f"builtin::rag/{DEFAULT_RAG_TOOL}" where the namespaced form is required).
Locations:
- src/models/responses.py:129
- src/utils/types.py:87 (comment)
- src/app/endpoints/query.py:642
- src/app/endpoints/streaming_query.py:444
- tests/unit/app/endpoints/test_query.py:1144
- tests/unit/app/endpoints/test_streaming_query.py: multiple locations (71–85, 233, 242, 342, 349–350, 995–1023)
```python
rag_chunks: list[RAGChunk] = []
```
Fix mutable default list in API model.
rag_chunks uses [] which is shared across responses; switch to default_factory.
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
In src/models/responses.py around lines 93-94, the RAGChunk list is initialized
with a mutable default (rag_chunks: list[RAGChunk] = []) which is shared across
instances; replace it with a non-shared default_factory by importing Field from
pydantic (if not already imported) and change the declaration to use
Field(default_factory=list), i.e. rag_chunks: list[RAGChunk] =
Field(default_factory=list).
Signed-off-by: Anxhela Coba <[email protected]>
LGTM.
I am okay if we add test cases with a follow-up PR,
but as soon as possible, because this repo/functionality is changed frequently by several people.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 5
🧹 Nitpick comments (3)
src/app/endpoints/query.py (2)
274-275: Tune log levels: prefer debug for noisy internal steps. These are verbose, per-request messages better suited to debug, to reduce log noise in production.
Suggested changes:
- logger.info(...) -> logger.debug(...) for processing/persisting steps.
Also applies to: 283-284, 299-300, 316-317, 324-325
57-57: Use the module logger name per guidelines. Prefer `logging.getLogger(__name__)` to follow the project convention.

```diff
-logger = logging.getLogger("app.endpoints.handlers")
+logger = logging.getLogger(__name__)
```

src/models/responses.py (1)
148-148: Remove trailing whitespace in src/models/responses.py:148. The trailing space on line 148 is flagged by pylint; remove it to fix formatting.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- pyproject.toml (2 hunks)
- src/app/endpoints/query.py (4 hunks)
- src/constants.py (1 hunks)
- src/models/responses.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/constants.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (7)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/responses.py
src/app/endpoints/query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/responses.py
src/app/endpoints/query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/query.py
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (3)
ToolCall (44-49), ReferencedDocument (52-64), QueryResponse (66-147)
🪛 GitHub Actions: Black
src/models/responses.py
[error] 1-1: Black formatting check failed: 1 file would be reformatted by 'black --check .'. Run 'black --write .' to fix code style issues.
🪛 GitHub Actions: Pyright
src/models/responses.py
[error] 60-60: pyright: Type 'None' is not assignable to declared type 'AnyUrl'.
src/app/endpoints/query.py
[error] 307-307: pyright: Argument of type 'AnyUrl | None' cannot be assigned to parameter 'doc_url' of type 'AnyUrl' in function '__init__'. None is not assignable to 'AnyUrl'.
🪛 GitHub Actions: Python linter
src/models/responses.py
[error] 101-101: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)
[error] 148-148: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)
🪛 GitHub Actions: Type checks
src/models/responses.py
[error] 60-60: mypy: Incompatible types in assignment (expression has type "None", variable has type "AnyUrl") [assignment]. Command: 'uv run mypy --explicit-package-bases --disallow-untyped-calls --disallow-untyped-defs --disallow-incomplete-defs --ignore-missing-imports --disable-error-code attr-defined src/'
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (5)
src/models/responses.py (4)
97-100: Prefer non-optional list for tool_calls for consistency. Lists are better modeled as empty lists instead of None for API consumers; aligns with prior feedback on non-optional lists.
```diff
-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )
```
123-136: Examples look good. Example payload for rag_chunks and tool_calls is coherent and helpful for API docs.
95-96: Avoid mutable default for rag_chunks. Use default_factory to prevent shared mutable defaults across instances. This was already flagged earlier.
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )
```
101-101: Remove trailing whitespace in src/models/responses.py (line 101). Black/pylint are failing; black is unavailable in the verification environment, so run locally: black --check src/models/responses.py || black src/models/responses.py and commit the formatted file.
253-256: LGTM: single conversion of rag chunks for persistence. Converting once to dicts for transcripts is clean and avoids repeated serialization.
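The "convert once, reuse" pattern praised above can be sketched with a minimal Pydantic model (RAGChunkSketch is a hypothetical stand-in for the PR's RAGChunk; the field names mirror the review but are assumptions):

```python
from typing import Optional

from pydantic import BaseModel


class RAGChunkSketch(BaseModel):
    """Hypothetical stand-in for the PR's RAGChunk model."""

    content: str
    source: Optional[str] = None
    score: Optional[float] = None


chunks = [RAGChunkSketch(content="a", score=0.9), RAGChunkSketch(content="b")]

# Serialize once; reuse the same dicts for the transcript store
# and for the response payload instead of dumping twice.
chunk_dicts = [chunk.model_dump() for chunk in chunks]
```

Each dict carries every declared field (unset optionals appear as None), so the persisted transcript and the API response stay structurally identical.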
Signed-off-by: Anxhela Coba <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/models/responses.py (3)
99-102: Return a stable, non-null list for tool_calls. For a consistent API shape (and per prior reviewer consensus), default to an empty list instead of None.
Apply:
```diff
-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )
```
37-43: Constrain score to [0, 1] for predictable semantics. If scores are normalized, enforce via validation.
Apply:
```diff
-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        default=None, description="Relevance score (0..1)", ge=0.0, le=1.0
+    )
```

If scores are not normalized, ignore.
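The suggested ge/le bounds behave as follows in Pydantic v2 (RAGChunkSketch is an illustrative model, not the PR's actual class):

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class RAGChunkSketch(BaseModel):
    """Hypothetical chunk model with a bounded relevance score."""

    content: str
    source: Optional[str] = None
    # ge/le constraints apply whenever score is not None.
    score: Optional[float] = Field(default=None, ge=0.0, le=1.0)


ok = RAGChunkSketch(content="passage", score=0.5)

# An out-of-range score is rejected at construction time.
try:
    RAGChunkSketch(content="passage", score=1.5)
    rejected = False
except ValidationError:
    rejected = True
```

None still passes, so callers that omit the score are unaffected; only out-of-range values fail validation.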
45-51: Use a JSON type alias for arguments/result for clarity. Improves readability across models using JSON blobs.
Add at module top (outside this hunk):
```python
# Type aliases
JSONObj = dict[str, Any]
```

Then update:
```diff
-    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
-    result: Optional[dict[str, Any]] = Field(None, description="Result from the tool")
+    arguments: JSONObj = Field(description="Arguments passed to the tool")
+    result: Optional[JSONObj] = Field(None, description="Result from the tool")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/models/responses.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/responses.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/responses.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (3)
src/models/responses.py (3)
61-63: Good fix: doc_url is now optional (resolves type-checker errors). This aligns with earlier feedback and usage where None is passed.
Please confirm pyright/mypy are green on this change in CI.
125-138: Examples look consistent with the schema. The example RAG chunk and tool call structures align with the new models.
97-98: Fix mutable default for rag_chunks and add field metadata. Use default_factory to avoid a shared list across instances and surface the description in OpenAPI.
Apply:
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list,
+        description="List of RAG chunks used to generate the response",
+    )
```
Signed-off-by: Anxhela Coba <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/unit/models/responses/test_query_response.py (2)
3-3: Use absolute imports for consistency. The import uses a relative import path that may not align with the project structure. Based on the relevant code snippets showing src/models/responses.py, this should use an absolute import.

```diff
-from models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument
+from src.models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument
```
92-98: ToolCall model doesn't validate arguments/result; confirm or add validation. src/models/responses.py: ToolCall declares arguments: dict[str, Any] and result: Optional[dict[str, Any]] with no validators, so the test's dicts are accepted as-is. If you expect specific shapes, replace these with TypedDict/Pydantic submodels or add field validators; otherwise confirm flexible typing is intentional.
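If stricter shapes are wanted, a field validator is one option the comment mentions. A minimal sketch (ToolCallSketch and its identifier rule are illustrative assumptions, not the PR's actual model):

```python
from typing import Any, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class ToolCallSketch(BaseModel):
    """Hypothetical stricter variant of the flexible ToolCall model."""

    name: str
    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
    result: Optional[dict[str, Any]] = None

    @field_validator("arguments")
    @classmethod
    def argument_names_are_identifiers(cls, v: dict[str, Any]) -> dict[str, Any]:
        # Example constraint: argument names must be valid Python identifiers.
        bad = [key for key in v if not key.isidentifier()]
        if bad:
            raise ValueError(f"non-identifier argument names: {bad}")
        return v


call = ToolCallSketch(name="knowledge_search", arguments={"query": "operator docs"})

try:
    ToolCallSketch(name="knowledge_search", arguments={"bad key": 1})
    rejected = False
except ValidationError:
    rejected = True
```

A ValueError raised inside a field validator surfaces as a ValidationError at construction, so malformed tool calls fail fast instead of propagating downstream.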
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tests/unit/models/responses/test_query_response.py (2 hunks)
- tests/unit/models/responses/test_rag_chunk.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/unit/models/responses/test_rag_chunk.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
tests/unit/models/responses/test_query_response.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/models/responses/test_query_response.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/models/responses/test_query_response.py
🧬 Code graph analysis (1)
tests/unit/models/responses/test_query_response.py (1)
src/models/responses.py (4)
QueryResponse (68-149), RAGChunk (37-42), ToolCall (45-50), ReferencedDocument (53-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (5)
tests/unit/models/responses/test_query_response.py (5)
24-28: LGTM! Good test coverage for default behavior. This test correctly validates that rag_chunks is empty by default, ensuring backward compatibility for existing functionality.
29-63: Excellent comprehensive test for RAG chunks functionality. This test thoroughly validates RAG chunks with all fields populated, including proper access to content, source, and score attributes. The test data is realistic and well-structured.
64-81: Great coverage of optional field scenarios. This test effectively validates the optional nature of the source and score fields in RAG chunks, covering all combinations of present/absent optional fields.
82-129: Comprehensive integration test with all new features. This test provides excellent end-to-end validation of the complete QueryResponse with RAG chunks, tool calls, and referenced documents. The assertions thoroughly verify all nested attributes and relationships.
100-107: No action required: ReferencedDocument.doc_url already uses pydantic.AnyUrl. The model in src/models/responses.py declares doc_url: Optional[AnyUrl] = Field(...), so URL validation is enforced.
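The validation the comment refers to can be checked with a small sketch (ReferencedDocumentSketch and the URLs are illustrative; only the AnyUrl behavior is the point):

```python
from typing import Optional

from pydantic import AnyUrl, BaseModel, ValidationError


class ReferencedDocumentSketch(BaseModel):
    """Hypothetical stand-in for the PR's ReferencedDocument model."""

    doc_title: str
    # Optional, but any non-None value must parse as a URL.
    doc_url: Optional[AnyUrl] = None


doc = ReferencedDocumentSketch(
    doc_title="Guide", doc_url="https://example.com/guide"
)

# A non-URL string is rejected; None is still accepted.
try:
    ReferencedDocumentSketch(doc_title="Bad", doc_url="not a url")
    rejected = False
except ValidationError:
    rejected = True
```

Because the field is Optional[AnyUrl], callers may omit the URL entirely, which matches the query endpoint passing None for documents without links.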
tisnik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
Add RAG chunks in the query response
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Configuration
Chores
Tests