LCORE-601: Add RAG chunks in query response #550
Conversation
Signed-off-by: Anxhela Coba <[email protected]>
Walkthrough

Adds RAG-aware models and plumbing: new RAGChunk/ToolCall/ReferencedDocument types, extraction of RAG chunks from tool outputs into TurnSummary, persisted serialized rag_chunks in transcripts, and surfacing rag_chunks/tool_calls/referenced_documents in QueryResponse; also adds embedding/vector runtime config entries.
Sequence Diagram(s)

```mermaid
sequenceDiagram
  autonumber
  participant C as Client
  participant API as /query Endpoint
  participant ORCH as Orchestrator/LLM
  participant TOOL as RAG Tool (knowledge_search)
  participant EMB as Embedder (sentence-transformers)
  participant VDB as Vector DB (faiss)
  C->>API: POST /query
  API->>ORCH: run turn (prompt, tools)
  alt RAG tool invoked
    ORCH->>TOOL: knowledge_search(query)
    TOOL->>EMB: embed(query)
    EMB-->>TOOL: embedding
    TOOL->>VDB: search(embedding)
    VDB-->>TOOL: chunks + scores
    TOOL-->>ORCH: tool response (includes chunks)
  end
  ORCH-->>API: TurnSummary (messages, tool_calls, rag_chunks)
  API->>API: normalize rag_chunks → list[dict]
  API->>API: build tool_calls and referenced_documents
  API->>API: store_transcript(..., rag_chunks=dicts, ...)
  API-->>C: QueryResponse { rag_chunks?, tool_calls?, referenced_documents? }
```
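The "normalize rag_chunks → list[dict]" step in the flow above can be sketched in isolation; `RAGChunk` here is a hypothetical dataclass stand-in for the PR's Pydantic model, not the actual implementation:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class RAGChunk:
    """Stand-in for the PR's RAG chunk model (illustrative only)."""
    content: str
    source: Optional[str] = None
    score: Optional[float] = None

def serialize_chunks(chunks: list[RAGChunk]) -> list[dict]:
    # Turn model instances into plain dicts suitable for transcript storage.
    return [asdict(c) for c in chunks]

serialized = serialize_chunks([RAGChunk("some text", "docs/guide.md", 0.87)])
```

With the actual Pydantic models, `model_dump()` would play the role of `asdict` here.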
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 3
🧹 Nitpick comments (8)
run.yaml (1)
145-145: Add trailing newline to satisfy linters. YAMLlint flags a missing newline at EOF.
src/utils/types.py (2)
91-94: Guard on tool name with a constant/enum (optional). The literal "knowledge_search" string can drift from runtime config. Consider a module-level constant.
95-142: RAG chunk extraction: broaden JSON tolerance and cap payload (optional). Some providers return keys like text/content/snippet, and score as a string. Consider normalizing these and adding a soft cap to avoid oversized responses.
```diff
@@
-        if isinstance(data, dict) and "chunks" in data:
+        if isinstance(data, dict) and "chunks" in data:
             for chunk in data["chunks"]:
-                self.rag_chunks.append(
+                text = chunk.get("content") or chunk.get("text") or ""
+                score = chunk.get("score")
+                try:
+                    score = float(score) if score is not None else None
+                except (TypeError, ValueError):
+                    score = None
+                self.rag_chunks.append(
                     RAGChunkData(
-                        content=chunk.get("content", ""),
+                        content=text,
                         source=chunk.get("source"),
-                        score=chunk.get("score")
+                        score=score
                     )
                 )
@@
-            for chunk in data:
+            for chunk in data:
                 if isinstance(chunk, dict):
-                    self.rag_chunks.append(
+                    text = chunk.get("content") or chunk.get("text") or str(chunk)
+                    score = chunk.get("score")
+                    try:
+                        score = float(score) if score is not None else None
+                    except (TypeError, ValueError):
+                        score = None
+                    self.rag_chunks.append(
                         RAGChunkData(
-                            content=chunk.get("content", str(chunk)),
+                            content=text,
                             source=chunk.get("source"),
-                            score=chunk.get("score")
+                            score=score
                         )
                     )
+        # Optional: bound number of chunks
+        if len(self.rag_chunks) > 50:
+            self.rag_chunks = self.rag_chunks[:50]
```

src/models/responses.py (1)
53-59: API naming vs. internal naming mismatch. The internal ToolCallSummary uses "response" (str) while the API uses "result" (dict). It's fine, but consider aligning the field name to "response" for consistency, or document the mapping decisively.
src/app/endpoints/query.py (4)
254-263: Normalize the tool call result to JSON when possible. Try JSON-decoding tool responses to provide structured results; fall back to the raw string.
```diff
 tool_calls = [
     ToolCall(
         tool_name=tc.name,
         arguments=tc.args if isinstance(tc.args, dict) else {"query": str(tc.args)},
-        result={"response": tc.response} if tc.response else None
+        result=(
+            (json.loads(tc.response) if isinstance(tc.response, str) else tc.response)
+            if tc.response
+            else None
+        )
     )
     for tc in summary.tool_calls
 ]
```
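A standalone sketch of this JSON-or-fallback normalization; the helper name is made up for illustration and is not the PR's API:

```python
import json
from typing import Any, Optional

def normalize_tool_result(response: Optional[str]) -> Optional[dict[str, Any]]:
    """Return a structured dict when the tool response is JSON; wrap raw text otherwise."""
    if not response:
        return None
    try:
        decoded = json.loads(response)
    except (json.JSONDecodeError, TypeError):
        return {"response": response}
    # Non-dict JSON (e.g. a list or number) is wrapped so callers always get a dict.
    return decoded if isinstance(decoded, dict) else {"response": decoded}
```

The wrapping step matters because a schema that declares `result: dict` would otherwise reject valid JSON arrays returned by some tools.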
276-290: O(n^2) chunk_count; pre-aggregate for large results (optional). For many chunks, the per-source sum is quadratic. Pre-count once.
```diff
-    referenced_docs = []
-    doc_sources = set()
-    for chunk in summary.rag_chunks:
-        if chunk.source and chunk.source not in doc_sources:
-            doc_sources.add(chunk.source)
-            referenced_docs.append(
-                ReferencedDocument(
-                    url=chunk.source if chunk.source.startswith("http") else None,
-                    title=chunk.source,
-                    chunk_count=sum(1 for c in summary.rag_chunks if c.source == chunk.source)
-                )
-            )
+    from collections import Counter
+    src_counts = Counter(c.source for c in summary.rag_chunks if c.source)
+    referenced_docs = [
+        ReferencedDocument(
+            url=src if src.startswith("http") else None,
+            title=src,
+            chunk_count=cnt,
+        )
+        for src, cnt in src_counts.items()
+    ]
```
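The linear-time aggregation can be demonstrated in isolation; `Chunk` and `RefDoc` below are simplified stand-ins for the PR's models:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:
    content: str
    source: Optional[str] = None

@dataclass
class RefDoc:
    url: Optional[str]
    title: str
    chunk_count: int

def build_referenced_docs(chunks: list[Chunk]) -> list[RefDoc]:
    # One O(n) counting pass per source, instead of an O(n^2) per-source sum.
    src_counts = Counter(c.source for c in chunks if c.source)
    return [
        RefDoc(url=src if src.startswith("http") else None, title=src, chunk_count=cnt)
        for src, cnt in src_counts.items()
    ]

docs = build_referenced_docs(
    [Chunk("a", "http://x"), Chunk("b", "http://x"), Chunk("c", "notes.md"), Chunk("d")]
)
```

`Counter` preserves first-seen insertion order on Python 3.7+, so the deduplicated document order matches the original loop's behavior.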
229-244: Persist transcripts with the actual rag_chunks. You already compute summary.rag_chunks; pass them to store_transcript.
```diff
-        rag_chunks=[],  # TODO(lucasagomes): implement rag_chunks
+        rag_chunks=summary.rag_chunks,
```
246-256: Reduce log verbosity to debug. These info logs fire on every request and are noisy. Downgrade them to debug to keep production logs clean.
```diff
-    logger.info("Persisting conversation details...")
+    logger.debug("Persisting conversation details...")
@@
-    logger.info("Processing tool calls...")
+    logger.debug("Processing tool calls...")
@@
-    logger.info("Processing RAG chunks...")
+    logger.debug("Processing RAG chunks...")
@@
-    logger.info("Extracting referenced documents...")
+    logger.debug("Extracting referenced documents...")
@@
-    logger.info("Building final response...")
+    logger.debug("Building final response...")
@@
-    logger.info("Query processing completed successfully!")
+    logger.debug("Query processing completed successfully!")
```

Also applies to: 267-275, 276-290, 291-301
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- src/app/endpoints/query.py (2 hunks)
- src/models/responses.py (3 hunks)
- src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (4)
QueryResponse (61-136), RAGChunk (37-42), ReferencedDocument (45-50), ToolCall (53-58)
🪛 YAMLlint (1.37.1)
run.yaml
[error] 145-145: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: e2e_tests
🔇 Additional comments (6)
src/models/responses.py (2)
37-43: RAGChunk model — LGTM.
91-105: Optional fields on QueryResponse — good choice. Omitting empty lists keeps payloads small and stable for clients.
Also applies to: 112-137
src/app/endpoints/query.py (1)
267-275: RAG chunks mapping — LGTM. Straight-through mapping from TurnSummary is correct.
run.yaml (1)
63-66: Toolgroup wiring validated — no changes required.
run.yaml defines toolgroup_id: builtin::rag → provider_id: rag-runtime (run.yaml lines 143–145) and src/app/endpoints/query.py references "builtin::rag/knowledge_search" (src/app/endpoints/query.py lines 636–638). The inline::sentence-transformers and inline::rag-runtime entries are fine.

pyproject.toml (2)
53-68: Move heavy ML deps out of core runtime into an optional 'rag' dependency group — verify imports first. torch/trl/peft/datasets/faiss-cpu/autoevals/sentence-transformers will bloat prod images and cold starts; gate them behind an extras/dependency group and install them only for RAG-enabled deployments.
Apply this minimal removal from default deps:
```diff
@@
-    "datasets>=3.6.0",
-    "faiss-cpu>=1.11.0",
-    "autoevals>=0.0.129",
-    "torch>=2.7.1",
-    "peft>=0.15.2",
-    "trl>=0.18.2",
-    "sentence-transformers>=5.1.0",
```

Then add a dedicated group (example):
```toml
[dependency-groups]
rag = [
    "sentence-transformers>=5.1.0",
    "faiss-cpu>=1.11.0",
    "datasets>=3.6.0",
    "torch>=2.7.1",
    "peft>=0.15.2",
    "trl>=0.18.2",
    "autoevals>=0.0.129",
]
```
- Align litellm version constraints to a single lower bound across groups to avoid resolver surprises.
- Verification: I could not search the repo (rg returned no files). Re-run the import check locally to confirm these packages are not required by core runtime, e.g.:
```shell
rg -nP -C2 -uu --glob '!**/tests/**' \
  -e '^\s*(from|import)\s+torch\b' \
  -e '^\s*(from|import)\s+trl\b' \
  -e '^\s*(from|import)\s+peft\b' \
  -e '^\s*(from|import)\s+datasets\b' \
  -e '^\s*(from|import)\s+sentence_transformers\b'
```
48-48: Confirm SQLAlchemy floor change and verify compatibility with 2.0.41
- Evidence: pyproject.toml:48 => "sqlalchemy>=2.0.41"; uv.lock currently resolves to sqlalchemy 2.0.43 (uv.lock entries ~3303–3321, upload-time 2025-08-11). Repo search returned no 'alembic' references.
- Action: run CI/local tests with sqlalchemy==2.0.41 and exercise migrations and DB adapters (psycopg2/aiosqlite). If widening the floor was unintentional or causes failures, tighten the minimum to >=2.0.43 or pin the required patch.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/utils/types.py (2)
72-72: Good: mutable default avoided for rag_chunks. Switching to `Field(default_factory=list)` resolves the shared-state risk.
10-10: Fix the Pydantic import to unblock CI. `Field` must be imported from `pydantic`, not `pydantic.main` (see the failing E2E log).

```diff
-from pydantic.main import BaseModel, Field
+from pydantic import BaseModel, Field
```

run.yaml (1)
111-111: Parameterize the vector DB path via an env var. Avoid the hard-coded path; support overrides with a sane default.
```diff
-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database
```
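The `${env.NAME:default}` placeholder convention used in run.yaml can be mimicked in plain Python; this resolver is a sketch of the idea, not llama-stack's actual implementation:

```python
import os
import re

# Matches ${env.NAME:default} — NAME is an identifier, default is anything up to "}".
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):([^}]*)\}")

def resolve_env_refs(value: str) -> str:
    """Replace ${env.NAME:default} placeholders with the env value or the default."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), m.group(2)), value)

path = resolve_env_refs(
    "${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db}"
)
```

Strings without placeholders pass through unchanged, so the resolver can be applied uniformly to every config value.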
🧹 Nitpick comments (1)
src/utils/types.py (1)
95-142: Make RAG parsing more robust (dict without "chunks", list of scalars). The current logic drops data when the JSON is a dict without "chunks" or a list of non-dicts. Add safe fallbacks.
```diff
         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score")
                             )
                         )
-                elif isinstance(data, list):
+                elif isinstance(data, dict):
+                    # Fallback: treat whole dict as a single chunk
+                    self.rag_chunks.append(
+                        RAGChunkData(
+                            content=json.dumps(data, ensure_ascii=False),
+                            source="knowledge_search",
+                            score=None,
+                        )
+                    )
+                elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
-                        if isinstance(chunk, dict):
+                        if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score")
                                 )
                             )
+                        else:
+                            # Fallback: stringify scalar/other types
+                            self.rag_chunks.append(
+                                RAGChunkData(
+                                    content=str(chunk),
+                                    source="knowledge_search",
+                                    score=None,
+                                )
+                            )
```
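Put together, a standalone version of such a tolerant parser might look like this; the function name and plain-dict chunk shape are illustrative, not the PR's API:

```python
import json

def extract_chunks(response_content: str) -> list[dict]:
    """Tolerantly parse a RAG tool response into chunk dicts (sketch)."""
    chunks: list[dict] = []
    try:
        data = json.loads(response_content)
    except json.JSONDecodeError:
        # Not JSON: treat the whole response as one chunk.
        if response_content.strip():
            chunks.append(
                {"content": response_content, "source": "knowledge_search", "score": None}
            )
        return chunks
    if isinstance(data, dict) and "chunks" in data:
        items = data["chunks"]
    elif isinstance(data, list):
        items = data
    else:
        # Unknown JSON shape: fall back to a single stringified chunk.
        return [{"content": json.dumps(data), "source": "knowledge_search", "score": None}]
    for item in items:
        if isinstance(item, dict):
            chunks.append({
                "content": item.get("content") or item.get("text") or str(item),
                "source": item.get("source"),
                "score": item.get("score"),
            })
        else:
            # Scalars in a list become stringified chunks rather than being dropped.
            chunks.append({"content": str(item), "source": "knowledge_search", "score": None})
    return chunks
```

Every input shape yields at least one chunk (or an empty list for blank input), so downstream code never silently loses tool output.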
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- run.yaml (3 hunks)
- src/utils/types.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
ToolCall (53-58)
🪛 GitHub Actions: E2E Tests
src/utils/types.py
[error] 10-10: ImportError: cannot import name 'Field' from 'pydantic.main'. Did you mean: '_fields'?
🔇 Additional comments (5)
src/utils/types.py (1)
80-94: Nice consolidation of response content + targeted extraction. Computing `response_content` once and gating RAG extraction on `knowledge_search` looks solid.

run.yaml (4)
63-66: Embedding provider addition looks good. The inline sentence-transformers provider is correctly declared and referenced later in models.

105-107: RAG runtime provider wiring looks correct. The provider is registered and later exposed via tool_group.

132-137: Model entry aligns with the embedding config. Model/provider IDs and the embedding dimension are consistent.
143-145: Tool group exposes the RAG runtime correctly. `builtin::rag` bound to `rag-runtime` is consistent with the providers.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/utils/types.py (1)
70-73: Default-initialize tool_calls to an empty list. This prevents None/missing-list issues and keeps parity with `rag_chunks`. This was raised previously and is still applicable.

Apply:

```diff
-    tool_calls: list[ToolCallSummary]
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
```
🧹 Nitpick comments (4)
tests/unit/app/endpoints/test_streaming_query.py (1)
346-356: Validate the transcript API shape to avoid duplicated RAG chunk plumbing. You pass RAG chunks inside `summary.rag_chunks` while also passing a separate `rag_chunks=[]` arg to `store_transcript`. If the latter is unused or redundant, consider removing it from the API or populating it consistently to avoid confusion.

If keeping both, add an assertion that the service ultimately persists `summary.rag_chunks` so we don't silently drop them.
80-94: Guard against unexpected content shapes from tool responses. `interleaved_content_as_str(resp.content)` can raise if `resp.content` is absent or malformed; a small try/except avoids derailing the whole summary build.

Apply:

```diff
-        response_content = interleaved_content_as_str(resp.content) if resp else None
+        try:
+            response_content = interleaved_content_as_str(resp.content) if resp else None
+        except Exception:
+            response_content = None
```
95-142: Add a fallback when JSON parses but yields an unknown shape. If `json.loads` succeeds but returns a non-list/dict, or a dict without "chunks", no chunk is appended. Fall back to a single chunk in that case.

Apply:
```diff
     def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
-            # Parse the response to get chunks
+            # Parse the response to get chunks
+            before_count = len(self.rag_chunks)
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunkData(
                                 content=chunk.get("content", ""),
                                 source=chunk.get("source"),
-                                score=chunk.get("score")
+                                score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunkData(
                                     content=chunk.get("content", str(chunk)),
                                     source=chunk.get("source"),
-                                    score=chunk.get("score")
+                                    score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunkData(
                             content=response_content, source="knowledge_search", score=None
                         )
                     )
+            # If JSON parsed but nothing was added, fall back to a single chunk
+            if response_content.strip() and before_count == len(self.rag_chunks):
+                self.rag_chunks.append(
+                    RAGChunkData(
+                        content=response_content,
+                        source="knowledge_search",
+                        score=None,
+                    )
+                )
         except Exception:
             # Treat response as single chunk
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunkData(
                         content=response_content, source="knowledge_search", score=None
                     )
                 )
```
3-4: Nit: unify typing style (builtins vs typing aliases). You mix `List[...]` and `list[...]`. Prefer a single style (builtins on 3.9+) for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (3)
- pyproject.toml (1 hunks)
- src/utils/types.py (2 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_streaming_query.py (1)
src/utils/types.py (3)
ToolCallSummary (43-56), TurnSummary (67-141), RAGChunkData (59-64)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: e2e_tests
🔇 Additional comments (2)
tests/unit/app/endpoints/test_streaming_query.py (1)
48-48: LGTM: importing RAGChunkData for test fixtures is appropriate. The import aligns the tests with the new TurnSummary contract.
src/utils/types.py (1)
59-65: RAGChunkData model looks good. Clear fields; the optionality is sensible. No issues.
tisnik
left a comment
It looks OK overall, thank you. I put some nits into comments.
are-ces
left a comment
Added a couple of nits, otherwise lgtm!
Signed-off-by: Anxhela Coba <[email protected]>
Signed-off-by: Anxhela Coba <[email protected]>
VladimirKadlec
left a comment
Nice work 💪
are-ces
left a comment
I have added one comment. Otherwise LGTM nice work!
Signed-off-by: Anxhela Coba <[email protected]>
VladimirKadlec
left a comment
lgtm
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 2
🧹 Nitpick comments (7)
src/utils/types.py (2)
91-136: Add lightweight observability and broader schema fallback when parsing chunks.
- Add debug logs for parse paths/failures.
- Accept common keys ("text", "page_content") when "content" missing.
Apply:
```diff
@@
-    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
+    def _extract_rag_chunks_from_response(self, response_content: str) -> None:
         """Extract RAG chunks from tool response content."""
         try:
             # Parse the response to get chunks
             # Try JSON first
             try:
                 data = json.loads(response_content)
                 if isinstance(data, dict) and "chunks" in data:
                     for chunk in data["chunks"]:
                         self.rag_chunks.append(
                             RAGChunk(
-                                content=chunk.get("content", ""),
+                                content=chunk.get("content")
+                                or chunk.get("text")
+                                or chunk.get("page_content", ""),
                                 source=chunk.get("source"),
                                 score=chunk.get("score"),
                             )
                         )
                 elif isinstance(data, list):
                     # Handle list of chunks
                     for chunk in data:
                         if isinstance(chunk, dict):
                             self.rag_chunks.append(
                                 RAGChunk(
-                                    content=chunk.get("content", str(chunk)),
+                                    content=chunk.get("content")
+                                    or chunk.get("text")
+                                    or chunk.get("page_content", str(chunk)),
                                     source=chunk.get("source"),
                                     score=chunk.get("score"),
                                 )
                             )
             except json.JSONDecodeError:
                 # If not JSON, treat the entire response as a single chunk
                 if response_content.strip():
                     self.rag_chunks.append(
                         RAGChunk(
                             content=response_content,
                             source=DEFAULT_RAG_TOOL,
                             score=None,
                         )
                     )
         except (KeyError, AttributeError, TypeError, ValueError):
             # Treat response as single chunk on data access/structure errors
             if response_content.strip():
                 self.rag_chunks.append(
                     RAGChunk(
                         content=response_content, source=DEFAULT_RAG_TOOL, score=None
                     )
                 )
```
Additionally (outside this hunk), add module logging:

```python
# at top of file
import logging
logger = logging.getLogger(__name__)
```

…and emit optional debug logs in each branch if desired (kept out of the diff for brevity).
87-90: Tool-name match may miss namespaced tools; make the check robust. Granite/MCP providers often report namespaced tool names (e.g., "builtin::rag.knowledge_search"). Consider tolerant matching.
```diff
-        if tc.tool_name == DEFAULT_RAG_TOOL and resp and response_content:
+        if resp and response_content and (
+            tc.tool_name == DEFAULT_RAG_TOOL
+            or tc.tool_name.endswith(f".{DEFAULT_RAG_TOOL}")
+        ):
             self._extract_rag_chunks_from_response(response_content)
```

Found a single strict comparison at src/utils/types.py:88. Log tc.tool_name once at runtime to confirm the actual values.
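A tolerant matcher along those lines can be expressed as a small helper; the "/" separator variant is an assumption based on the "builtin::rag/knowledge_search" form seen elsewhere in this PR:

```python
DEFAULT_RAG_TOOL = "knowledge_search"

def is_rag_tool(tool_name: str) -> bool:
    """Match both the bare tool name and namespaced forms reported by some providers."""
    # str.endswith accepts a tuple, so both separators are checked in one call.
    return tool_name == DEFAULT_RAG_TOOL or tool_name.endswith(
        (f".{DEFAULT_RAG_TOOL}", f"/{DEFAULT_RAG_TOOL}")
    )
```

Requiring the separator before the suffix avoids false positives such as `my_knowledge_search`.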
src/models/responses.py (2)
37-43: Constrain RAGChunk.score to a sane range. If scores are normalized, bound them to [0.0, 1.0] to prevent bad data.
```diff
-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        None, ge=0.0, le=1.0, description="Relevance score (0..1)"
+    )
```
55-61: The class name "ToolCall" risks confusion with llama_stack_client.ToolCall. To avoid import shadowing/mistakes, consider renaming it to APIToolCall or ToolCallInfo.
src/utils/transcripts.py (2)
18-18: Use a module-qualified logger name. Prefer `__name__` to a hardcoded string for flexibility.

```diff
-logger = logging.getLogger("utils.transcripts")
+logger = logging.getLogger(__name__)
```
42-56: Tighten rag_chunks typing to dict[str, Any] and reflect it in the signature. This improves clarity and static checks.

```diff
+from typing import Any
@@
-    rag_chunks: list[dict],
+    rag_chunks: list[dict[str, Any]],
@@
-        rag_chunks: The list of serialized `RAGChunk` dictionaries.
+        rag_chunks: The list of serialized `RAGChunk` dictionaries.
```
63-66: Config looks coherent; make paths/envs overridable and keep IDs consistent.
- Vector DB path: make it env-overridable for deployments.
- Embedding model/dimension: consistent with 768; good.
- Provider IDs (“faiss”, “sentence-transformers”, “rag-runtime”): consistent; good.
```diff
-        db_path: .llama/distributions/ollama/faiss_store.db # Location of vector database
+        db_path: ${env.VECTOR_STORE_DB_PATH:.llama/distributions/ollama/faiss_store.db} # Location of vector database
```

Optional: consider env-driving the embedding model ID to avoid drift between the models:embedding and vector_dbs entries.
Also applies to: 105-107, 111-116, 126-130, 132-137, 143-145
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- src/app/endpoints/query.py (3 hunks)
- src/constants.py (1 hunks)
- src/models/responses.py (2 hunks)
- src/utils/transcripts.py (2 hunks)
- src/utils/types.py (2 hunks)
- tests/unit/app/endpoints/test_streaming_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/app/endpoints/query.py
- tests/unit/app/endpoints/test_streaming_query.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/utils/transcripts.py, src/constants.py, src/utils/types.py, src/models/responses.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use `logger = logging.getLogger(__name__)` for module logging after `import logging`
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/utils/transcripts.py, src/constants.py, src/utils/types.py, src/models/responses.py
src/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Keep shared constants in a central src/constants.py with descriptive comments
Files:
src/constants.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
🧬 Code graph analysis (1)
src/utils/types.py (1)
src/models/responses.py (1)
RAGChunk (37-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (1)
src/utils/types.py (1)
65-66: Fix mutable default and missing default for collections (Pydantic v2).
- rag_chunks uses [] which is shared across instances.
- tool_calls has no default but is appended to later.
Switch both to Field(default_factory=list) and import Field.
Apply:
```diff
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
@@
 class TurnSummary(BaseModel):
@@
-    tool_calls: list[ToolCallSummary]
-    rag_chunks: list[RAGChunk] = []
+    tool_calls: list[ToolCallSummary] = Field(default_factory=list)
+    rag_chunks: list[RAGChunk] = Field(default_factory=list)
```
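The underlying Python pitfall that `default_factory` avoids can be demonstrated with a plain dataclass analogue — a sketch, not the actual Pydantic model (Pydantic has its own default-copying behavior, but `Field(default_factory=list)` makes the intent explicit):

```python
from dataclasses import dataclass, field

@dataclass
class TurnSummaryLike:
    """Dataclass analogue of TurnSummary, for illustration only."""
    llm_response: str = ""
    # default_factory gives every instance its own fresh list, so appending to
    # one instance never leaks state into another.
    tool_calls: list = field(default_factory=list)
    rag_chunks: list = field(default_factory=list)

a = TurnSummaryLike()
b = TurnSummaryLike()
a.rag_chunks.append("chunk-1")
```

Note that dataclasses reject a bare `= []` default outright (raising ValueError at class creation), which is exactly the shared-mutable-default hazard this review comment flags.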
```python
# default RAG tool value
DEFAULT_RAG_TOOL = "knowledge_search"
```
🛠️ Refactor suggestion
🧩 Analysis chain
Good centralization; verify no hardcoded tool names remain.
Constant name/readability look good. Please confirm there are no remaining string literals "knowledge_search" outside constants.
🏁 Script executed:
```shell
#!/usr/bin/env bash
# Find stray literals (exclude config/examples)
rg -nP -S "(^|[^A-Za-z0-9_])knowledge_search([^A-Za-z0-9_]|$)" -g '!**/run.yaml' -g '!**/README*' -g '!**/docs/**'
```

Length of output: 148
🏁 Script executed:
```shell
#!/usr/bin/env bash
set -euo pipefail
echo "Searching repository for literal 'knowledge_search' using multiple fallbacks..."
# 1) ripgrep (robust flags)
if command -v rg >/dev/null 2>&1; then
  echo ":: rg --hidden --no-ignore -uu -F -S"
  rg --hidden --no-ignore -uu -F -n "knowledge_search" -S || echo "rg: no matches or search completed"
else
  echo "rg not available"
fi
# 2) git grep fallback
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo ":: git grep -n"
  git grep -n "knowledge_search" || echo "git grep: no matches"
else
  echo "Not a git repo (git grep skipped)"
fi
# 3) find + grep fallback (exclude common heavy dirs)
echo ":: find + grep (excluding .git and node_modules)"
find . -type f -not -path "./.git/*" -not -path "./node_modules/*" -not -path "./dist/*" -print0 \
  | xargs -0 grep -nH --line-number -I "knowledge_search" || echo "find+grep: no matches"
```

Length of output: 6544
Replace remaining 'knowledge_search' string literals with DEFAULT_RAG_TOOL
Search found literal occurrences in source and tests; update usages to import and use src.constants.DEFAULT_RAG_TOOL (use f"builtin::rag/{DEFAULT_RAG_TOOL}" where the namespaced form is required).
Locations:
- src/models/responses.py:129
- src/utils/types.py:87 (comment)
- src/app/endpoints/query.py:642
- src/app/endpoints/streaming_query.py:444
- tests/unit/app/endpoints/test_query.py:1144
- tests/unit/app/endpoints/test_streaming_query.py: multiple locations (71–85, 233, 242, 342, 349–350, 995–1023)
```python
rag_chunks: list[RAGChunk] = []
```
Fix mutable default list in API model.
rag_chunks uses [] which is shared across responses; switch to default_factory.
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
In src/models/responses.py around lines 93-94, the RAGChunk list is initialized
with a mutable default (rag_chunks: list[RAGChunk] = []) which is shared across
instances; replace it with a non-shared default_factory by importing Field from
pydantic (if not already imported) and change the declaration to use
Field(default_factory=list), i.e. rag_chunks: list[RAGChunk] =
Field(default_factory=list).
Signed-off-by: Anxhela Coba <[email protected]>
LGTM.
I am okay if we add test cases with a follow-up PR,
but as soon as possible, because this repo/functionality is changed frequently by several people.
Signed-off-by: Anxhela Coba <[email protected]>
Actionable comments posted: 5
🧹 Nitpick comments (3)
src/app/endpoints/query.py (2)
274-275: Tune log levels: prefer debug for noisy internal steps. These are verbose, per-request messages better suited to debug, to reduce log noise in production.
Suggested changes:
- logger.info(...) -> logger.debug(...) for processing/persisting steps.
Also applies to: 283-284, 299-300, 316-317, 324-325
57-57: Use the module logger name per guidelines. Prefer `logging.getLogger(__name__)` to follow the project convention.

```diff
-logger = logging.getLogger("app.endpoints.handlers")
+logger = logging.getLogger(__name__)
```

src/models/responses.py (1)
148-148: Remove trailing whitespace in src/models/responses.py:148. The trailing space on line 148 is flagged by pylint; remove it to fix formatting.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- pyproject.toml (2 hunks)
- src/app/endpoints/query.py (4 hunks)
- src/constants.py (1 hunks)
- src/models/responses.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/constants.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (7)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/responses.py
src/app/endpoints/query.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/responses.py
src/app/endpoints/query.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
src/app/endpoints/query.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
src/app/endpoints/query.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/query.py
🧬 Code graph analysis (1)
src/app/endpoints/query.py (1)
src/models/responses.py (3)
ToolCall (44-49), ReferencedDocument (52-64), QueryResponse (66-147)
🪛 GitHub Actions: Black
src/models/responses.py
[error] 1-1: Black formatting check failed: 1 file would be reformatted by 'black --check .'. Run 'black --write .' to fix code style issues.
🪛 GitHub Actions: Pyright
src/models/responses.py
[error] 60-60: pyright: Type 'None' is not assignable to declared type 'AnyUrl'.
src/app/endpoints/query.py
[error] 307-307: pyright: Argument of type 'AnyUrl | None' cannot be assigned to parameter 'doc_url' of type 'AnyUrl' in function '__init__'. None is not assignable to 'AnyUrl'.
🪛 GitHub Actions: Python linter
src/models/responses.py
[error] 101-101: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)
[error] 148-148: Pylint: trailing whitespace (C0303) detected. (Command: 'uv run pylint src tests' exit code 16)
🪛 GitHub Actions: Type checks
src/models/responses.py
[error] 60-60: mypy: Incompatible types in assignment (expression has type "None", variable has type "AnyUrl") [assignment]. Command: 'uv run mypy --explicit-package-bases --disallow-untyped-calls --disallow-untyped-defs --disallow-incomplete-defs --ignore-missing-imports --disable-error-code attr-defined src/'
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (5)
src/models/responses.py (4)
97-100: Prefer non-optional list for tool_calls for consistency. Lists are better modeled as empty lists instead of None for API consumers; aligns with prior feedback on non-optional lists.
```diff
-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )
```
123-136: Examples look good. Example payload for rag_chunks and tool_calls is coherent and helpful for API docs.
95-96: Avoid mutable default for rag_chunks. Use default_factory to prevent shared mutable defaults across instances. This was already flagged earlier.
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list, description="List of RAG chunks used to generate the response"
+    )
```
101-101: Remove trailing whitespace in src/models/responses.py (line 101). Black/pylint are failing; black is unavailable in the verification environment, so run locally: black --check src/models/responses.py || black src/models/responses.py and commit the formatted file.
253-256: LGTM: single conversion of rag chunks for persistence. Converting once to dicts for transcripts is clean and avoids repeated serialization.
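The "convert once, reuse" pattern praised above can be sketched with a minimal Pydantic model (RAGChunkSketch is a hypothetical stand-in for the PR's RAGChunk; the field names mirror the review but are assumptions):

```python
from typing import Optional

from pydantic import BaseModel


class RAGChunkSketch(BaseModel):
    """Hypothetical stand-in for the PR's RAGChunk model."""

    content: str
    source: Optional[str] = None
    score: Optional[float] = None


chunks = [RAGChunkSketch(content="a", score=0.9), RAGChunkSketch(content="b")]

# Serialize once; reuse the same dicts for the transcript store
# and for the response payload instead of dumping twice.
chunk_dicts = [chunk.model_dump() for chunk in chunks]
```

Each dict carries every declared field (unset optionals appear as None), so the persisted transcript and the API response stay structurally identical.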
Signed-off-by: Anxhela Coba <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/models/responses.py (3)
99-102: Return a stable, non-null list for tool_calls. For a consistent API shape (and per prior reviewer consensus), default to an empty list instead of None.
Apply:
```diff
-    tool_calls: Optional[list[ToolCall]] = Field(
-        None,
-        description="List of tool calls made during response generation",
-    )
+    tool_calls: list[ToolCall] = Field(
+        default_factory=list,
+        description="List of tool calls made during response generation",
+    )
```
37-43: Constrain score to [0, 1] for predictable semantics. If scores are normalized, enforce via validation.
Apply:
```diff
-    score: Optional[float] = Field(None, description="Relevance score")
+    score: Optional[float] = Field(
+        default=None, description="Relevance score (0..1)", ge=0.0, le=1.0
+    )
```

If scores are not normalized, ignore.
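The suggested ge/le bounds behave as follows in Pydantic v2 (RAGChunkSketch is an illustrative model, not the PR's actual class):

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class RAGChunkSketch(BaseModel):
    """Hypothetical chunk model with a bounded relevance score."""

    content: str
    source: Optional[str] = None
    # ge/le constraints apply whenever score is not None.
    score: Optional[float] = Field(default=None, ge=0.0, le=1.0)


ok = RAGChunkSketch(content="passage", score=0.5)

# An out-of-range score is rejected at construction time.
try:
    RAGChunkSketch(content="passage", score=1.5)
    rejected = False
except ValidationError:
    rejected = True
```

None still passes, so callers that omit the score are unaffected; only out-of-range values fail validation.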
45-51: Use a JSON type alias for arguments/result for clarity. Improves readability across models using JSON blobs.
Add at module top (outside this hunk):
```python
# Type aliases
JSONObj = dict[str, Any]
```

Then update:
```diff
-    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
-    result: Optional[dict[str, Any]] = Field(None, description="Result from the tool")
+    arguments: JSONObj = Field(description="Arguments passed to the tool")
+    result: Optional[JSONObj] = Field(None, description="Result from the tool")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/models/responses.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
src/models/responses.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
src/models/responses.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
src/models/responses.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (3)
src/models/responses.py (3)
61-63: Good fix: doc_url is now optional (resolves type-checker errors). This aligns with earlier feedback and usage where None is passed.
Please confirm pyright/mypy are green on this change in CI.
125-138: Examples look consistent with the schema. The example RAG chunk and tool call structures align with the new models.
97-98: Fix mutable default for rag_chunks and add field metadata. Use default_factory to avoid a shared list across instances and surface the description in OpenAPI.
Apply:
```diff
-    rag_chunks: list[RAGChunk] = []
+    rag_chunks: list[RAGChunk] = Field(
+        default_factory=list,
+        description="List of RAG chunks used to generate the response",
+    )
```
Signed-off-by: Anxhela Coba <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/unit/models/responses/test_query_response.py (2)
3-3: Use absolute imports for consistency. The import uses a relative import path that may not align with the project structure. Based on the relevant code snippets showing src/models/responses.py, this should use an absolute import.

```diff
-from models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument
+from src.models.responses import QueryResponse, RAGChunk, ToolCall, ReferencedDocument
```
92-98: ToolCall model doesn't validate arguments/result; confirm or add validation. src/models/responses.py: ToolCall declares arguments: dict[str, Any] and result: Optional[dict[str, Any]] with no validators, so the test's dicts are accepted as-is. If you expect specific shapes, replace these with TypedDict/Pydantic submodels or add field validators; otherwise confirm flexible typing is intentional.
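If stricter shapes are wanted, a field validator is one option the comment mentions. A minimal sketch (ToolCallSketch and its identifier rule are illustrative assumptions, not the PR's actual model):

```python
from typing import Any, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class ToolCallSketch(BaseModel):
    """Hypothetical stricter variant of the flexible ToolCall model."""

    name: str
    arguments: dict[str, Any] = Field(description="Arguments passed to the tool")
    result: Optional[dict[str, Any]] = None

    @field_validator("arguments")
    @classmethod
    def argument_names_are_identifiers(cls, v: dict[str, Any]) -> dict[str, Any]:
        # Example constraint: argument names must be valid Python identifiers.
        bad = [key for key in v if not key.isidentifier()]
        if bad:
            raise ValueError(f"non-identifier argument names: {bad}")
        return v


call = ToolCallSketch(name="knowledge_search", arguments={"query": "operator docs"})

try:
    ToolCallSketch(name="knowledge_search", arguments={"bad key": 1})
    rejected = False
except ValidationError:
    rejected = True
```

A ValueError raised inside a field validator surfaces as a ValidationError at construction, so malformed tool calls fail fast instead of propagating downstream.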
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tests/unit/models/responses/test_query_response.py (2 hunks)
- tests/unit/models/responses/test_rag_chunk.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/unit/models/responses/test_rag_chunk.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
tests/unit/models/responses/test_query_response.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
tests/unit/models/responses/test_query_response.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
tests/unit/models/responses/test_query_response.py
🧬 Code graph analysis (1)
tests/unit/models/responses/test_query_response.py (1)
src/models/responses.py (4)
QueryResponse (68-149), RAGChunk (37-42), ToolCall (45-50), ReferencedDocument (53-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (5)
tests/unit/models/responses/test_query_response.py (5)
24-28: LGTM! Good test coverage for default behavior. This test correctly validates that rag_chunks is empty by default, ensuring backward compatibility for existing functionality.
29-63: Excellent comprehensive test for RAG chunks functionality. This test thoroughly validates RAG chunks with all fields populated, including proper access to content, source, and score attributes. The test data is realistic and well-structured.
64-81: Great coverage of optional field scenarios. This test effectively validates the optional nature of the source and score fields in RAG chunks, covering all combinations of present/absent optional fields.
82-129: Comprehensive integration test with all new features. This test provides excellent end-to-end validation of the complete QueryResponse with RAG chunks, tool calls, and referenced documents. The assertions thoroughly verify all nested attributes and relationships.
100-107: No action required: ReferencedDocument.doc_url already uses pydantic.AnyUrl. The model in src/models/responses.py declares doc_url: Optional[AnyUrl] = Field(...), so URL validation is enforced.
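The validation the comment refers to can be checked with a small sketch (ReferencedDocumentSketch and the URLs are illustrative; only the AnyUrl behavior is the point):

```python
from typing import Optional

from pydantic import AnyUrl, BaseModel, ValidationError


class ReferencedDocumentSketch(BaseModel):
    """Hypothetical stand-in for the PR's ReferencedDocument model."""

    doc_title: str
    # Optional, but any non-None value must parse as a URL.
    doc_url: Optional[AnyUrl] = None


doc = ReferencedDocumentSketch(
    doc_title="Guide", doc_url="https://example.com/guide"
)

# A non-URL string is rejected; None is still accepted.
try:
    ReferencedDocumentSketch(doc_title="Bad", doc_url="not a url")
    rejected = False
except ValidationError:
    rejected = True
```

Because the field is Optional[AnyUrl], callers may omit the URL entirely, which matches the query endpoint passing None for documents without links.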
tisnik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
Add RAG chunks in the query response
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Configuration
Chores
Tests