@asimurka asimurka commented Oct 24, 2025

Description

This pull request temporarily converts application/json and application/xml MIME types to text/plain to prevent errors with the current Llama Stack implementation. This workaround will be removed once Llama Stack supports these MIME types.
Added e2e tests that check for a successful response status when attachments of these types are passed to the query endpoints.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Enhanced support for XML and JSON file attachments in query requests
    • Attachments are now automatically normalized for improved LLM processing compatibility
  • Tests

    • Added test coverage for XML and JSON attachment handling in both standard and streaming query modes

coderabbitai bot commented Oct 24, 2025

Walkthrough

The pull request adds MIME-type normalization for document handling across both query and streaming query endpoints. Documents with application/json or application/xml MIME types are converted to text/plain before being sent to the Llama Stack agent. Corresponding test scenarios verify proper handling of XML and JSON attachments.

Changes

Cohort / File(s) | Summary
Endpoint MIME-Type Normalization (src/app/endpoints/query.py, src/app/endpoints/streaming_query.py) | Added document transformation logic to normalize MIME types (application/json and application/xml converted to text/plain) before sending to Llama Stack agent. Replaced direct pass-through of query_request.get_documents() with transformed documents list in create_turn calls. Document type import added to streaming_query.py.
Feature Test Coverage (tests/e2e/features/query.feature, tests/e2e/features/streaming_query.feature) | Added new test scenarios verifying LLM response handling for XML and JSON attachments in both standard and streaming query contexts. Minor formatting adjustments to existing test structures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes follow a consistent pattern across two endpoint files with straightforward MIME-type transformation logic. However, multiple affected files, new test scenarios, and the need to verify behavior consistency across both query variants add moderate complexity.

Suggested reviewers

  • tisnik
  • are-ces
  • umago

Poem

🐰 Hops through JSON, hops through XML,
MIME types transformed to plain text, oh so practical!
Documents normalized with care so true,
Tests verify each attachment—both old and new!
Llama Stack receives them, ready to play,
A cleaner protocol brightens the day!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The PR title "Changing unsupported mime types to text/plain" directly aligns with the main change described in the raw summary, which involves transforming application/json and application/xml MIME types to text/plain in both query.py and streaming_query.py endpoints before sending documents to the Llama Stack agent. The title is concise, specific, and clearly communicates the primary change without vague terminology or unnecessary noise. A teammate reviewing the git history would immediately understand that this PR addresses MIME type handling, which matches the actual implementation and test additions that verify the new behavior with XML and JSON attachments.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45c7482 and 79acf46.

📒 Files selected for processing (4)
  • src/app/endpoints/query.py (2 hunks)
  • src/app/endpoints/streaming_query.py (4 hunks)
  • tests/e2e/features/query.feature (1 hunks)
  • tests/e2e/features/streaming_query.feature (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(__name__) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Write E2E tests as Gherkin feature files for behave

Files:

  • tests/e2e/features/query.feature
  • tests/e2e/features/streaming_query.feature
🧬 Code graph analysis (2)
src/app/endpoints/query.py (1)
src/models/requests.py (1)
  • get_documents (197-204)
src/app/endpoints/streaming_query.py (1)
src/models/requests.py (1)
  • get_documents (197-204)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests (azure)
  • GitHub Check: e2e_tests (ci)
🔇 Additional comments (5)
src/app/endpoints/query.py (2)

21-21: LGTM: Import added for Document type annotation.

The import is necessary for annotating the transformed documents list.


696-709: The duplication exists as stated; the behavior of the external Document type could not be determined within the sandbox.


Extract duplicated MIME type transformation to a shared helper function.

Duplication confirmed: identical transformation logic exists in both src/app/endpoints/query.py (lines 703-709) and src/app/endpoints/streaming_query.py (lines 1048-1054), including the same TODO comment. Extract to src/utils/endpoints.py:

def normalize_document_mime_types(documents: list[dict]) -> list:
    """
    Normalize MIME types for documents.
    
    Converts application/json and application/xml to text/plain as a workaround
    for Llama Stack limitations.
    
    TODO: LCORE-881 - Remove when Llama Stack supports these MIME types natively.
    
    Args:
        documents: List of document dictionaries from get_documents()
        
    Returns:
        List of documents with normalized MIME types
    """
    return [
        (
            {"content": doc["content"], "mime_type": "text/plain"}
            if doc["mime_type"].lower() in ("application/json", "application/xml")
            else doc
        )
        for doc in documents
    ]

Then replace both occurrences with:

documents = normalize_document_mime_types(query_request.get_documents())
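
A quick way to see what the suggested helper does is to run it on sample input. The sketch below restates the helper compactly so that it is self-contained; the UNSUPPORTED_MIME_TYPES constant and the sample documents are hypothetical and only illustrate the behavior, they are not taken from the PR.

```python
# Assumed set of MIME types to rewrite, taken from the suggestion above.
UNSUPPORTED_MIME_TYPES = ("application/json", "application/xml")


def normalize_document_mime_types(documents: list[dict]) -> list[dict]:
    """Return a new list where unsupported MIME types become text/plain."""
    return [
        {"content": doc["content"], "mime_type": "text/plain"}
        if doc["mime_type"].lower() in UNSUPPORTED_MIME_TYPES
        else doc
        for doc in documents
    ]


# Hypothetical sample input; real documents would come from get_documents().
sample = [
    {"content": '{"a": 1}', "mime_type": "Application/JSON"},
    {"content": "<a/>", "mime_type": "application/xml"},
    {"content": "hello", "mime_type": "text/markdown"},
]
normalized = normalize_document_mime_types(sample)
mime_types = [doc["mime_type"] for doc in normalized]
```

The input list is left untouched: matching documents are replaced by new dictionaries, and non-matching ones are passed through as the same objects, which is in line with the guideline to avoid in-place parameter modification.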

Regarding the type annotation: verify whether Document from llama_stack_client.types.agents.turn_create_params accepts dict-like objects or requires proper instantiation, as the code creates dictionaries but declares list[Document].
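
The type-annotation question can be illustrated with a stand-in type. The sketch below assumes that Document is a TypedDict, which is an assumption and not something confirmed in this review; DocumentSketch and to_plain_text are hypothetical names used only for illustration. If the assumption holds, dictionary literals with the right keys already satisfy list[Document] and no explicit instantiation is needed.

```python
from typing import TypedDict


class DocumentSketch(TypedDict):
    """Hypothetical stand-in for the client's Document type (assumed TypedDict)."""

    content: str
    mime_type: str


def to_plain_text(doc: DocumentSketch) -> DocumentSketch:
    """Return a copy of doc with its MIME type forced to text/plain."""
    return {"content": doc["content"], "mime_type": "text/plain"}


# A dict literal with matching keys type-checks as DocumentSketch.
docs: list[DocumentSketch] = [
    to_plain_text({"content": "<a/>", "mime_type": "application/xml"})
]

# At runtime a TypedDict value is an ordinary dict, not a class instance.
is_plain_dict = type(docs[0]) is dict
```

If Document turns out to be a regular class or a Pydantic model instead, the dictionaries would need to be converted explicitly, so the check requested above is still worth doing.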

tests/e2e/features/streaming_query.feature (1)

93-117: LGTM: E2E test coverage for MIME type workaround.

The test scenario appropriately verifies that XML and JSON attachments are handled successfully. This provides good coverage for the MIME type normalization workaround.

src/app/endpoints/streaming_query.py (1)

24-24: LGTM: Import added for Document type annotation.

The import is necessary for annotating the transformed documents list.

tests/e2e/features/query.feature (1)

118-142: LGTM: E2E test coverage for MIME type workaround.

The test scenario appropriately verifies that XML and JSON attachments are handled successfully in the query endpoint. This mirrors the streaming_query test and provides consistent coverage.


@asimurka asimurka changed the title from "Changing unsupported mime types to text/plain" to "LCORE-784: Changing unsupported mime types to text/plain" Oct 24, 2025

@tisnik tisnik left a comment

LGTM

@tisnik tisnik merged commit 2f071fd into lightspeed-core:main Oct 24, 2025
18 of 20 checks passed