[Epic] ⚡ Performance - orjson JSON Serialization #1294

@crivetimihai

⚡ Performance - orjson JSON Serialization

Goal

Replace the default JSON serialization with orjson for faster JSON encoding and decoding:

  1. Add orjson dependency to project
  2. Configure FastAPI to use orjson as default JSON serializer
  3. Update Pydantic models to leverage orjson optimizations
  4. Benchmark performance improvements for large payloads
  5. Ensure compatibility with existing API responses

This is expected to make JSON serialization/deserialization 2-3x faster, with the largest gains on large JSON responses (tool lists, server lists, bulk exports).

Why Now?

JSON serialization can become a significant bottleneck for API servers that return large payloads. Reasons to adopt orjson:

  1. Performance: orjson is 2-3x faster than stdlib json, 1.5-2x faster than ujson
  2. Large Payloads: Massive improvement for responses with 100+ tools/servers/resources
  3. Low Overhead: Drop-in replacement, minimal code changes required
  4. Pydantic Compatible: Works with FastAPI (which also ships its own fastapi.responses.ORJSONResponse) and Pydantic models
  5. Production Ready: Used by major companies (Reddit, Stripe, etc.)
  6. Strictness: RFC 8259 compliant; orjson.loads rejects invalid input such as NaN and Infinity literals, which stdlib json accepts

📖 User Stories

US-1: API Client - Faster JSON Response Times

As an API Client
I want JSON responses to be serialized quickly
So that API calls complete faster and use less server CPU

Acceptance Criteria:

Given I am requesting a list of 500 tools via GET /tools
When the response is serialized with orjson
Then the serialization time should be 2-3x faster than stdlib json
And the response format should remain unchanged
And the Content-Type should be application/json

Given I am sending a POST request with complex JSON data
When the request is deserialized with orjson
Then parsing should be 2-3x faster than stdlib json
And Pydantic validation should work correctly
And no data should be lost or corrupted

Technical Requirements:

  • Install orjson>=3.10.0
  • Create ORJSONResponse class
  • Configure FastAPI with default_response_class
  • Verify no breaking changes in API behavior

US-2: Admin UI User - Faster Page Loads with JSON Data

As an Admin UI User
I want pages that load large datasets to be responsive
So that I can work efficiently without waiting

Acceptance Criteria:

Given I am viewing the tools page with 200+ tools
When the JSON response is generated
Then the page should load 20-40% faster than with stdlib json
And all tool data should display correctly
And no UI rendering errors should occur

Given I am performing a bulk export operation
When exporting 1000+ entities to JSON
Then the export should complete 2-3x faster
And the JSON file should be valid and complete

Technical Requirements:

  • All endpoints use orjson serialization
  • Large payloads (>100KB) show measurable improvement
  • No breaking changes in UI JavaScript parsing

US-3: DevOps Engineer - Monitor JSON Serialization Performance

As a DevOps Engineer
I want to measure JSON serialization performance
So that I can validate the improvement from orjson

Acceptance Criteria:

Given I am running a load test against GET /tools
When I compare orjson vs stdlib json performance
Then requests/second should increase by 15-30%
And CPU usage should decrease by 10-20%
And response latency should decrease by 20-40%

Given I am monitoring production metrics
When I observe JSON serialization times
Then p95 serialization latency should be <5ms for typical responses
And no increase in error rates should occur

Technical Requirements:

  • Benchmark script comparing stdlib json vs orjson
  • Load testing with wrk or locust
  • Metrics showing serialization time reduction
  • No regression in API correctness
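
The p95 criterion above could be checked offline with a small timing harness. This is a minimal sketch using only the standard library: the payload shape and the helper names `serialization_latencies_ms` and `percentile` are assumptions for illustration, not part of the project. Any serializer with the same call shape (for example `orjson.dumps`) can be passed in place of `json.dumps`.

```python
import json
import statistics
import time
from typing import Any, Callable


def serialization_latencies_ms(
    dumps: Callable[[Any], Any], payload: Any, runs: int = 200
) -> list[float]:
    """Call dumps(payload) `runs` times and return each duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        dumps(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentile(samples: list[float], pct: int) -> float:
    """Return the pct-th percentile (1-99) of the samples."""
    # quantiles(n=100) returns 99 cut points; index pct - 1 is the pct-th one.
    return statistics.quantiles(samples, n=100)[pct - 1]


# Hypothetical payload standing in for a typical GET /tools response.
payload = [{"id": i, "name": f"tool-{i}", "tags": ["a", "b"]} for i in range(500)]

samples = serialization_latencies_ms(json.dumps, payload)
print(f"p50={percentile(samples, 50):.3f}ms p95={percentile(samples, 95):.3f}ms")
```

This measures serialization alone, so it complements load testing with wrk or locust, which measures the full request path.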

🏗 Architecture

JSON Serialization Flow

graph TD
    A[FastAPI Endpoint] --> B{Default Response Class}
    B -->|ORJSONResponse| C[orjson.dumps]
    B -->|JSONResponse| D[stdlib json.dumps]
    C --> E[bytes]
    D --> F[str]
    E --> G[No extra encode step]
    F --> H[Encode to UTF-8 bytes]
    G --> I[HTTP Response]
    H --> I

    style C fill:#90EE90
    style D fill:#FFB6C1
    style E fill:#90EE90
    style F fill:#FFB6C1

Performance Comparison

graph LR
    A[Serialize 1000 Objects] --> B[stdlib json: 100ms]
    A --> C[ujson: 65ms]
    A --> D[orjson: 35ms]

    style B fill:#FFB6C1
    style C fill:#FFE4B5
    style D fill:#90EE90

Implementation Example

# mcpgateway/utils/orjson_response.py

from typing import Any
import orjson
from fastapi.responses import JSONResponse


class ORJSONResponse(JSONResponse):
    """
    Custom JSON response class using orjson for faster serialization.

    orjson is 2-3x faster than stdlib json and emits compact output (no
    whitespace). It serializes datetime and UUID natively, and numpy arrays
    when OPT_SERIALIZE_NUMPY is set.
    """
    media_type = "application/json"

    def render(self, content: Any) -> bytes:
        """
        Render content to JSON bytes using orjson.

        Options:
        - OPT_NON_STR_KEYS: Allow non-string dict keys (ints, etc.)
        - OPT_SERIALIZE_NUMPY: Support numpy arrays if present
        """
        return orjson.dumps(
            content,
            option=orjson.OPT_NON_STR_KEYS | orjson.OPT_SERIALIZE_NUMPY
        )

# mcpgateway/main.py

from fastapi import FastAPI

from mcpgateway.utils.orjson_response import ORJSONResponse

# settings and __version__ are already defined in the existing module

app = FastAPI(
    title=settings.app_name,
    version=__version__,
    default_response_class=ORJSONResponse,  # Use orjson for all responses
    # ... other config
)
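
# Pydantic Model Configuration (if needed; Pydantic v1 style)

from pydantic import BaseModel
import orjson


def orjson_dumps(v, *, default):
    """Adapter: Pydantic v1 expects json_dumps to return str; orjson returns bytes."""
    return orjson.dumps(v, default=default).decode()


class ToolBase(BaseModel):
    """Base Pydantic model with orjson configuration."""

    class Config:
        # Pydantic v1 hooks. Pydantic v2 removed json_loads/json_dumps,
        # so this block is only relevant if the project is on v1.
        json_loads = orjson.loads
        json_dumps = orjson_dumps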

📋 Implementation Tasks

Phase 1: Dependencies & Setup ✅

  • Add orjson Dependency
    • Add orjson>=3.10.0 to pyproject.toml dependencies section
    • Run make install-dev to install orjson
    • Verify installation: python -c "import orjson; print(orjson.__version__)"
    • Verify a prebuilt binary wheel was installed (orjson is implemented in Rust; building from source requires a Rust toolchain)

Phase 2: Create ORJSONResponse Class ✅

  • Create Custom Response Class

    • Create new file: mcpgateway/utils/orjson_response.py
    • Implement ORJSONResponse class extending JSONResponse
    • Add render() method using orjson.dumps()
    • Configure orjson options: OPT_NON_STR_KEYS | OPT_SERIALIZE_NUMPY
    • Add comprehensive docstring explaining benefits and options
  • Add Unit Tests

    • Create tests/unit/mcpgateway/utils/test_orjson_response.py
    • Test serialization of simple objects (dict, list)
    • Test serialization of datetime objects
    • Test serialization of UUID objects
    • Test serialization of None values
    • Test serialization of nested objects
    • Test that response is bytes (not str)

Phase 3: Configure FastAPI ✅

  • Update FastAPI Application

    • Import ORJSONResponse in mcpgateway/main.py (top of file)
    • Update FastAPI app initialization (around line 878)
    • Add default_response_class=ORJSONResponse parameter
    • Add comment explaining orjson performance benefits
  • Verify Endpoint Compatibility

    • Test GET /health endpoint
    • Test GET /tools endpoint (large response)
    • Test GET /servers endpoint
    • Test POST endpoints (request/response)
    • Verify no breaking changes in response format

Phase 4: Router Updates ✅

  • Check Router Response Classes

    • Review admin.py for explicit response_class parameters
    • Review version.py for custom JSON responses
    • Update any routers that override default response class
    • Ensure consistency across all routers
  • Update API Router

    • Verify APIRouter instances inherit default response class
    • Check if any endpoints explicitly use JSONResponse
    • Replace explicit JSONResponse with ORJSONResponse if needed
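
The router review above could be partly automated with a source scan. The sketch below uses only the standard library; the function names and the sample text are hypothetical, and a regex scan is an approximation (it will not follow aliased imports such as `import JSONResponse as JR`).

```python
import re
from pathlib import Path

# Matches the bare name JSONResponse (import, response_class=..., or a call),
# but not ORJSONResponse: the lookbehind rejects a preceding word character.
PATTERN = re.compile(r"(?<!\w)JSONResponse\b")


def find_explicit_json_responses(source: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each code line naming JSONResponse."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip full-line comments
        if PATTERN.search(line):
            hits.append((number, stripped))
    return hits


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root; keys are paths with at least one hit."""
    results = {}
    for path in Path(root).rglob("*.py"):
        hits = find_explicit_json_responses(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results


# Hypothetical router source used only to exercise the scanner.
sample = '''
from fastapi.responses import JSONResponse
@router.get("/version", response_class=JSONResponse)
def version(): ...
return ORJSONResponse(content={"ok": True})
'''
for number, text in find_explicit_json_responses(sample):
    print(number, text)
```

Running `scan_tree("mcpgateway")` from the repository root would list the files that still need a manual decision.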

Phase 5: Pydantic Configuration ✅

  • Review Pydantic Model Configuration

    • Check if base models in mcpgateway/models.py need orjson config
    • Check if schema models in mcpgateway/schemas.py need orjson config
    • Decide if Pydantic orjson integration is needed (often not required)
  • Test Pydantic Serialization

    • Test Pydantic model .dict() method (model_dump() in Pydantic v2)
    • Test Pydantic model .json() method (model_dump_json() in Pydantic v2)
    • Test datetime serialization in Pydantic models
    • Test UUID serialization in Pydantic models
    • Verify no changes in serialization format

Phase 6: Testing & Validation ✅

  • Test All Endpoint Types

    • Test GET endpoints (list and detail views)
    • Test POST endpoints (create operations)
    • Test PUT/PATCH endpoints (update operations)
    • Test DELETE endpoints
    • Test error responses (4xx, 5xx)
  • Test Edge Cases

    • Test with None values in response
    • Test with empty lists and dicts
    • Test with deeply nested objects (>10 levels)
    • Test with large strings (descriptions >10KB)
    • Test with special characters and Unicode
    • Test with binary data (base64 encoded)
  • Test Data Types

    • Test datetime serialization (ISO 8601 format)
    • Test UUID serialization (string format)
    • Test Decimal serialization (if used)
    • Test Enum serialization
    • Test custom model serialization
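
One way to check "no breaking changes" for the edge cases above is to compare parsed output rather than raw bytes, since the two serializers differ in whitespace. The following is a minimal sketch under stated assumptions: the helper names and the edge-case list are illustrative, and orjson is imported optionally so the sketch also runs where it is not installed (in that case a compact stdlib serializer stands in for it).

```python
import json
from typing import Any, Callable

try:
    import orjson  # optional; the sketch falls back to stdlib if missing
except ImportError:
    orjson = None


def stdlib_dumps(obj: Any) -> bytes:
    """Baseline: stdlib json with default settings, encoded to UTF-8."""
    return json.dumps(obj).encode("utf-8")


def compact_dumps(obj: Any) -> bytes:
    """Stand-in candidate when orjson is unavailable: compact, non-ASCII kept."""
    return json.dumps(obj, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def same_parsed_output(
    obj: Any, dumps_a: Callable[[Any], bytes], dumps_b: Callable[[Any], bytes]
) -> bool:
    """True if both serializers produce JSON that parses to the same value."""
    return json.loads(dumps_a(obj)) == json.loads(dumps_b(obj))


def nested(depth: int) -> dict:
    """Build a dict nested `depth` levels deep."""
    node: dict = {"leaf": True}
    for level in range(depth):
        node = {f"level_{level}": node}
    return node


EDGE_CASES = [
    None,
    [],
    {},
    {"value": None},
    {"text": 'h\u00e9llo \u2713 "quoted" \\ slash'},
    {"description": "x" * 20_000},
    nested(12),
]

candidate = orjson.dumps if orjson is not None else compact_dumps
results = [same_parsed_output(case, stdlib_dumps, candidate) for case in EDGE_CASES]
print(results)
```

Types that stdlib json cannot serialize on its own (datetime, UUID, Decimal) need a separate comparison against whatever encoder the API uses today.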

Phase 7: Performance Benchmarking ✅

  • Create Benchmark Script

    • Create scripts/benchmark_json_serialization.py
    • Generate test data (100, 1000, 10000 objects)
    • Measure stdlib json.dumps() time
    • Measure orjson.dumps() time
    • Calculate speedup percentage
    • Output results in table format
  • Benchmark Large Payloads

    • Measure GET /tools with 100+ tools (stdlib json)
    • Measure GET /tools with 100+ tools (orjson)
    • Measure GET /servers with 50+ servers
    • Measure bulk export operations
    • Document results in benchmark report
  • Load Testing

    • Run wrk against GET /tools: wrk -t4 -c100 -d30s http://localhost:4444/tools
    • Record requests/second, latency percentiles (p50, p95, p99)
    • Measure CPU usage during test
    • Compare with baseline (stdlib json) if available
    • Document performance improvement
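
A possible starting point for scripts/benchmark_json_serialization.py is sketched below. The tool record shape, the function names, and the table layout are assumptions. orjson is imported optionally so the script still reports the stdlib baseline when it is missing, and the stdlib result is encoded to bytes so both sides produce the same output type.

```python
import json
import timeit
from typing import Any, Callable

try:
    import orjson  # optional; only the stdlib column is filled in without it
except ImportError:
    orjson = None


def make_tools(count: int) -> list[dict[str, Any]]:
    """Generate hypothetical tool records as test data."""
    return [
        {"id": i, "name": f"tool-{i}", "description": "d" * 200, "tags": ["a", "b"]}
        for i in range(count)
    ]


def stdlib_dumps(obj: Any) -> bytes:
    return json.dumps(obj).encode("utf-8")


def best_ms(func: Callable[[Any], Any], payload: Any, repeat: int = 5) -> float:
    """Best-of-N wall time for one call, in milliseconds."""
    return min(timeit.repeat(lambda: func(payload), number=1, repeat=repeat)) * 1000.0


def run_benchmark(sizes: tuple[int, ...] = (100, 1000, 10000)) -> list[dict[str, Any]]:
    rows = []
    for size in sizes:
        payload = make_tools(size)
        stdlib_ms = best_ms(stdlib_dumps, payload)
        orjson_ms = best_ms(orjson.dumps, payload) if orjson is not None else None
        speedup = stdlib_ms / orjson_ms if orjson_ms else None
        rows.append(
            {"objects": size, "stdlib_ms": stdlib_ms, "orjson_ms": orjson_ms, "speedup": speedup}
        )
    return rows


def format_table(rows: list[dict[str, Any]]) -> str:
    lines = [f"{'objects':>8} {'stdlib ms':>10} {'orjson ms':>10} {'speedup':>8}"]
    for row in rows:
        orjson_ms = "n/a" if row["orjson_ms"] is None else f"{row['orjson_ms']:.3f}"
        speedup = "n/a" if row["speedup"] is None else f"{row['speedup']:.2f}x"
        lines.append(
            f"{row['objects']:>8} {row['stdlib_ms']:>10.3f} {orjson_ms:>10} {speedup:>8}"
        )
    return "\n".join(lines)


rows = run_benchmark()
print(format_table(rows))
```

Best-of-N timing reduces noise from other processes. Results will vary by machine and payload shape, so the 2-3x figure should be confirmed on representative data.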

Phase 8: Documentation ✅

  • Update CLAUDE.md

    • Add section on orjson configuration
    • Document performance improvements (2-3x faster)
    • Explain ORJSONResponse class usage
    • Add troubleshooting notes
  • Update Code Comments

    • Add inline comments in ORJSONResponse class
    • Add comments in main.py explaining orjson config
    • Document any behavioral differences from stdlib json
  • Create Benchmark Documentation

    • Document benchmark results (tables and graphs)
    • Document load testing results
    • Add performance comparison charts
    • Include recommendations for optimization

Phase 9: Quality Assurance ✅

  • Code Quality

    • Run make autoflake isort black to format code
    • Run make flake8 and fix any issues
    • Run make pylint and address warnings
    • Pass make verify checks
  • Testing

    • Run make doctest test - all tests pass
    • Run make htmlcov - check code coverage
    • Add integration tests for orjson serialization
    • Verify no regression in existing tests

✅ Success Criteria

  • orjson installed and imported successfully
  • ORJSONResponse class created and tested
  • FastAPI configured to use orjson by default
  • All existing endpoints work without breaking changes
  • Pydantic models serialize correctly with orjson
  • 2-3x performance improvement measured for large payloads
  • Edge cases handled correctly (None, empty, nested, Unicode)
  • Datetime and UUID serialization works correctly
  • Benchmarks documented showing performance gains
  • No regression in API behavior or response format
  • Load testing shows measurable improvement
  • Code coverage maintained or improved

🏁 Definition of Done

  • orjson added to pyproject.toml and installed
  • ORJSONResponse class implemented in mcpgateway/utils/orjson_response.py
  • FastAPI app configured with default_response_class=ORJSONResponse
  • Pydantic models reviewed and configured (if needed)
  • All API endpoints tested and working
  • Edge cases tested (None, empty, nested, Unicode, special types)
  • Performance benchmarks completed and documented
  • Load testing shows 15-30% throughput improvement
  • Unit tests for ORJSONResponse (90%+ coverage)
  • Integration tests pass
  • Code passes make verify checks
  • Documentation updated (CLAUDE.md, code comments)
  • Benchmark report created
  • Ready for production deployment

📝 Additional Notes

🔹 orjson Features:

  • Fast: 2-3x faster than stdlib json, uses Rust implementation
  • Strict: RFC 8259 compliant, catches serialization errors early
  • Compact: Produces smaller output than stdlib json
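  • Type Support: datetime, UUID, enum, and dataclasses natively; numpy arrays with OPT_SERIALIZE_NUMPY; Pydantic models only after conversion (for example via the default callable)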
  • Options: OPT_INDENT_2, OPT_SORT_KEYS, OPT_SERIALIZE_NUMPY, OPT_NAIVE_UTC
  • Binary Output: Returns bytes directly (no string→bytes conversion)
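
The "Compact" point comes down to separators: orjson emits no whitespace between items, while plain `json.dumps` writes ", " and ": ". The stdlib-only sketch below shows the size difference on a hypothetical payload. Note that Starlette's built-in JSONResponse already passes compact separators to `json.dumps`, so the saving is relative to plain `json.dumps`, not necessarily to current API responses.

```python
import json

# Hypothetical payload; any list of small dicts shows the same effect.
payload = [{"id": i, "name": f"tool-{i}", "enabled": True} for i in range(100)]

default_text = json.dumps(payload)                         # ", " and ": " separators
compact_text = json.dumps(payload, separators=(",", ":"))  # no whitespace, like orjson

saved = len(default_text) - len(compact_text)
print(f"default={len(default_text)}B compact={len(compact_text)}B saved={saved}B")
```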

🔹 Performance Comparison (typical):

  • stdlib json.dumps: 100ms for 1000 objects
  • ujson.dumps: 65ms for 1000 objects (35% less time)
  • orjson.dumps: 35ms for 1000 objects (about 2.9x faster)
  • Serialization: 2-3x faster
  • Deserialization: 1.5-2x faster
  • Memory usage: 30-40% less memory allocation

🔹 Compatibility Notes:

  • orjson returns bytes, not str (handled automatically by ORJSONResponse)
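  • Slightly stricter than stdlib (non-str dict keys raise unless OPT_NON_STR_KEYS is set; integers outside the 64-bit range are rejected)
  • Datetime handling differs: stdlib json raises TypeError on datetime objects, while orjson serializes them as RFC 3339 strings
  • May need to adjust existing datetime handling if naive datetimes are used (they are emitted without a UTC offset unless OPT_NAIVE_UTC is set)
  • Does not support json.JSONEncoder subclasses (the cls argument); pass a default callable instead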
  • Sort keys with OPT_SORT_KEYS if deterministic output needed
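
The `default` callable mentioned above is called for any object the serializer cannot handle natively, and it works the same way in `json.dumps` and `orjson.dumps`. The sketch below is a minimal illustration: the name `json_default`, the choice of types, and the stdlib fallback (used only when orjson is not installed) are assumptions.

```python
import json
from decimal import Decimal
from typing import Any

try:
    import orjson  # optional; stdlib json is used if it is missing
except ImportError:
    orjson = None


def json_default(obj: Any) -> Any:
    """Fallback for types the serializer cannot handle natively."""
    if isinstance(obj, Decimal):
        return str(obj)      # keep exact digits; float() could lose precision
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)   # deterministic order; assumes comparable members
    # Raising TypeError tells the serializer the type is unsupported.
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


def dumps(obj: Any) -> bytes:
    """Serialize to bytes with orjson when available, else stdlib json."""
    if orjson is not None:
        return orjson.dumps(obj, default=json_default)
    return json.dumps(obj, default=json_default, separators=(",", ":")).encode("utf-8")


encoded = dumps({"price": Decimal("19.99"), "tags": {"b", "a"}})
print(encoded)
```

Keeping the fallback in one function means the same conversions apply wherever the project serializes JSON, which helps keep response formats stable during the migration.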

🔹 When Performance Matters Most:

  • Large list endpoints (GET /tools, GET /servers, GET /gateways) with 100+ items
  • Bulk export operations (export 1000+ entities)
  • Federation sync operations (tool catalog exchange)
  • Admin UI data loading (large tables, many records)
  • High-throughput API scenarios (1000+ req/s)
  • Real-time updates (SSE with frequent JSON events)

🔹 Datetime Serialization:

  • orjson serializes datetime to RFC 3339 format (a profile of ISO 8601)
  • Naive datetimes are serialized without a UTC offset by default
  • Use the OPT_NAIVE_UTC option to treat naive datetimes as UTC and emit +00:00
  • Example: datetime(2025, 1, 1, 12, 0) → "2025-01-01T12:00:00" by default, or "2025-01-01T12:00:00+00:00" with OPT_NAIVE_UTC
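
An alternative to relying on a serializer option is to normalize datetimes before they reach the serializer, so the output carries an explicit offset regardless of configuration. This stdlib-only sketch assumes that naive timestamps in the application represent UTC; the helper name `ensure_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ensure_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; convert aware ones to UTC."""
    if value.tzinfo is None:
        # Assumption: naive values are already in UTC.
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


naive = datetime(2025, 1, 1, 12, 0)
aware = datetime(2025, 1, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))

print(naive.isoformat())              # no offset
print(ensure_utc(naive).isoformat())  # explicit +00:00
print(ensure_utc(aware).isoformat())  # converted to UTC
```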

🔹 Migration Checklist:

  • ✅ Install orjson dependency
  • ✅ Create ORJSONResponse class
  • ✅ Configure FastAPI default response class
  • ✅ Test all endpoints for compatibility
  • ✅ Test datetime/UUID serialization
  • ✅ Benchmark performance improvement
  • ✅ Update documentation

🔗 Related Issues


📚 References

Labels

enhancement (New feature or request), performance (Performance related items)
