⚡ Performance - orjson JSON Serialization
Goal
Replace default JSON serialization with orjson for dramatically faster JSON encoding/decoding:
- Add orjson dependency to project
- Configure FastAPI to use orjson as default JSON serializer
- Update Pydantic models to leverage orjson optimizations
- Benchmark performance improvements for large payloads
- Ensure compatibility with existing API responses
This provides 2-3x faster JSON serialization/deserialization with significant improvements for large JSON responses (tool lists, server lists, bulk exports).
Why Now?
JSON serialization is a critical performance bottleneck for API servers:
- Performance: orjson is 2-3x faster than stdlib json, 1.5-2x faster than ujson
- Large Payloads: Massive improvement for responses with 100+ tools/servers/resources
- Low Overhead: Drop-in replacement, minimal code changes required
- Pydantic Compatible: Works seamlessly with FastAPI and Pydantic models
- Production Ready: Used by major companies (Reddit, Stripe, etc.)
- Type Safety: Strict RFC 8259 compliance, better error handling
📖 User Stories
US-1: API Client - Faster JSON Response Times
As an API Client
I want JSON responses to be serialized quickly
So that API calls complete faster and use less server CPU
Acceptance Criteria:
Given I am requesting a list of 500 tools via GET /tools
When the response is serialized with orjson
Then the serialization time should be 2-3x faster than stdlib json
And the response format should remain unchanged
And the Content-Type should be application/json
Given I am sending a POST request with complex JSON data
When the request is deserialized with orjson
Then parsing should be 2-3x faster than stdlib json
And Pydantic validation should work correctly
And no data should be lost or corrupted
Technical Requirements:
- Install orjson>=3.10.0
- Create ORJSONResponse class
- Configure FastAPI with default_response_class
- Verify no breaking changes in API behavior
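One way to check the "response format should remain unchanged" criterion is to compare the parsed output of both serializers rather than the raw bytes, since orjson omits the whitespace that stdlib json emits by default. The following is a minimal sketch, not the project's test suite: the payload shape and field names are hypothetical, and the code falls back to stdlib json when orjson cannot be imported so that it stays runnable.

```python
import json

try:
    import orjson  # third-party; assumed installed per the requirement above
except ImportError:  # keeps the sketch runnable where orjson is absent
    orjson = None


def stdlib_bytes(payload):
    """Serialize with stdlib json, encoded the way a JSON response body is."""
    return json.dumps(payload).encode("utf-8")


def orjson_bytes(payload):
    """Serialize with orjson, falling back to stdlib when it is unavailable."""
    if orjson is None:
        return stdlib_bytes(payload)
    return orjson.dumps(payload)


def same_parsed_json(payload):
    """True when both serializers produce bodies that parse to equal data."""
    return json.loads(stdlib_bytes(payload)) == json.loads(orjson_bytes(payload))


# Hypothetical tool records; the field names are illustrative only.
sample = {
    "tools": [
        {"id": i, "name": f"tool-{i}", "tags": ["a", "b"], "enabled": True, "note": None}
        for i in range(500)
    ]
}

print(same_parsed_json(sample))
```

Comparing parsed data instead of bytes avoids false alarms from whitespace differences, which clients parsing the JSON do not observe.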
US-2: Admin UI User - Faster Page Loads with JSON Data
As an Admin UI User
I want pages that load large datasets to be responsive
So that I can work efficiently without waiting
Acceptance Criteria:
Given I am viewing the tools page with 200+ tools
When the JSON response is generated
Then the page should load 20-40% faster than with stdlib json
And all tool data should display correctly
And no UI rendering errors should occur
Given I am performing a bulk export operation
When exporting 1000+ entities to JSON
Then the export should complete 2-3x faster
And the JSON file should be valid and complete
Technical Requirements:
- All endpoints use orjson serialization
- Large payloads (>100KB) show measurable improvement
- No breaking changes in UI JavaScript parsing
US-3: DevOps Engineer - Monitor JSON Serialization Performance
As a DevOps Engineer
I want to measure JSON serialization performance
So that I can validate the improvement from orjson
Acceptance Criteria:
Given I am running a load test against GET /tools
When I compare orjson vs stdlib json performance
Then requests/second should increase by 15-30%
And CPU usage should decrease by 10-20%
And response latency should decrease by 20-40%
Given I am monitoring production metrics
When I observe JSON serialization times
Then p95 serialization latency should be <5ms for typical responses
And no increase in error rates should occur
Technical Requirements:
- Benchmark script comparing stdlib json vs orjson
- Load testing with wrk or locust
- Metrics showing serialization time reduction
- No regression in API correctness
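The p95 criterion above can be approximated in-process before any load test is run. This is a rough sketch under stated assumptions: `p95_ms` is a hypothetical helper, the payload is synthetic, and the numbers cover serialization only, not end-to-end request latency. The orjson measurement is skipped when the package is not installed.

```python
import json
import statistics
import time


def p95_ms(serialize, payload, runs=200):
    """Return the 95th-percentile wall-clock time of serialize(payload) in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        serialize(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) yields 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples, n=100)[94]


# Hypothetical payload standing in for a typical list response.
payload = [{"id": i, "name": f"tool-{i}"} for i in range(200)]

baseline = p95_ms(lambda p: json.dumps(p).encode("utf-8"), payload)
print(f"stdlib json p95: {baseline:.3f} ms")

try:
    import orjson  # third-party; measurement skipped when not installed
except ImportError:
    orjson = None

if orjson is not None:
    print(f"orjson p95: {p95_ms(orjson.dumps, payload):.3f} ms")
```

Production monitoring would need the same timing wrapped around the real response rendering; this helper only gives a local baseline for comparison.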
🏗 Architecture
JSON Serialization Flow
graph TD
A[FastAPI Endpoint] --> B{Default Response Class}
B -->|ORJSONResponse| C[orjson.dumps]
B -->|JSONResponse| D[stdlib json.dumps]
C --> E[Binary bytes]
D --> F[String]
E --> G[No extra encoding step]
F --> H[Encode to bytes]
G --> I[HTTP Response]
H --> I
style C fill:#90EE90
style D fill:#FFB6C1
style E fill:#90EE90
style F fill:#FFB6C1
Performance Comparison
graph LR
A[Serialize 1000 Objects] --> B[stdlib json: 100ms]
A --> C[ujson: 65ms]
A --> D[orjson: 35ms]
style B fill:#FFB6C1
style C fill:#FFE4B5
style D fill:#90EE90
Implementation Example
# mcpgateway/utils/orjson_response.py
from typing import Any

import orjson
from fastapi.responses import JSONResponse


class ORJSONResponse(JSONResponse):
    """
    Custom JSON response class using orjson for faster serialization.

    orjson is 2-3x faster than stdlib json and produces more compact output.
    It handles datetime and UUID natively, and numpy types when
    OPT_SERIALIZE_NUMPY is set.
    """

    media_type = "application/json"

    def render(self, content: Any) -> bytes:
        """
        Render content to JSON bytes using orjson.

        Options:
        - OPT_NON_STR_KEYS: Allow non-string dict keys (ints, etc.)
        - OPT_SERIALIZE_NUMPY: Support numpy arrays if present
        """
        return orjson.dumps(
            content,
            option=orjson.OPT_NON_STR_KEYS | orjson.OPT_SERIALIZE_NUMPY,
        )


# mcpgateway/main.py
from fastapi import FastAPI

from mcpgateway.utils.orjson_response import ORJSONResponse

# settings and __version__ are defined elsewhere in the project.
app = FastAPI(
    title=settings.app_name,
    version=__version__,
    default_response_class=ORJSONResponse,  # Use orjson for all responses
    # ... other config
)


# Pydantic Model Configuration (if needed)
# Note: json_loads / json_dumps are Pydantic v1 Config options; Pydantic v2 removed them.
import orjson
from pydantic import BaseModel


def orjson_dumps(v, *, default):
    # orjson.dumps returns bytes; Pydantic v1 expects str, so decode.
    return orjson.dumps(v, default=default).decode()


class ToolBase(BaseModel):
    """Base Pydantic model with orjson configuration."""

    class Config:
        # Configure Pydantic to use orjson for serialization
        json_loads = orjson.loads
        json_dumps = orjson_dumps

📋 Implementation Tasks
Phase 1: Dependencies & Setup ✅
- Add orjson Dependency
  - Add orjson>=3.10.0 to pyproject.toml dependencies section
  - Run make install-dev to install orjson
  - Verify installation: python -c "import orjson; print(orjson.__version__)"
  - Verify orjson is compiled (should be fast, Rust-based)
Phase 2: Create ORJSONResponse Class ✅
- Create Custom Response Class
  - Create new file: mcpgateway/utils/orjson_response.py
  - Implement ORJSONResponse class extending JSONResponse
  - Add render() method using orjson.dumps()
  - Configure orjson options: OPT_NON_STR_KEYS | OPT_SERIALIZE_NUMPY
  - Add comprehensive docstring explaining benefits and options
- Add Unit Tests
  - Create tests/unit/mcpgateway/utils/test_orjson_response.py
  - Test serialization of simple objects (dict, list)
  - Test serialization of datetime objects
  - Test serialization of UUID objects
  - Test serialization of None values
  - Test serialization of nested objects
  - Test that response is bytes (not str)
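The unit tests listed above could start from something like the following sketch. It is an illustration only: `render` is a hypothetical stand-in that mirrors the serialization call without requiring FastAPI, and `_fallback_default` exists solely so the checks still run where orjson is not installed. The real tests would exercise ORJSONResponse itself.

```python
import json
import uuid
from datetime import datetime, timezone

try:
    import orjson  # third-party; assumed installed in the project environment
except ImportError:
    orjson = None


def _fallback_default(obj):
    """Stand-in for orjson's native datetime/UUID support when it is missing."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, uuid.UUID):
        return str(obj)
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


def render(content) -> bytes:
    """Hypothetical stand-in for the response class's serialization step."""
    if orjson is not None:
        return orjson.dumps(content, option=orjson.OPT_NON_STR_KEYS)
    return json.dumps(content, default=_fallback_default).encode("utf-8")


def test_returns_bytes():
    assert isinstance(render({"a": 1}), bytes)


def test_simple_and_none():
    assert json.loads(render({"items": [1, "two", None]})) == {"items": [1, "two", None]}


def test_datetime_and_uuid():
    when = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    ident = uuid.UUID("12345678-1234-5678-1234-567812345678")
    parsed = json.loads(render({"when": when, "id": ident}))
    assert parsed == {
        "when": "2025-01-01T12:00:00+00:00",
        "id": "12345678-1234-5678-1234-567812345678",
    }


def test_nested():
    nested = {"a": {"b": {"c": [1, {"d": None}]}}}
    assert json.loads(render(nested)) == nested


for check in (test_returns_bytes, test_simple_and_none, test_datetime_and_uuid, test_nested):
    check()
print("all checks passed")
```

The datetime case uses a timezone-aware value on purpose, so the expected string does not depend on how naive datetimes are handled.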
Phase 3: Configure FastAPI ✅
- Update FastAPI Application
  - Import ORJSONResponse in mcpgateway/main.py (top of file)
  - Update FastAPI app initialization (around line 878)
  - Add default_response_class=ORJSONResponse parameter
  - Add comment explaining orjson performance benefits
- Verify Endpoint Compatibility
  - Test GET /health endpoint
  - Test GET /tools endpoint (large response)
  - Test GET /servers endpoint
  - Test POST endpoints (request/response)
  - Verify no breaking changes in response format
Phase 4: Router Updates ✅
- Check Router Response Classes
  - Review admin.py for explicit response_class parameters
  - Review version.py for custom JSON responses
  - Update any routers that override default response class
  - Ensure consistency across all routers
- Update API Router
  - Verify APIRouter instances inherit default response class
  - Check if any endpoints explicitly use JSONResponse
  - Replace explicit JSONResponse with ORJSONResponse if needed
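The router review can be partly automated with a plain text scan. This is a sketch under the assumption that a word-boundary match is good enough to separate JSONResponse from ORJSONResponse; `find_explicit_jsonresponse` is a hypothetical helper and the sample source is invented for illustration, not copied from the gateway.

```python
import re

# \b stops the match inside "ORJSONResponse", because "R" followed by "J"
# is not a word boundary.
_PATTERN = re.compile(r"\bJSONResponse\b")


def find_explicit_jsonresponse(source):
    """Return (line_number, stripped_line) pairs that mention JSONResponse."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if _PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Invented router snippet used only to exercise the scan.
sample = '''from fastapi.responses import JSONResponse
from mcpgateway.utils.orjson_response import ORJSONResponse

@router.get("/version", response_class=JSONResponse)
async def version():
    return ORJSONResponse({"ok": True})
'''

for number, line in find_explicit_jsonresponse(sample):
    print(f"{number}: {line}")
```

To scan real files, the same function can be fed `pathlib.Path(name).read_text()` for each router module. Each hit still needs a manual look, since some endpoints may set the response class deliberately.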
Phase 5: Pydantic Configuration ✅
- Review Pydantic Model Configuration
  - Check if base models in mcpgateway/models.py need orjson config
  - Check if schema models in mcpgateway/schemas.py need orjson config
  - Decide if Pydantic orjson integration is needed (often not required)
- Test Pydantic Serialization
  - Test Pydantic model .dict() method (model_dump() in Pydantic v2)
  - Test Pydantic model .json() method (model_dump_json() in Pydantic v2)
  - Test datetime serialization in Pydantic models
  - Test UUID serialization in Pydantic models
  - Verify no changes in serialization format
Phase 6: Testing & Validation ✅
- Test All Endpoint Types
  - Test GET endpoints (list and detail views)
  - Test POST endpoints (create operations)
  - Test PUT/PATCH endpoints (update operations)
  - Test DELETE endpoints
  - Test error responses (4xx, 5xx)
- Test Edge Cases
  - Test with None values in response
  - Test with empty lists and dicts
  - Test with deeply nested objects (>10 levels)
  - Test with large strings (descriptions >10KB)
  - Test with special characters and Unicode
  - Test with binary data (base64 encoded)
- Test Data Types
  - Test datetime serialization (ISO 8601 format)
  - Test UUID serialization (string format)
  - Test Decimal serialization (if used)
  - Test Enum serialization
  - Test custom model serialization
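Decimal is the data type in this list most likely to need extra handling, because orjson does not serialize it natively and expects a `default` callable instead. The sketch below shows one possible handler. `Status`, `dumps`, and the record fields are hypothetical, and rendering Decimal as a string is an assumption about the desired wire format (float is the other common choice).

```python
import json
from decimal import Decimal
from enum import Enum

try:
    import orjson  # third-party; assumed installed in the project environment
except ImportError:
    orjson = None


class Status(Enum):  # hypothetical enum for illustration
    ACTIVE = "active"


def default(obj):
    """Convert types the serializer does not handle itself."""
    if isinstance(obj, Decimal):
        return str(obj)  # string keeps exact precision
    if isinstance(obj, Enum):
        return obj.value  # reached only on the stdlib path; orjson handles Enum natively
    raise TypeError(f"Type is not JSON serializable: {type(obj).__name__}")


def dumps(payload) -> bytes:
    """Serialize with orjson when available, otherwise with stdlib json."""
    if orjson is not None:
        return orjson.dumps(payload, default=default)
    return json.dumps(payload, default=default).encode("utf-8")


record = {"price": Decimal("19.99"), "status": Status.ACTIVE, "label": "caf\u00e9 \u2615"}
print(json.loads(dumps(record)))
```

Raising TypeError for unknown types matters: it lets the serializer report an unsupported value instead of silently emitting null.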
Phase 7: Performance Benchmarking ✅
- Create Benchmark Script
  - Create scripts/benchmark_json_serialization.py
  - Generate test data (100, 1000, 10000 objects)
  - Measure stdlib json.dumps() time
  - Measure orjson.dumps() time
  - Calculate speedup percentage
  - Output results in table format
- Benchmark Large Payloads
  - Measure GET /tools with 100+ tools (stdlib json)
  - Measure GET /tools with 100+ tools (orjson)
  - Measure GET /servers with 50+ servers
  - Measure bulk export operations
  - Document results in benchmark report
- Load Testing
  - Run wrk against GET /tools: wrk -t4 -c100 -d30s http://localhost:4444/tools
  - Record requests/second, latency percentiles (p50, p95, p99)
  - Measure CPU usage during test
  - Compare with baseline (stdlib json) if available
  - Document performance improvement
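A starting point for the benchmark script might look like the sketch below. It is not the script referenced in the task list: the record fields are synthetic, the function names are hypothetical, and the orjson column is reported as n/a when the package is not installed. Timings vary by machine, so no expected numbers are shown.

```python
import json
import timeit

try:
    import orjson  # third-party; column is skipped when not installed
except ImportError:
    orjson = None


def make_objects(count):
    """Build `count` synthetic records; the fields are illustrative only."""
    return [
        {
            "id": i,
            "name": f"tool-{i}",
            "description": "x" * 64,
            "enabled": i % 2 == 0,
            "tags": ["alpha", "beta"],
        }
        for i in range(count)
    ]


def time_ms(serialize, payload, repeat=3, number=3):
    """Best-of-`repeat` average time per call, in milliseconds."""
    best = min(timeit.repeat(lambda: serialize(payload), repeat=repeat, number=number))
    return best / number * 1000.0


def stdlib_dumps(payload):
    """stdlib json plus the UTF-8 encode a response body needs."""
    return json.dumps(payload).encode("utf-8")


def run(sizes=(100, 1000, 10000)):
    """Return (size, stdlib_ms, orjson_ms_or_None) for each payload size."""
    rows = []
    for size in sizes:
        payload = make_objects(size)
        std_ms = time_ms(stdlib_dumps, payload)
        or_ms = time_ms(orjson.dumps, payload) if orjson is not None else None
        rows.append((size, std_ms, or_ms))
    return rows


def format_table(rows):
    """Render the rows as a fixed-width text table."""
    lines = [f"{'objects':>8} {'stdlib ms':>10} {'orjson ms':>10} {'speedup':>8}"]
    for size, std_ms, or_ms in rows:
        if or_ms is None:
            lines.append(f"{size:>8} {std_ms:>10.3f} {'n/a':>10} {'n/a':>8}")
        else:
            lines.append(f"{size:>8} {std_ms:>10.3f} {or_ms:>10.3f} {std_ms / or_ms:>7.2f}x")
    return "\n".join(lines)


print(format_table(run()))
```

Taking the minimum over repeats reduces noise from other processes. Including the `.encode("utf-8")` step on the stdlib side keeps the comparison fair, since orjson already returns bytes.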
Phase 8: Documentation ✅
- Update CLAUDE.md
  - Add section on orjson configuration
  - Document performance improvements (2-3x faster)
  - Explain ORJSONResponse class usage
  - Add troubleshooting notes
- Update Code Comments
  - Add inline comments in ORJSONResponse class
  - Add comments in main.py explaining orjson config
  - Document any behavioral differences from stdlib json
- Create Benchmark Documentation
  - Document benchmark results (tables and graphs)
  - Document load testing results
  - Add performance comparison charts
  - Include recommendations for optimization
Phase 9: Quality Assurance ✅
- Code Quality
  - Run make autoflake isort black to format code
  - Run make flake8 and fix any issues
  - Run make pylint and address warnings
  - Pass make verify checks
- Testing
  - Run make doctest test (all tests pass)
  - Run make htmlcov (check code coverage)
  - Add integration tests for orjson serialization
  - Verify no regression in existing tests
✅ Success Criteria
- orjson installed and imported successfully
- ORJSONResponse class created and tested
- FastAPI configured to use orjson by default
- All existing endpoints work without breaking changes
- Pydantic models serialize correctly with orjson
- 2-3x performance improvement measured for large payloads
- Edge cases handled correctly (None, empty, nested, Unicode)
- Datetime and UUID serialization works correctly
- Benchmarks documented showing performance gains
- No regression in API behavior or response format
- Load testing shows measurable improvement
- Code coverage maintained or improved
🏁 Definition of Done
- orjson added to pyproject.toml and installed
- ORJSONResponse class implemented in mcpgateway/utils/orjson_response.py
- FastAPI app configured with default_response_class=ORJSONResponse
- Pydantic models reviewed and configured (if needed)
- All API endpoints tested and working
- Edge cases tested (None, empty, nested, Unicode, special types)
- Performance benchmarks completed and documented
- Load testing shows 15-30% throughput improvement
- Unit tests for ORJSONResponse (90%+ coverage)
- Integration tests pass
- Code passes make verify checks
- Documentation updated (CLAUDE.md, code comments)
- Benchmark report created
- Ready for production deployment
📝 Additional Notes
🔹 orjson Features:
- Fast: 2-3x faster than stdlib json, uses Rust implementation
- Strict: RFC 8259 compliant, catches serialization errors early
- Compact: Produces smaller output than stdlib json
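- Type Support: datetime, UUID, dataclasses, and enums natively; numpy arrays with OPT_SERIALIZE_NUMPY; Pydantic models only via a default handler or after conversion to dicts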
- Options: OPT_INDENT_2, OPT_SORT_KEYS, OPT_SERIALIZE_NUMPY, OPT_NAIVE_UTC
- Binary Output: Returns bytes directly (no string→bytes conversion)
🔹 Performance Comparison (typical):
- stdlib json.dumps: 100ms for 1000 objects
- ujson.dumps: 65ms for 1000 objects (35% faster)
- orjson.dumps: 35ms for 1000 objects (3x faster)
- Serialization speedup: 2-3x faster
- Deserialization speedup: 1.5-2x faster
- Memory usage: 30-40% less memory allocation
🔹 Compatibility Notes:
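- orjson returns bytes, not str (handled automatically by ORJSONResponse)
- Stricter than stdlib in some cases: non-string dict keys require OPT_NON_STR_KEYS and integers beyond 64 bits are rejected (like stdlib, it fails on circular references)
- Default behavior differs for datetime (RFC 3339 / ISO 8601 strings; stdlib json raises TypeError on datetime)
- May need to adjust existing datetime handling if naive datetimes are used, since they are emitted without a UTC offset unless OPT_NAIVE_UTC is set
- Does not support custom __json__ methods (use the default parameter instead)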
- Sort keys with OPT_SORT_KEYS if deterministic output needed
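Deterministic output is useful when serialized bytes are hashed or compared, for example in snapshot tests. The sketch below assumes a hypothetical `stable_bytes` helper. It uses OPT_SORT_KEYS when orjson is available and otherwise the stdlib settings that produce the same compact, sorted form for plain ASCII data.

```python
import json

try:
    import orjson  # third-party; assumed installed in the project environment
except ImportError:
    orjson = None


def stable_bytes(payload) -> bytes:
    """Serialize with sorted keys so equal payloads yield identical bytes."""
    if orjson is not None:
        return orjson.dumps(payload, option=orjson.OPT_SORT_KEYS)
    # stdlib equivalent: sort keys and drop the default separator whitespace
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


first = stable_bytes({"b": 1, "a": {"d": 2, "c": 3}})
second = stable_bytes({"a": {"c": 3, "d": 2}, "b": 1})

print(first == second)
print(first)
```

Sorting adds work on every call, so it is usually reserved for the places that need stable bytes rather than enabled for every response.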
🔹 When Performance Matters Most:
- Large list endpoints (GET /tools, GET /servers, GET /gateways) with 100+ items
- Bulk export operations (export 1000+ entities)
- Federation sync operations (tool catalog exchange)
- Admin UI data loading (large tables, many records)
- High-throughput API scenarios (1000+ req/s)
- Real-time updates (SSE with frequent JSON events)
🔹 Datetime Serialization:
- orjson serializes datetime to RFC 3339 format (ISO 8601); timezone-aware values include their UTC offset
- Naive datetimes are serialized without an offset by default
- Use OPT_NAIVE_UTC option to treat naive datetimes as UTC
- Example: datetime(2025, 1, 1, 12, 0) → "2025-01-01T12:00:00" by default, or "2025-01-01T12:00:00+00:00" with OPT_NAIVE_UTC
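To make the offset explicit regardless of serializer options, naive datetimes can be normalized before they reach the serializer. This is a small sketch, assuming that naive values in the application are meant to be UTC; `as_utc` is a hypothetical helper, and the orjson calls at the end run only when the package is installed.

```python
from datetime import datetime, timedelta, timezone


def as_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; convert aware ones to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


naive = datetime(2025, 1, 1, 12, 0)
aware = datetime(2025, 1, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))

print(as_utc(naive).isoformat())
print(as_utc(aware).isoformat())

try:
    import orjson  # third-party; comparison skipped when not installed
except ImportError:
    orjson = None

if orjson is not None:
    # Compare the three forms side by side: raw naive, OPT_NAIVE_UTC, pre-normalized.
    print(orjson.dumps(naive))
    print(orjson.dumps(naive, option=orjson.OPT_NAIVE_UTC))
    print(orjson.dumps(as_utc(naive)))
```

Normalizing at the boundary keeps the wire format independent of which options a given response class happens to pass.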
🔹 Migration Checklist:
- ✅ Install orjson dependency
- ✅ Create ORJSONResponse class
- ✅ Configure FastAPI default response class
- ✅ Test all endpoints for compatibility
- ✅ Test datetime/UUID serialization
- ✅ Benchmark performance improvement
- ✅ Update documentation
🔗 Related Issues
- Part of Performance Optimization initiative
- Related to [Feature]🔐 Configurable Password Expiration with Forced Password Change on Login #1282 (Compression) - smaller JSON with compression
- Related to Epic 5 (Redis Caching) - caching reduces serialization needs
- Related to Epic 6 (Production Tuning) - overall performance optimization