Conversation

@kevalmahajan (Member) commented Oct 13, 2025

📌 Summary

Previously, the LLM chat system operated with a single Gunicorn worker. With the recent integration of Redis, session management, configuration storage, chat history, and other data are now centralized, enabling support for multiple worker processes and improved scalability.

  • Handles multiple users concurrently.
  • Full session, configurations, and chat history isolation per user.
  • Designed for multi-worker, distributed environments.
  • Prevents race conditions via Redis locks + TTLs.

🔧 Detailed Implementation

Class: ChatHistoryManager
Purpose: Centralized chat history management. Manages per-user chat histories with a Redis backend and an in-memory fallback.

Storage Modes:
Redis: For distributed, multi-worker environments (shared state).
In-memory (Python dict): For single-worker or Redis-unavailable scenarios.

Initialization Parameters:
redis: Optional Redis async client.
max_messages: Max messages to store per user (default 50).
ttl: Time-to-live for Redis entries (default 3600 sec).

Key Methods:
get_history(): Fetches chat history (handles JSON decode errors).
save_history(): Saves and trims history to the max message count.
append_message(): Adds a message (read-modify-write).
clear_history(): Deletes a user's chat history.
get_langchain_messages(): Converts stored dicts to LangChain message objects.

Storage Key Format:
Redis keys: chat:history:{user_id} (ensures user isolation).
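
Below is a minimal sketch of how such a manager could look, assuming a redis.asyncio-style client. The method and parameter names follow the PR description (with underscores restored), but the exact signatures are assumptions, and get_langchain_messages() is omitted for brevity.

```python
import json
from typing import Any, Dict, List, Optional


class ChatHistoryManager:
    """Per-user chat history with a Redis backend and an in-memory fallback (sketch)."""

    def __init__(self, redis: Optional[Any] = None, max_messages: int = 50, ttl: int = 3600):
        self.redis = redis                         # async Redis client, or None
        self.max_messages = max_messages           # per-user history cap
        self.ttl = ttl                             # seconds before a Redis entry expires
        self._memory: Dict[str, List[Dict]] = {}   # fallback when Redis is unavailable

    def _key(self, user_id: str) -> str:
        return f"chat:history:{user_id}"           # per-user key keeps histories isolated

    async def get_history(self, user_id: str) -> List[Dict]:
        if self.redis is None:
            return list(self._memory.get(user_id, []))
        raw = await self.redis.get(self._key(user_id))
        try:
            return json.loads(raw) if raw else []
        except json.JSONDecodeError:
            return []                              # corrupt entry: start from an empty history

    async def save_history(self, user_id: str, messages: List[Dict]) -> None:
        messages = messages[-self.max_messages:]   # trim to the most recent messages
        if self.redis is None:
            self._memory[user_id] = messages
        else:
            await self.redis.set(self._key(user_id), json.dumps(messages), ex=self.ttl)

    async def append_message(self, user_id: str, message: Dict) -> None:
        history = await self.get_history(user_id)  # read-modify-write
        history.append(message)
        await self.save_history(user_id, history)

    async def clear_history(self, user_id: str) -> None:
        if self.redis is None:
            self._memory.pop(user_id, None)
        else:
            await self.redis.delete(self._key(user_id))
```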

Concurrent User Management:
Session Management Components:
active_sessions: Maps user_id ➝ MCPChatService (local only).
user_configs: Maps user_id ➝ configuration (local only).
Redis keys for distributed coordination:
user:config:{user_id}: Serialized config.
active:session:{user_id}: Current session owner (stores WORKER_ID).
session:lock:{user_id}: Prevents race conditions during init.
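
A compact sketch of these components in code. The key formats come from the PR description; the helper function names are purely illustrative.

```python
# Local, per-worker state (not shared across processes)
active_sessions: dict = {}   # user_id -> MCPChatService instance owned by this worker
user_configs: dict = {}      # user_id -> configuration dict cached locally


def config_key(user_id: str) -> str:
    return f"user:config:{user_id}"        # serialized per-user configuration


def session_key(user_id: str) -> str:
    return f"active:session:{user_id}"     # which WORKER_ID currently owns the session


def lock_key(user_id: str) -> str:
    return f"session:lock:{user_id}"       # short-lived lock held during session init
```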

Worker Coordination:
WORKER_ID: Uniquely identifies worker (from ENV, HOSTNAME, or PID).

  • Session Ownership Logic (get_active_session()), sketched in code after this list:
  • If the current worker owns the session → refresh its TTL and return it.
  • If the session is missing locally but owned by this worker → lock and recreate it from the stored config.
  • If there is no owner → lock and create a new session.
  • If another worker owns it → wait briefly for the session to become available.
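
A rough sketch of this ownership flow, assuming a redis.asyncio client created with decode_responses=True; create_session_from_config() is a hypothetical helper standing in for rebuilding an MCPChatService from the stored config, and the actual implementation may differ.

```python
import asyncio
import os

WORKER_ID = os.getenv("WORKER_ID") or os.getenv("HOSTNAME") or str(os.getpid())
SESSION_TTL = 3600   # seconds a session ownership record stays valid
LOCK_TTL = 30        # seconds an init lock is held at most
LOCK_RETRIES = 10    # how many times to re-check before giving up


async def get_active_session(redis, user_id: str, active_sessions: dict):
    for _ in range(LOCK_RETRIES):
        owner = await redis.get(f"active:session:{user_id}")

        # 1. This worker owns the session and still has it locally: refresh TTL, return it.
        if owner == WORKER_ID and user_id in active_sessions:
            await redis.expire(f"active:session:{user_id}", SESSION_TTL)
            return active_sessions[user_id]

        # 2./3. No owner, or owned here but the local object is gone: lock and (re)create.
        if owner is None or owner == WORKER_ID:
            if await redis.set(f"session:lock:{user_id}", WORKER_ID, nx=True, ex=LOCK_TTL):
                try:
                    config = await redis.get(f"user:config:{user_id}")
                    session = await create_session_from_config(config)  # hypothetical helper
                    active_sessions[user_id] = session
                    await redis.set(f"active:session:{user_id}", WORKER_ID, ex=SESSION_TTL)
                    return session
                finally:
                    await redis.delete(f"session:lock:{user_id}")

        # 4. Another worker owns the session (or holds the lock): wait briefly, then retry.
        await asyncio.sleep(0.5)

    raise RuntimeError(f"Could not obtain an active session for user {user_id}")
```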

Per-User Configuration:
Each user can have:

  • Different MCP server settings (URLs, protocols, auth tokens).
  • Custom LLM providers (OpenAI, Azure, Anthropic, etc.).

The configuration is built from the /llmchat/connect request with environment-variable fallbacks, then stored both locally and in Redis (see the sketch below).
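
One plausible shape for this, where request values take precedence and environment variables fill the gaps; the field and env-var names here are illustrative, not the project's actual ones.

```python
import json
import os


def build_user_config(request_body: dict) -> dict:
    # Request values win; environment variables act as fallbacks.
    return {
        "mcp_server_url": request_body.get("mcp_server_url") or os.getenv("MCP_SERVER_URL"),
        "llm_provider": request_body.get("llm_provider") or os.getenv("LLM_PROVIDER", "openai"),
        "api_key": request_body.get("api_key") or os.getenv("LLM_API_KEY"),
    }


async def store_user_config(redis, user_id: str, config: dict,
                            user_configs: dict, ttl: int = 3600) -> None:
    user_configs[user_id] = config                                          # local cache
    await redis.set(f"user:config:{user_id}", json.dumps(config), ex=ttl)   # shared copy
```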

Chat History Isolation:
ChatHistoryManager uses the per-user Redis key chat:history:{user_id}, which ensures strict user isolation even when Redis is shared.

Session Lifecycle:

/connect:
Ends any existing session → creates a new one with a fresh config.

/disconnect:
Cleans up: shuts down service, clears history, removes Redis keys.
Redis ownership TTL ensures recovery from crashes (session becomes available after expiry).
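
A sketch of the /disconnect cleanup path under the same assumptions; the MCPChatService shutdown method and the ChatHistoryManager interface are taken from the description above, not from the actual code.

```python
async def disconnect_user(redis, user_id: str, active_sessions: dict,
                          user_configs: dict, history: "ChatHistoryManager") -> None:
    # Shut down the local service instance, if this worker has one.
    service = active_sessions.pop(user_id, None)
    if service is not None:
        await service.shutdown()            # hypothetical shutdown method

    user_configs.pop(user_id, None)
    await history.clear_history(user_id)

    # Remove the coordination keys. If a worker crashes before reaching this,
    # the TTL on active:session:{user_id} still frees the session eventually.
    await redis.delete(
        f"active:session:{user_id}",
        f"user:config:{user_id}",
        f"session:lock:{user_id}",
    )
```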

🔁 Reproduction Steps

  1. Set GUNICORN_WORKERS=4, then run using make serve.
  2. Navigate to the LLM Chat tab, configure the server and LLM settings, and click Connect.
  3. The request sometimes works, but most of the time it fails because it is routed to a worker that does not have your configuration stored.

🧪 Verification

  • Lint suite: make lint
  • Unit tests: make test
  • Coverage ≥ 90 %: make coverage
  • Manual regression: no longer fails (steps / screenshots)

📐 MCP Compliance (if relevant)

  • Matches current MCP spec
  • No breaking change to MCP clients

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • No secrets/credentials committed

Signed-off-by: Keval Mahajan <[email protected]>

@crivetimihai (Member) left a comment

⚠️ CONDITIONAL APPROVAL - The PR is well-architected and solves a real problem, but I recommend:

  1. Add documentation for Redis setup requirements
  2. Consider: Adding integration tests for multi-worker scenarios
  3. Monitor: Session recreation behavior in production - it may need adjustment
  4. Consider: Exposing tunables (SESSION_TTL, LOCK_TTL) in config.py instead of just env vars

I have created a separate issue for these.

Potential Concerns ⚠️

  1. Complexity: The get_active_session() function is quite complex with multiple retry logic paths
  2. Session Recreation Logic: Auto-recreation from config could be surprising behavior
  3. Lock Timeouts: Default LOCK_TTL=30s and LOCK_RETRIES=10 may need tuning for high-load scenarios
  4. Testing: No visible unit tests for the complex coordination logic (though CI passes)
  5. Documentation: No update to .env.example or README for new Redis requirements

@crivetimihai merged commit eb7cd65 into main on Oct 14, 2025 (36 checks passed).
@crivetimihai deleted the llmchat_add_session_consistency branch on October 14, 2025 at 00:49.
@kevalmahajan (Member, Author) replied:

  1. No extra variables are required for Redis; we already have Redis configuration and cache_type selection, and I am reusing the same settings. I will implement similar handling for cache_type = "database" so it stays consistent with our previous configuration. For the other variables, e.g. TTL, I will add them to config.py along with the corresponding documentation.
  2. I will add test cases covering the new coordination logic.

Thanks!!

p4yl04d3r pushed a commit to p4yl04d3r/mcp-context-forge that referenced this pull request Nov 19, 2025
* redis Implementation for chat session consistency
* Improved chat history manager - fix chat history/context with redis for session management
* linting
* fix tool name display in tool invocations
* safe disconnect before reconnection for same user
* reconnection clears previous chat history
* web linting

Signed-off-by: Keval Mahajan <[email protected]>
Signed-off-by: p4yl04d3r <[email protected]>