Conversation

@qdaxb qdaxb commented Nov 29, 2025

Summary

  • Add an async team mode that enables external, event-driven multi-turn agent conversations
  • Implement WAITING status for subtasks with webhook-based session recovery
  • Add GitHub/GitLab webhook endpoints and generic callback mechanism for CI/CD integration

Changes

Backend

  • Add WAITING status to SubtaskStatus enum
  • Add new fields to Subtask model: waiting_for, waiting_since, waiting_timeout, resume_count, max_resume_count
  • Create database migration for new subtask fields
  • Implement webhook endpoints:
    • POST /api/webhooks/github - GitHub check_run/workflow_run events
    • POST /api/webhooks/gitlab - GitLab Pipeline/MR hooks
    • POST /api/webhooks/callback/{subtask_id}/{token} - Generic callback
  • Add AsyncResumeService for managing session recovery workflow
  • Add configuration options for async mode
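
Before acting on any payload, the GitHub webhook endpoint above must verify the request signature (GitHub signs payloads with HMAC-SHA256 and sends the digest in the X-Hub-Signature-256 header). A minimal sketch of that check, assuming the secret comes from the GITHUB_WEBHOOK_SECRET setting; the function name and wiring are illustrative, not the PR's actual code:

```python
# Illustrative sketch of GitHub webhook signature verification; function name
# and secret handling are assumptions, not the PR's exact implementation.
import hashlib
import hmac

def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header against an HMAC-SHA256 of the raw body."""
    if not signature_header.startswith("sha256="):
        return False
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids timing side channels when comparing secrets
    return hmac.compare_digest(f"sha256={expected}", signature_header)

body = b'{"action": "completed"}'
secret = "example-github-webhook-secret"  # stands in for GITHUB_WEBHOOK_SECRET
sig = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
print(verify_github_signature(secret, body, sig))         # True
print(verify_github_signature(secret, body, "sha256=00")) # False
```

Verification must run against the raw request body, before JSON parsing, since any re-serialization can change the bytes and break the digest.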

Executor

  • Add output_parser.py for detecting waiting signals (git push, PR/MR creation)
  • Integrate output parsing into response_processor for automatic WAITING state detection
  • Add WAITING status to shared TaskStatus enum
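
The waiting-signal detection in output_parser.py can be pictured with a sketch like the following; the patterns, confidence values, and return shape here are illustrative assumptions, not the module's real contents:

```python
# Hypothetical sketch of waiting-signal detection via regex plus a confidence
# threshold; patterns and scores are assumptions, not the actual output_parser.py.
import re
from typing import Optional

PATTERNS = [
    (re.compile(r"git push\b.*\borigin\b", re.IGNORECASE), "ci_pipeline", 0.9),
    (re.compile(r"(pull request|merge request) .*created", re.IGNORECASE), "ci_pipeline", 0.85),
    (re.compile(r"awaiting (approval|review)", re.IGNORECASE), "approval", 0.8),
]

def detect_waiting_signal(output: str, threshold: float = 0.7) -> Optional[str]:
    """Return the event type to wait for, or None if no pattern clears the threshold."""
    best_confidence, best_event = 0.0, None
    for pattern, event, confidence in PATTERNS:
        if pattern.search(output) and confidence > best_confidence:
            best_confidence, best_event = confidence, event
    return best_event if best_confidence >= threshold else None

print(detect_waiting_signal("Ran: git push origin feature/async"))  # ci_pipeline
print(detect_waiting_signal("All tests passed locally"))            # None
```

Keeping a single highest-confidence winner (rather than returning every match) mirrors the point raised later in review: detection should be accurate without being overly sensitive.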

Frontend

  • Add async to TeamMode type (ClaudeCode only, single-bot scenario)
  • Add WAITING to TaskStatus type
  • Update task context to handle WAITING status in refresh logic

Test Plan

  • Verify database migration runs successfully
  • Test webhook endpoint signature validation for GitHub/GitLab
  • Test output parser detection for various git push/PR patterns
  • Verify WAITING state transition and resume workflow
  • Test max_resume_count limit enforcement
  • Test waiting timeout handling
  • Verify frontend displays WAITING status correctly
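
The waiting-timeout item above reduces to a time comparison; a hedged sketch, assuming waiting_timeout is stored in seconds alongside a waiting_since timestamp (the real service reads these from the database):

```python
# Illustrative timeout check for WAITING subtasks; field semantics (seconds,
# naive datetimes) are assumptions for this sketch.
from datetime import datetime, timedelta

def is_waiting_timed_out(waiting_since: datetime, waiting_timeout: int, now: datetime) -> bool:
    """True when a WAITING subtask has waited longer than its timeout (seconds)."""
    return now - waiting_since > timedelta(seconds=waiting_timeout)

since = datetime(2025, 11, 29, 12, 0, 0)
print(is_waiting_timed_out(since, 3600, datetime(2025, 11, 29, 12, 30, 0)))  # False
print(is_waiting_timed_out(since, 3600, datetime(2025, 11, 29, 14, 0, 0)))   # True
```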

Summary by CodeRabbit

New Features

  • Introduced async mode for teams—agents can pause execution and wait for external events
  • Subtasks now support waiting for external events (CI pipelines, approvals, external APIs) with automatic resumption
  • GitHub and GitLab webhook integration automatically resumes waiting tasks when CI pipelines complete
  • Generic webhook callback endpoint enables manual task resumption from external systems
  • Added WAITING status to track tasks awaiting external events
  • Tasks that timeout while waiting are automatically marked as failed


This commit implements the async team mode feature which enables external
event-driven multi-turn agent conversations. Key changes include:

Backend:
- Add WAITING status to SubtaskStatus enum
- Add new fields to Subtask model: waiting_for, waiting_since, waiting_timeout,
  resume_count, max_resume_count
- Create database migration for new subtask fields
- Implement GitHub/GitLab webhook endpoints for CI/CD event reception
- Implement generic callback endpoint for custom integrations
- Add AsyncResumeService for managing session recovery
- Add configuration options: ASYNC_MODE_ENABLED, DEFAULT_MAX_RESUME_COUNT,
  DEFAULT_WAITING_TIMEOUT, GITHUB_WEBHOOK_SECRET, GITLAB_WEBHOOK_TOKEN

Executor:
- Add output_parser.py for detecting waiting signals (git push, PR/MR creation)
- Integrate output parsing into response_processor for automatic WAITING detection
- Add WAITING status to shared TaskStatus enum

Frontend:
- Add 'async' to TeamMode type
- Add WAITING to TaskStatus type
- Update team mode agent filters to support async mode (ClaudeCode only)
- Update task context to handle WAITING status in refresh logic

The async mode allows agents to:
1. Execute tasks and detect CI-triggering actions
2. Enter WAITING state when external events are expected
3. Resume execution when webhooks arrive with CI results
4. Support multiple resume cycles with configurable limits
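
The lifecycle in steps 1-4 can be sketched as a small state machine; the field names mirror the Subtask model described above, while the logic itself is illustrative:

```python
# Minimal sketch of the WAITING/resume lifecycle with max_resume_count
# enforcement; status strings and fields follow the PR's model, the control
# flow here is an assumption for illustration.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Subtask:
    status: str = "RUNNING"
    waiting_for: Optional[str] = None
    resume_count: int = 0
    max_resume_count: int = 3

def enter_waiting(task: Subtask, event_type: str) -> None:
    """Pause the task until an external event of the given type arrives."""
    task.status = "WAITING"
    task.waiting_for = event_type

def resume_from_webhook(task: Subtask) -> bool:
    """Resume a WAITING task, failing it once the resume budget is exhausted."""
    if task.status != "WAITING":
        return False
    if task.resume_count >= task.max_resume_count:
        task.status = "FAILED"
        return False
    task.resume_count += 1
    task.waiting_for = None
    task.status = "RUNNING"
    return True

task = Subtask(max_resume_count=1)
enter_waiting(task, "ci_pipeline")
print(resume_from_webhook(task), task.status)  # True RUNNING
enter_waiting(task, "ci_pipeline")
print(resume_from_webhook(task), task.status)  # False FAILED
```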

coderabbitai bot commented Nov 29, 2025

Walkthrough

This PR introduces async-mode support for subtasks, enabling external event-driven resumption via webhooks. It adds database schema changes, webhook handlers (GitHub, GitLab, generic callback), an async resume service, waiting-signal detection in agent output, and a new team mode option.

Changes

  • Database Migration
    backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py
    Adds WAITING status to the SubtaskStatus enum; introduces columns waiting_for, waiting_since, waiting_timeout, resume_count, max_resume_count; creates an index on (status, waiting_for); provides a guarded downgrade path
  • Models & Schemas
    backend/app/models/subtask.py, backend/app/schemas/subtask.py
    Adds the WAITING enum member to SubtaskStatus; introduces five new async-mode columns/fields (waiting_for, waiting_since, waiting_timeout, resume_count, max_resume_count) across the model and schema layers
  • Configuration
    backend/app/core/config.py
    Adds five new settings: ASYNC_MODE_ENABLED, DEFAULT_MAX_RESUME_COUNT, DEFAULT_WAITING_TIMEOUT, GITHUB_WEBHOOK_SECRET, GITLAB_WEBHOOK_TOKEN
  • Webhook Package & Endpoints
    backend/app/api/endpoints/webhooks/__init__.py, backend/app/api/endpoints/webhooks/github.py, backend/app/api/endpoints/webhooks/gitlab.py, backend/app/api/endpoints/webhooks/callback.py
    Creates the webhooks package with router exports; implements the GitHub webhook (signature verification; check_run/workflow_run/pull_request events), the GitLab webhook (token verification; pipeline/merge-request/job events), and a generic callback endpoint with HMAC-based token generation/verification
  • Async Resume Service
    backend/app/services/async_resume_service.py
    New service managing subtask waiting-state transitions, webhook-driven resumption, resume counting, timeout checking, and notifications; integrates with the database and notification systems
  • API Router Registration
    backend/app/api/api.py
    Registers the GitHub, GitLab, and callback webhook routers under the /webhooks path
  • Agent Output Parsing & Integration
    executor/agents/claude_code/output_parser.py, executor/agents/claude_code/response_processor.py
    New output parser module detects waiting signals (CI_PIPELINE, APPROVAL, EXTERNAL_API) via regex patterns and confidence scoring; the response processor uses this detection to transition the agent workbench to WAITING on a signal match
  • Frontend Team Mode & Status
    frontend/src/features/settings/components/team-modes/index.ts, frontend/src/features/settings/components/team-modes/types.ts, frontend/src/types/api.ts
    Adds 'async' to the TeamMode union, maps it to the ClaudeCode agent only, and updates the TaskStatus enum to include 'WAITING'
  • Frontend Task Refresh
    frontend/src/features/tasks/contexts/taskContext.tsx
    Adds a comment noting that WAITING tasks may require periodic refresh due to external webhook events
  • Shared Status
    shared/status.py
    Adds the WAITING member to the TaskStatus enum

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent<br/>(Executor)
    participant Parser as Output<br/>Parser
    participant Processor as Response<br/>Processor
    participant Service as Async<br/>Resume Service
    participant DB as Database
    participant Notify as Notification<br/>Service

    Agent->>Parser: output with waiting signal
    Parser->>Parser: detect_waiting_signal()<br/>pattern match + confidence
    alt Signal Detected & Above Threshold
        Parser-->>Processor: WaitingSignal
        Processor->>Service: set_waiting_state(subtask_id)
        Service->>DB: UPDATE subtask<br/>status=WAITING<br/>waiting_for, waiting_since
        DB-->>Service: ✓
        Service->>Notify: _send_notification<br/>event="subtask_waiting"
        Notify-->>Service: ✓
        Processor-->>Agent: return RUNNING<br/>(keep processing)
    else No Signal
        Processor-->>Agent: return COMPLETED
    end
sequenceDiagram
    participant Webhook as External<br/>Webhook<br/>(GitHub/GitLab)
    participant Handler as Webhook<br/>Handler
    participant Service as Async<br/>Resume Service
    participant DB as Database
    participant Notify as Notification<br/>Service

    Webhook->>Handler: POST event<br/>(repo, branch, status)
    Handler->>Handler: verify_signature()<br/>or verify_token()
    alt Verification Failed
        Handler-->>Webhook: 401 Unauthorized
    else Verification Success
        Handler->>Handler: extract_event_info()
        Handler->>Service: resume_from_webhook<br/>(repo, branch, payload)
        Service->>DB: SELECT * FROM subtasks<br/>WHERE status=WAITING<br/>AND waiting_for=ci_pipeline
        DB-->>Service: [subtasks]
        loop For Each Matching Subtask
            Service->>Service: match task by repo/branch<br/>(case-insensitive)
            alt Match Found
                Service->>Service: _resume_subtask()
                Service->>DB: UPDATE subtask<br/>status=RUNNING<br/>resume_count++<br/>clear waiting_*
                DB-->>Service: ✓
                Service->>Notify: _send_notification<br/>event="subtask_resumed"
                Notify-->>Service: ✓
            end
        end
        Handler-->>Webhook: 200 OK<br/>resumed_count
    end
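
The repo/branch matching step in the diagram above can be sketched as follows; the real AsyncResumeService queries the database for WAITING subtasks, and the field names here are assumptions for illustration:

```python
# Illustrative sketch of case-insensitive repo/branch matching of waiting
# subtasks against a webhook event; in the service this operates on rows
# fetched from the database, not an in-memory list.
from dataclasses import dataclass
from typing import List

@dataclass
class WaitingSubtask:
    id: int
    repo: str
    branch: str

def match_waiting_subtasks(
    waiting: List[WaitingSubtask], repo: str, branch: str
) -> List[WaitingSubtask]:
    """Return the waiting subtasks whose repo and branch match, ignoring case."""
    repo_l, branch_l = repo.lower(), branch.lower()
    return [
        t for t in waiting
        if t.repo.lower() == repo_l and t.branch.lower() == branch_l
    ]

waiting = [
    WaitingSubtask(1, "wecode-ai/Wegent", "feature/async"),
    WaitingSubtask(2, "other/repo", "main"),
]
matched = match_waiting_subtasks(waiting, "Wecode-AI/wegent", "FEATURE/ASYNC")
print([t.id for t in matched])  # [1]
```

Case-insensitive comparison matters because Git hosts are inconsistent about repository-name casing between the API payload and what users configure.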

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Key areas requiring attention:
    • backend/app/services/async_resume_service.py: Dense logic with multiple async operations, state transitions, database interactions, and notification handling
    • executor/agents/claude_code/output_parser.py: Confidence calculation heuristics and regex pattern matching logic; ensure signal detection is accurate and not overly sensitive
    • backend/app/api/endpoints/webhooks/github.py and gitlab.py: Webhook signature/token verification logic; event extraction and filtering; ensure error handling covers edge cases
    • backend/app/api/endpoints/webhooks/callback.py: Token generation and verification with HMAC; validate token security and uniqueness
    • Database migration (2a3b4c5d6e7f_add_subtask_async_mode_fields.py): Verify upgrade/downgrade SQL correctness, particularly enum transitions and data cleanup
    • Integration with response_processor.py: Ensure waiting-signal detection doesn't interfere with normal result processing flow
    • Settings and enum consistency across backend/frontend and shared modules

Poem

🐰 Whiskers twitch with async delight,
Webhooks flutter through the night,
Waiting signals dance and glow,
Subtasks pause, then resume their flow,
External events now guide our way,
Async hopping brightens the day! 🌙✨

Pre-merge checks

✅ Passed checks (3 passed)
  • Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: Passed. The title clearly and specifically summarizes the main addition: async team mode for external event-driven multi-turn conversations, the core feature across the backend, executor, and frontend changes.
  • Docstring Coverage: Passed. Docstring coverage is 96.00%, above the required 80.00% threshold.


Use absolute import path 'executor.agents.claude_code.output_parser'
instead of relative 'agents.claude_code.output_parser' to avoid
circular import through agents/__init__.py.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (14)
executor/agents/claude_code/response_processor.py (1)

496-502: Use TaskStatus.WAITING.value for consistency.

The status string should use the enum value for type safety and consistency with other status reports in this file (e.g., TaskStatus.COMPLETED.value on line 530, TaskStatus.FAILED.value on line 549).

             state_manager.report_progress(
                 progress=50,  # Not fully complete yet
-                status="WAITING",
+                status=TaskStatus.WAITING.value,
                 message=f"Waiting for {waiting_for}",
                 extra_result=result_dict
             )
executor/agents/claude_code/output_parser.py (3)

43-71: Consider extracting magic numbers for confidence thresholds.

The base confidence values (0.7-0.95) are inline. For maintainability, consider defining named constants like HIGH_CONFIDENCE = 0.9 for commonly used thresholds. However, this is optional given the current scope.


166-181: Consider documenting the asymmetric confidence adjustment.

Positive indicators add 0.05 while negative indicators subtract 0.1. This conservative approach is sensible but worth a brief comment explaining the rationale (e.g., "penalize negatives more heavily to reduce false positives").


1-6: Remove unnecessary shebang for library module.

This file is a library module, not an executable script. The shebang on line 1 is unnecessary and flagged by static analysis (EXE001).

-#!/usr/bin/env python
 # SPDX-FileCopyrightText: 2025 Weibo, Inc.
 #
 # SPDX-License-Identifier: Apache-2.0
frontend/src/features/settings/components/team-modes/index.ts (1)

13-13: Duplicate TeamMode definition - consolidate with types.ts.

This duplicates the TeamMode type already exported from types.ts (line 7). Since line 11 exports everything from types.ts, consumers could import TeamMode from either location, risking inconsistency.

Remove this duplicate and rely on the re-export from types.ts:

-export type TeamMode = 'solo' | 'pipeline' | 'route' | 'coordinate' | 'collaborate' | 'async';
backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py (1)

37-66: Consider using Alembic's op.add_column for better cross-database support.

Raw SQL with ADD COLUMN IF NOT EXISTS is MySQL 8.0+ specific. Using Alembic operations provides:

  • Cross-database compatibility
  • Better migration tracking
  • Type safety
-    op.execute("""
-    ALTER TABLE subtasks
-    ADD COLUMN IF NOT EXISTS waiting_for VARCHAR(50) NULL COMMENT 'Event type being waited for (e.g., ci_pipeline, approval)'
-    """)
+    op.add_column('subtasks', sa.Column('waiting_for', sa.String(50), nullable=True,
+        comment='Event type being waited for (e.g., ci_pipeline, approval)'))

However, op.add_column will fail if the column exists. If idempotency is required for re-runnable migrations, document this clearly.

backend/app/api/endpoints/webhooks/__init__.py (1)

5-9: Sort __all__ for consistency.

Static analysis indicates __all__ is not sorted. Per coding guidelines, apply isort-style sorting.

-from app.api.endpoints.webhooks.github import router as github_router
-from app.api.endpoints.webhooks.gitlab import router as gitlab_router
 from app.api.endpoints.webhooks.callback import router as callback_router
+from app.api.endpoints.webhooks.github import router as github_router
+from app.api.endpoints.webhooks.gitlab import router as gitlab_router
 
-__all__ = ["github_router", "gitlab_router", "callback_router"]
+__all__ = ["callback_router", "github_router", "gitlab_router"]
backend/app/api/endpoints/webhooks/github.py (2)

10-15: Remove unused import datetime.

The datetime module is imported but not used in this file.

 import hashlib
 import hmac
 import json
 import logging
-from datetime import datetime
 from typing import Any, Dict, Optional

156-161: Improve exception handling per best practices.

Use logging.exception to include traceback and chain exceptions with raise ... from.

     # Parse payload
     try:
         payload = json.loads(body)
     except json.JSONDecodeError:
-        logger.error("Failed to parse GitHub webhook payload")
-        raise HTTPException(status_code=400, detail="Invalid JSON payload")
+        logger.exception("Failed to parse GitHub webhook payload")
+        raise HTTPException(status_code=400, detail="Invalid JSON payload") from None
backend/app/api/endpoints/webhooks/callback.py (2)

39-47: Inconsistent variable naming and deterministic token concern.

The variable SECRET_key uses mixed casing; use secret_key for PEP 8 compliance. Additionally, since tokens are deterministic and never expire, once a subtask_id is known, the token remains valid indefinitely. Consider whether this is acceptable for your security model.

 def generate_callback_token(subtask_id: int) -> str:
     """
     Generate a secure callback token for a subtask.
     This should be stored with the subtask when entering WAITING state.
     """
     # Use HMAC with SECRET key and subtask_id to generate deterministic token
-    SECRET_key = settings.SECRET_KEY.encode("utf-8")
+    secret_key = settings.SECRET_KEY.encode("utf-8")
     message = f"callback:{subtask_id}".encode("utf-8")
-    return hmac.new(SECRET_key, message, hashlib.sha256).hexdigest()[:32]
+    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()[:32]
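
One way to address the "never expires" concern, sketched under the assumption that a coarse validity window is acceptable: fold a window index into the HMAC message and accept the current and previous windows during verification. All names here are illustrative, not the PR's code:

```python
# Hedged sketch of time-bound callback tokens; SECRET_KEY, WINDOW_SECONDS, and
# the helper names are assumptions for illustration only.
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"example-secret"  # stands in for settings.SECRET_KEY
WINDOW_SECONDS = 24 * 3600      # token valid for roughly one to two days

def _token_for_window(subtask_id: int, window: int) -> str:
    message = f"callback:{subtask_id}:{window}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()[:32]

def generate_callback_token(subtask_id: int, now: Optional[float] = None) -> str:
    """Derive a token bound to the current validity window."""
    window = int((now if now is not None else time.time()) // WINDOW_SECONDS)
    return _token_for_window(subtask_id, window)

def verify_callback_token(subtask_id: int, token: str, now: Optional[float] = None) -> bool:
    """Accept the current and previous window so tokens don't expire mid-flight."""
    window = int((now if now is not None else time.time()) // WINDOW_SECONDS)
    return any(
        hmac.compare_digest(_token_for_window(subtask_id, w), token)
        for w in (window, window - 1)
    )

t = generate_callback_token(42, now=100_000.0)
print(verify_callback_token(42, t, now=100_000.0))                           # True
print(verify_callback_token(42, t, now=100_000.0 + 3 * WINDOW_SECONDS))      # False
```

A random per-subtask token stored in the database would be stronger still, at the cost of an extra column; the windowed HMAC keeps the endpoint stateless.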

84-91: Improve exception handling in JSON parsing.

Use logging.exception to capture the stack trace and chain the exception with raise ... from for better debugging.

     try:
         if body:
             payload = json.loads(body)
         else:
             payload = {}
     except json.JSONDecodeError:
-        logger.error("Failed to parse callback payload")
-        raise HTTPException(status_code=400, detail="Invalid JSON payload")
+        logger.exception("Failed to parse callback payload")
+        raise HTTPException(status_code=400, detail="Invalid JSON payload") from None
backend/app/api/endpoints/webhooks/gitlab.py (1)

139-143: Improve exception handling in JSON parsing.

Use logging.exception to capture the stack trace and chain the exception with raise ... from for better debugging. This is consistent with the pattern needed in callback.py.

     try:
         payload = json.loads(body)
     except json.JSONDecodeError:
-        logger.error("Failed to parse GitLab webhook payload")
-        raise HTTPException(status_code=400, detail="Invalid JSON payload")
+        logger.exception("Failed to parse GitLab webhook payload")
+        raise HTTPException(status_code=400, detail="Invalid JSON payload") from None
backend/app/services/async_resume_service.py (2)

262-266: Remove unnecessary f-string prefix.

Lines 265 and 363 use f-strings without any placeholders.

             # Send notification
             await self._send_notification(
                 subtask=subtask,
                 event="subtask.max_resume_reached",
-                description=f"Subtask failed: max resume count reached",
+                description="Subtask failed: max resume count reached",
             )

And on line 363:

                 await self._send_notification(
                     subtask=subtask,
                     event="subtask.waiting_timeout",
-                    description=f"Subtask failed: waiting timeout exceeded",
+                    description="Subtask failed: waiting timeout exceeded",
                 )

374-395: Improve exception handling in notification sending.

Catching broad Exception is acceptable here for resilience, but use logging.exception to include the full stack trace for debugging.

     async def _send_notification(
         self,
         subtask: Subtask,
         event: str,
         description: str,
     ) -> None:
         """Send webhook notification for subtask events."""
         try:
             notification = Notification(
                 user_name="system",
                 event=event,
                 id=str(subtask.id),
                 start_time=subtask.created_at.isoformat() if subtask.created_at else "",
                 end_time=datetime.now().isoformat(),
                 description=description,
                 status=subtask.status.value,
                 detail_url="",
             )
             await webhook_notification_service.send_notification(notification)
         except Exception as e:
-            logger.error(f"Failed to send notification: {e}")
+            logger.exception("Failed to send notification for subtask %s", subtask.id)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6959716 and ed0607f.

📒 Files selected for processing (17)
  • backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py (1 hunks)
  • backend/app/api/api.py (2 hunks)
  • backend/app/api/endpoints/webhooks/__init__.py (1 hunks)
  • backend/app/api/endpoints/webhooks/callback.py (1 hunks)
  • backend/app/api/endpoints/webhooks/github.py (1 hunks)
  • backend/app/api/endpoints/webhooks/gitlab.py (1 hunks)
  • backend/app/core/config.py (1 hunks)
  • backend/app/models/subtask.py (2 hunks)
  • backend/app/schemas/subtask.py (3 hunks)
  • backend/app/services/async_resume_service.py (1 hunks)
  • executor/agents/claude_code/output_parser.py (1 hunks)
  • executor/agents/claude_code/response_processor.py (2 hunks)
  • frontend/src/features/settings/components/team-modes/index.ts (2 hunks)
  • frontend/src/features/settings/components/team-modes/types.ts (1 hunks)
  • frontend/src/features/tasks/contexts/taskContext.tsx (1 hunks)
  • frontend/src/types/api.ts (1 hunks)
  • shared/status.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: TypeScript must be in strict mode with type hints required
Use functional components with hooks in React/TypeScript
Use Prettier formatter with single quotes and no semicolons in TypeScript/React
Use const over let, never use var in TypeScript/React
Use functional patterns over class-based patterns in React/TypeScript

Files:

  • frontend/src/features/settings/components/team-modes/types.ts
  • frontend/src/features/settings/components/team-modes/index.ts
  • frontend/src/types/api.ts
  • frontend/src/features/tasks/contexts/taskContext.tsx
frontend/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

frontend/**/*.{ts,tsx}: Run ESLint with Next.js configuration on TypeScript/React code: npm run lint
Run Prettier formatter on TypeScript/React code: npm run format
Name React components in PascalCase and component files in kebab-case
Use conditional rendering and mobile-first responsive design with Tailwind CSS in React

Files:

  • frontend/src/features/settings/components/team-modes/types.ts
  • frontend/src/features/settings/components/team-modes/index.ts
  • frontend/src/types/api.ts
  • frontend/src/features/tasks/contexts/taskContext.tsx
frontend/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use only NEXT_PUBLIC_* environment variables for client-safe values in frontend

Files:

  • frontend/src/features/settings/components/team-modes/types.ts
  • frontend/src/features/settings/components/team-modes/index.ts
  • frontend/src/types/api.ts
  • frontend/src/features/tasks/contexts/taskContext.tsx
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must be PEP 8 compliant with Black formatter (line length: 88), isort for imports, and type hints required
Run Black formatter and isort on Python code: black . && isort .
Use descriptive names for Python functions and variables
Write docstrings for public Python functions and classes
Extract magic numbers to constants in Python code
Keep Python functions to a maximum of 50 lines (preferred)
Run pylint and flake8 on Python code: pylint app/ && flake8 app/

Files:

  • backend/app/api/endpoints/webhooks/__init__.py
  • backend/app/api/api.py
  • backend/app/models/subtask.py
  • executor/agents/claude_code/response_processor.py
  • backend/app/core/config.py
  • backend/app/api/endpoints/webhooks/callback.py
  • backend/app/api/endpoints/webhooks/github.py
  • shared/status.py
  • executor/agents/claude_code/output_parser.py
  • backend/app/schemas/subtask.py
  • backend/app/api/endpoints/webhooks/gitlab.py
  • backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py
  • backend/app/services/async_resume_service.py
backend/app/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

backend/app/**/*.py: Add endpoint by creating route in app/api/, schema in app/schemas/, and logic in app/services/
Backend encrypts Git tokens before database storage

Files:

  • backend/app/api/endpoints/webhooks/__init__.py
  • backend/app/api/api.py
  • backend/app/models/subtask.py
  • backend/app/core/config.py
  • backend/app/api/endpoints/webhooks/callback.py
  • backend/app/api/endpoints/webhooks/github.py
  • backend/app/schemas/subtask.py
  • backend/app/api/endpoints/webhooks/gitlab.py
  • backend/app/services/async_resume_service.py
backend/app/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

backend/app/models/**/*.py: Create database migrations with Alembic after model changes: alembic revision --autogenerate -m 'description'
Add model by creating in app/models/ and generating migration with Alembic

Files:

  • backend/app/models/subtask.py
frontend/src/types/**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Define TypeScript types in src/types/ directory

Files:

  • frontend/src/types/api.ts
backend/alembic/versions/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

backend/alembic/versions/**/*.py: Always review auto-generated Alembic migrations before applying them
Never edit applied migrations - create new migrations instead
Keep database migrations small and focused

Files:

  • backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py
🧠 Learnings (1)
📚 Learning: 2025-11-29T16:31:31.304Z
Learnt from: CR
Repo: wecode-ai/Wegent PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T16:31:31.304Z
Learning: Applies to backend/app/**/*.py : Add endpoint by creating route in `app/api/`, schema in `app/schemas/`, and logic in `app/services/`

Applied to files:

  • backend/app/api/endpoints/webhooks/__init__.py
  • backend/app/api/api.py
🧬 Code graph analysis (7)
executor/agents/claude_code/response_processor.py (7)
executor/agents/claude_code/output_parser.py (2)
  • detect_waiting_signal (96-142)
  • detect_waiting_signal (213-224)
executor/agents/claude_code/claude_code_agent.py (1)
  • add_thinking_step (275-298)
executor/agents/agno/thinking_step_manager.py (1)
  • add_thinking_step (38-72)
executor/agents/claude_code/progress_state_manager.py (2)
  • update_workbench_status (82-101)
  • report_progress (103-135)
executor/agents/base.py (1)
  • report_progress (103-126)
frontend/src/types/api.ts (1)
  • TaskStatus (120-128)
shared/status.py (1)
  • TaskStatus (7-19)
backend/app/api/endpoints/webhooks/callback.py (1)
backend/app/services/async_resume_service.py (1)
  • resume_subtask_by_id (188-226)
frontend/src/features/settings/components/team-modes/index.ts (2)
frontend/src/features/settings/components/team-modes/types.ts (1)
  • TeamMode (7-7)
frontend/src/features/settings/components/BotEdit.tsx (1)
  • AgentType (35-35)
backend/app/api/endpoints/webhooks/github.py (2)
frontend/src/apis/client.ts (2)
  • post (86-91)
  • request (22-80)
backend/app/services/async_resume_service.py (1)
  • resume_from_webhook (94-186)
executor/agents/claude_code/output_parser.py (2)
shared/logger.py (1)
  • setup_logger (37-107)
executor_manager/scheduler/scheduler.py (1)
  • start (183-201)
backend/app/api/endpoints/webhooks/gitlab.py (1)
backend/app/services/async_resume_service.py (1)
  • resume_from_webhook (94-186)
backend/app/services/async_resume_service.py (4)
backend/app/models/subtask.py (2)
  • Subtask (32-72)
  • SubtaskStatus (17-24)
backend/app/schemas/subtask.py (1)
  • SubtaskStatus (25-32)
backend/app/services/webhook_notification.py (2)
  • Notification (18-28)
  • send_notification (96-134)
backend/app/models/kind.py (1)
  • Kind (25-42)
🪛 Ruff (0.14.6)
backend/app/api/endpoints/webhooks/__init__.py

9-9: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

backend/app/api/endpoints/webhooks/callback.py

61-61: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


90-90: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


91-91: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


118-118: Unused function argument: db

(ARG001)


118-118: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

backend/app/api/endpoints/webhooks/github.py

138-138: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


160-160: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


161-161: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

executor/agents/claude_code/output_parser.py

1-1: Shebang is present but file is not executable

(EXE001)

backend/app/api/endpoints/webhooks/gitlab.py

122-122: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


142-142: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


143-143: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

backend/app/services/async_resume_service.py

265-265: f-string without any placeholders

Remove extraneous f prefix

(F541)


363-363: f-string without any placeholders

Remove extraneous f prefix

(F541)


393-393: Do not catch blind exception: Exception

(BLE001)


394-394: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test wegent CLI Integration
  • GitHub Check: e2e-tests
🔇 Additional comments (24)
executor/agents/claude_code/response_processor.py (2)

28-28: LGTM!

Import correctly references the convenience function from the new output_parser module.


468-502: Update the docstring for update_workbench_status to document "waiting" as a valid status value.

The method correctly handles the "waiting" status by not stopping monitoring (line 96 only calls _stop_monitoring() when status == "completed"), which allows the frontend to periodically refresh WAITING tasks as intended. However, the docstring at line 84 lists only "running" | "completed" | "failed" while the code accepts "waiting" (passed at line 486 of response_processor.py). Additionally, initialize_workbench (line 66) has the same incomplete docstring. Update both docstrings to include "waiting" as a documented valid status, and add a comment explaining why "waiting" intentionally does not stop monitoring, unlike "completed."

executor/agents/claude_code/output_parser.py (5)

24-38: LGTM!

Well-structured enum and dataclass definitions with appropriate type hints. The str mixin in WaitingSignalType enables direct serialization.


83-94: LGTM!

Pre-compiling regex patterns in __init__ is a good performance optimization. The configurable confidence_threshold provides flexibility.


96-142: LGTM!

The method correctly finds the highest-confidence match above the threshold and extracts relevant context. The early return for empty input is a good guard.


191-206: LGTM!

Clean wrapper providing a simpler boolean API for callers.


209-223: LGTM!

The module-level singleton and convenience function provide a clean API for consumers. The default confidence threshold of 0.7 is reasonable.
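To make the reviewed shape concrete, here is a condensed sketch of the pattern described above (pre-compiled regexes, configurable confidence threshold, module-level singleton with a convenience function). The specific patterns and confidence values are illustrative, not copied from output_parser.py:

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WaitingSignalType(str, Enum):  # str mixin enables direct serialization
    GIT_PUSH = "git_push"
    PR_CREATED = "pr_created"


@dataclass
class WaitingSignal:
    signal_type: WaitingSignalType
    confidence: float


class OutputParser:
    # Pre-compiled once; illustrative patterns only
    _PATTERNS = [
        (re.compile(r"git push|Writing objects:.*done"), WaitingSignalType.GIT_PUSH, 0.8),
        (re.compile(r"(pull|merge) request .*(created|opened)", re.I), WaitingSignalType.PR_CREATED, 0.9),
    ]

    def __init__(self, confidence_threshold: float = 0.7) -> None:
        self.confidence_threshold = confidence_threshold

    def parse(self, output: str) -> Optional[WaitingSignal]:
        if not output:  # early-return guard for empty input
            return None
        best: Optional[WaitingSignal] = None
        for pattern, sig_type, conf in self._PATTERNS:
            if conf >= self.confidence_threshold and pattern.search(output):
                if best is None or conf > best.confidence:
                    best = WaitingSignal(sig_type, conf)
        return best


_default_parser = OutputParser()


def detect_waiting_signal(output: str) -> Optional[WaitingSignal]:
    """Convenience wrapper over the module-level singleton."""
    return _default_parser.parse(output)
```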

frontend/src/features/tasks/contexts/taskContext.tsx (1)

137-160: LGTM! The refresh logic correctly handles WAITING tasks.

The existing hasIncompleteTasks check already treats WAITING as incomplete since it only excludes terminal states (COMPLETED, FAILED, CANCELLED, DELETE). The added comment on lines 146-147 clearly documents the rationale for periodic refresh of WAITING tasks that may be resumed by external webhooks.

frontend/src/features/settings/components/team-modes/index.ts (1)

25-33: LGTM! Async mode configuration is well-documented and correctly implemented.

The MODE_AGENT_FILTER correctly restricts async mode to ClaudeCode only, matching the PR description's single-bot scenario requirement. The comment clearly explains the use case.

backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py (2)

103-113: Downgrade ordering is correct - data migration before schema change.

The UPDATE to convert WAITING → PENDING (lines 105-107) correctly precedes the enum modification, preventing foreign key/constraint violations. This is the proper pattern for reversible enum migrations.


30-34: MySQL-specific syntax is appropriate for this MySQL-only project—no portability concern.

The project exclusively uses MySQL (verified by docker-compose.yml, .env.example, and requirements.txt which contain only pymysql and asyncmy drivers with no PostgreSQL or SQLite support). The migration's MODIFY COLUMN and ENUM syntax is correct for MySQL and aligns with the project's database architecture. The migration is well-structured with proper upgrade/downgrade logic and idempotency guards via IF NOT EXISTS/IF EXISTS.

Likely an incorrect or invalid review comment.

shared/status.py (1)

13-14: LGTM! WAITING status correctly added.

The new WAITING enum member is properly documented and follows the existing pattern. The comment clearly indicates its purpose for async mode external event handling.

backend/app/api/api.py (1)

16-20: LGTM! Webhook routers integrated correctly.

The import and registration of the three webhook routers follow the existing patterns in the codebase. The /webhooks prefix and consistent tagging align with the PR objectives for async team mode. Based on learnings, this follows the guideline of adding endpoints by creating routes in app/api/.

Also applies to: 38-41

frontend/src/types/api.ts (1)

120-128: LGTM! TaskStatus extended correctly.

The WAITING status is appropriately placed after RUNNING and before terminal states, aligning with the backend SubtaskStatus enum. The inline comment clarifies its purpose for async mode. As per coding guidelines, this correctly defines types in the src/types/ directory.

backend/app/models/subtask.py (2)

17-24: LGTM! SubtaskStatus enum extended correctly.

The WAITING status is appropriately positioned and follows the existing enum pattern.


58-63: Alembic migration exists and is properly structured.

The migration file ./backend/alembic/versions/2a3b4c5d6e7f_add_subtask_async_mode_fields.py has been successfully generated with all five async-mode columns correctly defined:

  • waiting_for (VARCHAR(50), nullable)
  • waiting_since (DATETIME, nullable)
  • waiting_timeout (INT, nullable)
  • resume_count (INT, default 0)
  • max_resume_count (INT, default 5)

The migration includes a proper downgrade path, updates the status ENUM to include 'WAITING', and adds a performance index on (status, waiting_for). All column definitions align with the model changes.

backend/app/schemas/subtask.py (3)

25-32: LGTM! Schema enum extended correctly.

The WAITING status mirrors the model enum and follows the existing pattern.


57-62: LGTM! Async-mode fields added to SubtaskBase.

The new fields have appropriate types and defaults that align with the model layer. The defaults match the configuration values (max_resume_count=5).


84-89: LGTM! Optional fields for SubtaskUpdate.

All async-mode fields are correctly marked as Optional to allow partial updates, following the existing pattern in this schema.

backend/app/api/endpoints/webhooks/github.py (2)

29-48: LGTM! Signature verification is correctly implemented.

The verify_github_signature function properly:

  • Uses hmac.compare_digest for timing-safe comparison
  • Validates the sha256= prefix
  • Logs a warning when no secret is configured
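For reference, the three properties above can be sketched as follows; this mirrors the described behavior rather than reproducing the file verbatim:

```python
import hashlib
import hmac
import logging

logger = logging.getLogger(__name__)


def verify_github_signature(payload: bytes, signature_header: str, secret: str) -> bool:
    """Verify an X-Hub-Signature-256 header against the raw request body."""
    if not secret:
        # No secret configured: skip verification (not recommended for production)
        logger.warning("GitHub webhook secret not configured, skipping verification")
        return True
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    # Timing-safe comparison
    return hmac.compare_digest(expected, signature_header)
```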

133-139: Note: Depends() in default arguments is an accepted FastAPI pattern.

Static analysis flags Depends(get_db) as B008, but this is the standard FastAPI dependency injection pattern and is safe to ignore here.

backend/app/api/endpoints/webhooks/gitlab.py (2)

160-168: Hardcoded waiting_for="ci_pipeline" may not match MR events.

The webhook always passes waiting_for="ci_pipeline", but Merge Request events might have subtasks waiting for "merge_request" or similar. Consider using event_info["event_type"] to match appropriately.

     # Try to resume any waiting subtasks
     result = await async_resume_service.resume_from_webhook(
         db=db,
         git_repo=event_info["repo"],
         branch_name=event_info["branch"],
         webhook_payload=event_info["raw_payload"],
-        waiting_for="ci_pipeline",
+        waiting_for=event_info["event_type"],  # or map to expected waiting_for values
         source="gitlab",
     )

93-112: Repo identifier is consistent across GitLab webhook types.

The original concern here is factually incorrect. Per GitLab's official webhook documentation, the project_name field in Job Hook payloads already includes the namespace in the format "namespace/project" (e.g., "gitlab-org/gitlab-test"), matching the path_with_namespace format used by Pipeline and Merge Request hooks.

The Job Hook payload structure is legitimately different from Pipeline/MR hooks in that it uses top-level fields (project_name, ref) rather than a nested project object, and the current implementation is correct for that structure. The remaining difference is a minor style inconsistency (different field names across hook types) that does not cause matching failures: the substring matching in resume_from_webhook (lines 165-169) works with both formats, since both use "namespace/project". No functional matching failures will occur.

backend/app/services/async_resume_service.py (1)

120-134: The composite index on (status, waiting_for) exists and is properly configured.

The migration 2a3b4c5d6e7f_add_subtask_async_mode_fields.py creates the index ix_subtasks_waiting_status on subtasks(status, waiting_for) at lines 64-66, which directly supports the filtering criteria in the query. The index will ensure efficient lookups for WAITING subtasks filtered by waiting_for.

Comment on lines +115 to +137
@router.get("/callback/token/{subtask_id}")
async def get_callback_url(
    subtask_id: int,
    db: Session = Depends(get_db),
):
    """
    Get the callback URL for a subtask.

    This is a utility endpoint to generate callback URLs for external systems.
    Should only be accessible to authorized users.
    """
    token = generate_callback_token(subtask_id)

    # Build callback URL
    # Note: In production, this should use the actual public URL
    base_url = settings.FRONTEND_URL.rstrip("/")
    callback_url = f"{base_url}/api/webhooks/callback/{subtask_id}/{token}"

    return {
        "subtask_id": subtask_id,
        "token": token,
        "callback_url": callback_url,
    }

⚠️ Potential issue | 🟡 Minor

Unused db parameter and missing authorization.

The db parameter is unused—remove it or use it to validate the subtask exists before generating a token. The docstring notes this endpoint "should only be accessible to authorized users," but no authorization is implemented.

 @router.get("/callback/token/{subtask_id}")
 async def get_callback_url(
     subtask_id: int,
-    db: Session = Depends(get_db),
 ):
     """
     Get the callback URL for a subtask.

     This is a utility endpoint to generate callback URLs for external systems.
-    Should only be accessible to authorized users.
+    TODO: Add authorization to restrict access to authorized users only.
     """
🧰 Tools
🪛 Ruff (0.14.6)

118-118: Unused function argument: db

(ARG001)


118-118: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🤖 Prompt for AI Agents
In backend/app/api/endpoints/webhooks/callback.py around lines 115 to 137, the
get_callback_url endpoint currently accepts an unused db parameter and lacks
authorization; either remove the db parameter or use it to look up and validate
that the subtask exists (and belongs to the requesting user) before generating a
token, and add an authentication/authorization dependency (e.g.,
Depends(get_current_user) or an admin check) to the endpoint signature so only
authorized users can call it; return a 404 if the subtask is not found or a 403
if the user is not permitted.
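On the token itself: the PR references generate_callback_token but its implementation isn't shown in this thread, so the following sketch assumes an HMAC over the subtask id keyed by a server-side secret (CALLBACK_SECRET is a made-up name). Whatever the real scheme is, verification should be timing-safe, matching the rest of this review:

```python
import hashlib
import hmac

CALLBACK_SECRET = "change-me"  # assumption: a server-side secret from settings


def generate_callback_token(subtask_id: int) -> str:
    """Derive a deterministic, unforgeable token for a subtask id."""
    return hmac.new(
        CALLBACK_SECRET.encode(), str(subtask_id).encode(), hashlib.sha256
    ).hexdigest()


def verify_callback_token(subtask_id: int, token: str) -> bool:
    # Timing-safe comparison, as recommended for the GitLab token check too
    return hmac.compare_digest(generate_callback_token(subtask_id), token)
```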

Comment on lines +107 to +128
    elif event_type == "pull_request":
        pr = payload.get("pull_request", {})
        action = payload.get("action")

        # Process PR events that might be relevant
        if action not in ["opened", "synchronize", "closed", "merged"]:
            return None

        repo = payload.get("repository", {})
        repo_full_name = repo.get("full_name", "")
        head_branch = pr.get("head", {}).get("ref", "")

        return {
            "repo": repo_full_name,
            "branch": head_branch,
            "status": "completed" if action in ["closed", "merged"] else "in_progress",
            "conclusion": "success" if pr.get("merged") else action,
            "event_type": "pull_request",
            "name": f"PR #{pr.get('number')}: {pr.get('title', '')}",
            "html_url": pr.get("html_url"),
            "raw_payload": payload,
        }

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

GitHub pull request webhook event actions documentation

💡 Result:

GitHub’s webhook "pull_request" event includes an action field that indicates what happened. Common action values (per GitHub docs) include:

  • opened, edited, closed, reopened
  • assigned, unassigned
  • review_requested, review_request_removed
  • labeled, unlabeled
  • synchronize
  • converted_to_draft, ready_for_review
  • locked, unlocked
  • milestoned, demilestoned
  • auto_merge_enabled, auto_merge_disabled
    (Exact sets and names may vary by product/version; see official docs below.)

Relevant documentation

  • Webhook event types and pull_request payload (lists pull_request action values and payload shape). [1]
  • Actions workflows: which pull_request activity types trigger workflows and how to restrict them with types. [2]

Sources
[1] GitHub — Webhook event types / PullRequestEvent payload.
[2] GitHub — Events that trigger workflows (pull_request activity types and workflow behavior).


🌐 Web query:

GitHub webhook pull request merged closed action payload

💡 Result:

Short answer: GitHub sends a pull_request webhook with action "closed" for both merges and non-merge closes; the payload's pull_request object contains merged (boolean), merged_at (timestamp or null), merge_commit_sha and merged_by to tell you if it was merged and by whom.

Important fields to check in the payload

  • action: "closed". [Check X-GitHub-Event header = "pull_request".] [1]
  • pull_request.merged: true => PR was merged; false => PR was closed without merge. [1]
  • pull_request.merged_at: timestamp when merged (or null). [1]
  • pull_request.merge_commit_sha: commit SHA of the merge (or null). [1]
  • pull_request.merged_by: user object for who merged (or null). [1]
  • number: PR number. [1]
  • repository, sender, installation, etc. (standard webhook fields). [1][2]

Example logic (pseudocode)

  • if payload.action == "closed" and payload.pull_request.merged == true → handle merged
  • else if payload.action == "closed" and payload.pull_request.merged == false → handle closed/unmerged

Sources:
[1] GitHub: Webhook events and payloads — pull_request (payload fields: action, pull_request.merged, merged_at, merge_commit_sha, merged_by).
[2] AWS CodeBuild docs (mapping: pull_request action "closed" + merged==true → merged event).


GitHub sends "closed" action with a merged boolean field, not a separate "merged" action—line 112's condition is dead code.

The verification confirms your concern is correct. GitHub's pull_request webhook always sends action: "closed" for merged PRs; the merge status is indicated by the pull_request.merged boolean field in the payload, not a separate action value. The condition on line 112 that includes "merged" in the allowed actions will never match an actual GitHub event.

While the code still works correctly because line 118 checks pr.get("merged") directly, the condition is misleading and should be:

if action not in ["opened", "synchronize", "closed"]:
    return None
🤖 Prompt for AI Agents
In backend/app/api/endpoints/webhooks/github.py around lines 107 to 128, the
allowed actions list incorrectly includes "merged" which GitHub never sends as
an action (merged PRs come as action "closed" with pull_request.merged=true);
update the action check to only include ["opened", "synchronize", "closed"] so
the dead "merged" branch is removed and behavior matches actual webhook
payloads.
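A small sketch of the corrected extraction logic (function name and returned fields condensed for illustration — the real handler returns more keys):

```python
from typing import Optional


def extract_pr_event(payload: dict) -> Optional[dict]:
    """Extract status/conclusion from a pull_request webhook payload.

    Note: "merged" is never an action value; merged PRs arrive as
    action == "closed" with pull_request.merged == True.
    """
    pr = payload.get("pull_request", {})
    action = payload.get("action")
    if action not in ("opened", "synchronize", "closed"):
        return None
    merged = bool(pr.get("merged"))
    return {
        "status": "completed" if action == "closed" else "in_progress",
        "conclusion": "success" if merged else action,
    }
```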

Comment on lines +26 to +33
def verify_gitlab_token(token: Optional[str], expected_token: str) -> bool:
    """Verify GitLab webhook token."""
    if not expected_token:
        # If no token configured, skip verification (not recommended for production)
        logger.warning("GitLab webhook token not configured, skipping verification")
        return True

    return token == expected_token

⚠️ Potential issue | 🟠 Major

Use timing-safe comparison for token verification.

Direct string comparison with == is vulnerable to timing attacks. Use secrets.compare_digest for constant-time comparison, consistent with the callback.py implementation.

+import secrets
+
 def verify_gitlab_token(token: Optional[str], expected_token: str) -> bool:
     """Verify GitLab webhook token."""
     if not expected_token:
         # If no token configured, skip verification (not recommended for production)
         logger.warning("GitLab webhook token not configured, skipping verification")
         return True

-    return token == expected_token
+    if token is None:
+        return False
+    return secrets.compare_digest(token, expected_token)
🤖 Prompt for AI Agents
In backend/app/api/endpoints/webhooks/gitlab.py around lines 26 to 33, the token
comparison uses a direct == which is vulnerable to timing attacks; replace it
with a timing-safe comparison using secrets.compare_digest. Ensure you import
the secrets module at top of the file, handle the case where token may be None
by comparing empty string or returning False appropriately, and return the
result of secrets.compare_digest(token, expected_token) instead of the direct
equality check.

Comment on lines +96 to +101
    # Async team mode configuration
    ASYNC_MODE_ENABLED: bool = True
    DEFAULT_MAX_RESUME_COUNT: int = 5
    DEFAULT_WAITING_TIMEOUT: int = 3600  # Default timeout in seconds, 0 means no timeout
    GITHUB_WEBHOOK_SECRET: str = ""
    GITLAB_WEBHOOK_TOKEN: str = ""

⚠️ Potential issue | 🟠 Major

Security concern about empty webhook secret defaults is valid and confirmed.

The code allows GitHub and GitLab webhooks to skip signature verification when secrets are not configured. Both github.py (lines 33–36) and gitlab.py (lines 28–31) check for empty secrets and return True (accepting the webhook) with only a logger.warning().

Verification confirms:

  • GITHUB_WEBHOOK_SECRET and GITLAB_WEBHOOK_TOKEN default to empty strings in config.py (lines 100–101)
  • No startup validation enforces secrets in production mode
  • No deployment documentation addresses this configuration requirement

The original suggestion to document this prominently or enforce in production is sound. Consider adding a startup check that logs a critical warning or fails if ENVIRONMENT == "production" and either secret is empty.

🤖 Prompt for AI Agents
In backend/app/core/config.py around lines 96 to 101, the GITHUB_WEBHOOK_SECRET
and GITLAB_WEBHOOK_TOKEN default to empty strings which allows webhook signature
checks to be skipped; add a startup validation that checks if ENVIRONMENT ==
"production" and if either secret/token is empty then log a critical error and
abort startup (raise SystemExit or similar) to prevent running without secrets;
for non-production environments log a prominent warning that secrets are missing
and point to documentation; update the deployment docs to state that both
webhook secrets must be set in production.
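One possible shape for such a startup check, kept standalone for illustration (the function name and an ENVIRONMENT setting are assumptions; the real project may wire this into a pydantic validator or the FastAPI lifespan hook instead):

```python
import logging

logger = logging.getLogger(__name__)


def validate_webhook_secrets(environment: str, github_secret: str, gitlab_token: str) -> None:
    """Fail fast in production if webhook secrets are unset; warn elsewhere."""
    missing = [
        name
        for name, value in (
            ("GITHUB_WEBHOOK_SECRET", github_secret),
            ("GITLAB_WEBHOOK_TOKEN", gitlab_token),
        )
        if not value
    ]
    if not missing:
        return
    if environment == "production":
        logger.critical("Webhook secrets missing in production: %s", missing)
        raise SystemExit(1)  # abort startup rather than run unverified webhooks
    logger.warning(
        "Webhook secrets not configured (%s); signature checks will be skipped", missing
    )
```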

Comment on lines +165 to +170
            if (
                git_repo.lower() == task_git_repo.lower()
                or git_repo.lower() in task_git_repo.lower()
                or task_git_repo.lower() in git_repo.lower()
            ):
                if branch_name.lower() == task_branch.lower():

⚠️ Potential issue | 🟠 Major

Overly permissive repository matching could cause false positives.

The substring matching logic (git_repo.lower() in task_git_repo.lower() and vice versa) could incorrectly match unrelated repositories. For example, "org/app" would match "org/app-extended" or "another-org/app".

Consider stricter matching or normalizing repo identifiers:

-            # Match by repo and branch (case-insensitive)
-            if (
-                git_repo.lower() == task_git_repo.lower()
-                or git_repo.lower() in task_git_repo.lower()
-                or task_git_repo.lower() in git_repo.lower()
-            ):
+            # Match by repo (case-insensitive, exact match on repo name)
+            # Normalize by extracting owner/repo or just repo name
+            webhook_repo_normalized = git_repo.lower().rstrip("/")
+            task_repo_normalized = task_git_repo.lower().rstrip("/")
+            
+            # Allow matching with or without host prefix (e.g., github.com/owner/repo vs owner/repo)
+            webhook_repo_parts = webhook_repo_normalized.split("/")[-2:]  # last two segments
+            task_repo_parts = task_repo_normalized.split("/")[-2:]
+            
+            if webhook_repo_parts == task_repo_parts:
                 if branch_name.lower() == task_branch.lower():
🤖 Prompt for AI Agents
In backend/app/services/async_resume_service.py around lines 165 to 170, the
current repo-matching uses substring checks which can produce false positives
(e.g., "org/app" matching "org/app-extended"); replace the permissive substring
logic with strict normalized comparisons: normalize both repo identifiers to a
canonical "owner/repo" form (trim whitespace and slashes, lowercase, and if URLs
strip .git and host prefix), parse out owner and repo name and compare owner ==
owner and repo == repo (or fallback to full equality after normalization);
remove the git_repo in task_git_repo and vice-versa substring checks so only
exact normalized matches (plus the existing branch_name lowercase comparison)
determine a match.
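A self-contained normalization helper along the lines suggested above (a sketch, not the project's code — it assumes repo identifiers arrive as "owner/repo", an HTTPS URL, or an SSH-style "git@host:owner/repo" string):

```python
def normalize_repo(identifier: str) -> str:
    """Reduce a repo identifier to a canonical lowercase owner/repo form."""
    repo = identifier.strip().lower().rstrip("/")
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    if "://" in repo:  # https://host/owner/repo -> host/owner/repo
        repo = repo.split("://", 1)[1]
    repo = repo.removeprefix("git@")  # git@host:owner/repo -> host:owner/repo
    repo = repo.replace(":", "/")
    parts = [p for p in repo.split("/") if p]
    return "/".join(parts[-2:])  # keep the last two segments: owner/repo
```

With exact comparison of normalized values, "org/app" no longer matches "org/app-extended" or "another-org/app".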

Comment on lines +248 to +273
        if subtask.resume_count >= subtask.max_resume_count:
            logger.warning(
                f"Subtask {subtask.id} reached max resume count ({subtask.max_resume_count})"
            )

            # Mark as failed
            subtask.status = SubtaskStatus.FAILED
            subtask.error_message = f"Max resume count ({subtask.max_resume_count}) reached"
            subtask.waiting_for = None
            subtask.waiting_since = None

            db.commit()

            # Send notification
            await self._send_notification(
                subtask=subtask,
                event="subtask.max_resume_reached",
                description=f"Subtask failed: max resume count reached",
            )

            return {
                "success": False,
                "resumed": False,
                "subtask_id": subtask.id,
                "error": "Max resume count reached",
            }

⚠️ Potential issue | 🔴 Critical

Verification confirms the race condition concern is legitimate:

  1. No locking mechanisms exist - The codebase has zero database-level locking (with_for_update, FOR UPDATE, select_for_update)

  2. Subtask model has no optimistic locking - No version fields or similar mechanisms for concurrency control

  3. Concurrent requests are definitely possible - The resume methods are called from webhook endpoints (GitHub, GitLab, callback) which are HTTP endpoints that can receive multiple concurrent requests from external systems

  4. The race condition window is real - Between querying the subtask without locking and checking resume_count >= max_resume_count, another concurrent request could load the same subtask and both would pass the check

The vulnerability is: Two concurrent requests both load a subtask with resume_count=4, max_resume_count=5. Both pass the check (4 >= 5 is false), both increment to 5, and both succeed when only the first should have succeeded.


Add database-level locking to prevent race condition on max_resume_count check.

Use db.query(Subtask).filter(...).with_for_update().first() when retrieving the subtask in resume_subtask_by_id (line 210) and resume_from_webhook (line 126) to acquire an exclusive database lock. This ensures only one request can modify the subtask's resume_count at a time. Alternatively, implement an atomic database update using a trigger or stored procedure that both checks the limit and increments in a single transaction.

🧰 Tools
🪛 Ruff (0.14.6)

265-265: f-string without any placeholders

Remove extraneous f prefix

(F541)

🤖 Prompt for AI Agents
In backend/app/services/async_resume_service.py around lines 248-273, the
max-resume-count branch can suffer a race because the subtask is read/checked
without a DB lock; change the retrieval logic in resume_subtask_by_id (around
line 210) and resume_from_webhook (around line 126) to fetch the subtask inside
the same transaction using a SELECT ... FOR UPDATE (e.g.,
db.query(Subtask).filter(...).with_for_update().first()), re-check resume_count
>= max_resume_count after acquiring the lock, perform the increment/status
changes and db.commit() while holding the lock, and return the response; this
ensures only one concurrent request can pass the check and modify resume_count
atomically.
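An alternative to SELECT ... FOR UPDATE is to push the check-and-increment into a single atomic UPDATE and inspect the affected-row count; a single statement cannot interleave with a concurrent copy of itself. This sketch uses sqlite3 purely for demonstration — the same WHERE-guarded UPDATE works in MySQL via SQLAlchemy:

```python
import sqlite3


def try_claim_resume(conn: sqlite3.Connection, subtask_id: int) -> bool:
    """Atomically increment resume_count only while below max_resume_count.

    Returns True if this caller won the claim; False if the limit was
    already reached (or the subtask does not exist).
    """
    cur = conn.execute(
        "UPDATE subtasks SET resume_count = resume_count + 1 "
        "WHERE id = ? AND resume_count < max_resume_count",
        (subtask_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

Two concurrent callers at resume_count=4 with max_resume_count=5 cannot both succeed: the guard and increment happen in one statement, so only one UPDATE matches.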

import { Bot, Team } from '@/types/api';

export type TeamMode = 'solo' | 'pipeline' | 'route' | 'coordinate' | 'collaborate';
export type TeamMode = 'solo' | 'pipeline' | 'route' | 'coordinate' | 'collaborate' | 'async';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove duplicate TeamMode type definition from index.ts.

TeamMode is defined in types.ts (line 8) and then re-exported via export * from './types' in index.ts (line 10). However, index.ts also defines TeamMode locally (line 12), creating a redundant duplicate. Since both definitions are identical, this doesn't cause a runtime conflict, but it violates the DRY principle and creates a maintenance burden—if one definition is updated in the future without updating the other, they could diverge.

Remove the TeamMode definition from index.ts (line 12) and rely solely on the re-export from types.ts.

🤖 Prompt for AI Agents
In frontend/src/features/settings/components/team-modes/index.ts around lines 10
to 12, remove the duplicate local TeamMode type definition (the one at line 12)
and rely on the existing export * from './types' which already provides TeamMode
from types.ts; simply delete the redundant type declaration and ensure there are
no other local references requiring modification so the re-exported type from
types.ts is used everywhere.
