Skip to content

Conversation

@NicolasBelissent
Copy link
Contributor

@NicolasBelissent NicolasBelissent commented Oct 16, 2025

Summary by cubic

Adds a feedback collection tool and automatic implicit feedback to LangSmith so teams can capture both explicit user feedback and behavior signals from tool calls with minimal setup.

  • New Features

    • Added meta_collect_tool_feedback with strict validation and ToolSet auto-inclusion; supports custom base_url.
    • Implicit feedback to LangSmith: session tracking, quick refinement/task switch detection, suitability scoring, and safe payload sanitization.
    • Auto-init via LANGSMITH_API_KEY; configurable with STACKONE_IMPLICIT_FEEDBACK_ENABLED, STACKONE_IMPLICIT_FEEDBACK_PROJECT, STACKONE_IMPLICIT_FEEDBACK_TAGS; programmatic configure_implicit_feedback available.
    • Tool.execute and Tool.call accept options to attach feedback_session_id, feedback_user_id, feedback_metadata without affecting API requests.
  • Migration

    • If you subclass StackOneTool, update execute(...) to include the options parameter and pass it to super().execute(..., options=options).

NicolasBelissent and others added 2 commits October 14, 2025 17:41
Add meta_collect_tool_feedback tool for collecting user feedback about
StackOne tool performance. The tool is automatically included in the
toolset and can be invoked by AI agents after user consent.

Features:
- Feedback collection tool with Pydantic validation
- Automatic whitespace trimming and input cleaning
- Support for custom base URLs
- Integration with StackOneToolSet
- OpenAI and LangChain format support

Also includes implicit feedback system for automatic behavioral feedback
to LangSmith:
- Behavior analyzer for detecting quick refinements
- Session tracking for tool call history
- LangSmith client integration
- Configurable via environment variables

Breaking changes:
- Updated StackOneTool.execute() signature to include options parameter
- Updated meta tools execute methods to support options parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Copilot AI review requested due to automatic review settings October 16, 2025 13:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds implicit behavioral feedback instrumentation (LangSmith integration) and a user feedback collection tool to the SDK.

  • Introduces implicit feedback manager, analyzer, session tracking, sanitization utilities, and automatic dispatch from StackOneTool.execute.
  • Adds dedicated feedback meta tool (meta_collect_tool_feedback) and wires it into toolset plus README documentation.
  • Extends tool execution API to accept options for feedback context propagation.

Reviewed Changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test_implicit_feedback.py Adds tests for behavior analyzer and implicit feedback manager with stub client.
tests/test_feedback.py Adds comprehensive validation, execution, integration, and live submission tests for feedback tool and implicit feedback options.
stackone_ai/toolset.py Injects feedback tool into loaded tools.
stackone_ai/models.py Adds execution options handling and implicit feedback dispatch to StackOneTool.execute/call.
stackone_ai/meta_tools.py Updates meta tools' execute signature to include options.
stackone_ai/implicit_feedback/utils.py Adds sanitization helpers for payloads.
stackone_ai/implicit_feedback/session.py Adds session tracker and event derivation logic.
stackone_ai/implicit_feedback/manager.py Implements singleton manager coordinating analysis and LangSmith publishing.
stackone_ai/implicit_feedback/langsmith_client.py Optional LangSmith client wrapper.
stackone_ai/implicit_feedback/data.py Defines dataclasses for tool call records, quality signals, and events.
stackone_ai/implicit_feedback/analyzer.py Implements behavior analysis (quick refinement, task switch, suitability score).
stackone_ai/implicit_feedback/init.py Exposes implicit feedback public API.
stackone_ai/feedback/tool.py Implements feedback collection tool with validation and integration.
stackone_ai/feedback/init.py Exposes feedback tool factory.
stackone_ai/init.py Re-exports implicit feedback configuration functions.
README.md Documents implicit feedback system, options usage, and feedback collection tool.

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

def execute(
self, arguments: str | JsonDict | None = None, *, options: JsonDict | None = None
) -> JsonDict:
return execute_filter(arguments)
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overrides bypass StackOneTool.execute, so implicit feedback instrumentation and option propagation (feedback_session_id, etc.) are lost for meta tool calls. Either remove the override and let base execute run (after adapting underlying functions to accept the parsed kwargs), or forward options and manually invoke the implicit feedback manager similar to StackOneTool.execute before returning.

Suggested change
return execute_filter(arguments)
# Forward options and manually invoke implicit feedback instrumentation if available
result = execute_filter(arguments)
# If feedback instrumentation is available, invoke it here
# For example:
# if hasattr(self, "_implicit_feedback_manager") and options is not None:
# self._implicit_feedback_manager.handle_feedback(self, arguments, result, options)
# Propagate options if needed (e.g., feedback_session_id)
# If options are needed in the result, add them here
return result

Copilot uses AI. Check for mistakes.
Comment on lines +277 to 280
def execute(
self, arguments: str | JsonDict | None = None, *, options: JsonDict | None = None
) -> JsonDict:
return execute_tool(arguments)
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overrides bypass StackOneTool.execute, so implicit feedback instrumentation and option propagation (feedback_session_id, etc.) are lost for meta tool calls. Either remove the override and let base execute run (after adapting underlying functions to accept the parsed kwargs), or forward options and manually invoke the implicit feedback manager similar to StackOneTool.execute before returning.

Copilot uses AI. Check for mistakes.
Comment on lines 49 to 92
def execute(
self, arguments: str | JsonDict | None = None, *, options: JsonDict | None = None
) -> JsonDict:
"""
Execute the feedback tool with enhanced validation.
Args:
arguments: Tool arguments as string or dict
options: Execution options
Returns:
Response from the API
Raises:
StackOneError: If validation or API call fails
"""
try:
# Parse input
if isinstance(arguments, str):
raw_params = json.loads(arguments)
else:
raw_params = arguments or {}

# Validate with Pydantic
parsed_params = FeedbackInput(**raw_params)

# Build validated request body
validated_arguments = {
"feedback": parsed_params.feedback,
"account_id": parsed_params.account_id,
"tool_names": parsed_params.tool_names,
}

# Use the parent execute method with validated arguments
return super().execute(validated_arguments, options=options)

except json.JSONDecodeError as exc:
raise StackOneError(f"Invalid JSON in arguments: {exc}") from exc
except ValueError as exc:
raise StackOneError(f"Validation error: {exc}") from exc
except Exception as error:
if isinstance(error, StackOneError):
raise
raise StackOneError(f"Error executing feedback tool: {error}") from error
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When validation fails (exceptions before calling super().execute), implicit feedback instrumentation in StackOneTool.execute is skipped, so failed attempts are not recorded. To preserve consistent telemetry, move validation ahead of building arguments but always delegate to a wrapper that calls super().execute within a try/finally that dispatches feedback (or re-use StackOneTool.execute for JSON parsing and only apply additional pydantic validation).

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +98
configure_implicit_feedback(
api_key="/path/to/langsmith.key",
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example passes a filesystem path as api_key; configure_implicit_feedback expects the actual LangSmith API key string, not a file path. Replace '/path/to/langsmith.key' with an API key value or adjust the example to show reading the key file first.

Suggested change
configure_implicit_feedback(
api_key="/path/to/langsmith.key",
# If your API key is stored in a file, read it first:
with open("/path/to/langsmith.key", "r") as f:
langsmith_api_key = f.read().strip()
configure_implicit_feedback(
api_key=langsmith_api_key,

Copilot uses AI. Check for mistakes.
Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 issues found across 17 files

Prompt for AI agents (all 6 issues)

Understand the root cause of the following 6 issues and fix them.


<file name="stackone_ai/implicit_feedback/utils.py">

<violation number="1" location="stackone_ai/implicit_feedback/utils.py:28">
Building list(value.items()) materializes the entire mapping before truncation, so large payloads are still fully copied; iterate lazily (e.g., via itertools.islice) to honor MAX_ITEMS.</violation>

<violation number="2" location="stackone_ai/implicit_feedback/utils.py:39">
Converting sequences with list(value) copies all elements before truncation, negating MAX_ITEMS and causing large-memory copies; iterate lazily instead.</violation>
</file>

<file name="stackone_ai/implicit_feedback/analyzer.py">

<violation number="1" location="stackone_ai/implicit_feedback/analyzer.py:43">
This guard prevents quick refinement detection whenever session_id is missing, even though the tracker keeps history for user-level sessions via the user_id fallback. Drop the session_id check so quick refinements are still surfaced for those records.</violation>
</file>

<file name="stackone_ai/meta_tools.py">

<violation number="1" location="stackone_ai/meta_tools.py:197">
Adding options to MetaExecuteTool.execute without forwarding it causes any supplied feedback metadata to be discarded instead of reaching the underlying tool or implicit feedback logging. Please propagate options through execute_tool and into tool.execute.</violation>
</file>

<file name="stackone_ai/models.py">

<violation number="1" location="stackone_ai/models.py:128">
Feedback keys must be removed from kwargs even when options already defines them; otherwise they still get forwarded to the API request and can break the call.</violation>
</file>

<file name="stackone_ai/implicit_feedback/manager.py">

<violation number="1" location="stackone_ai/implicit_feedback/manager.py:209">
create_run returns None, so run_id is always falsy and this early return prevents publishing any implicit feedback. Please assign a stable run_id yourself (e.g., include record.call_id in the payload and reuse it) before dispatching feedback.</violation>
</file>

React with 👍 or 👎 to teach cubic. Mention @cubic-dev-ai to give feedback, ask questions, or re-run the review.

return sanitized

if isinstance(value, Sequence) and not isinstance(value, (bytes, bytearray, str)):
sanitized_sequence = [sanitize_value(item, depth=depth + 1) for item in list(value)[:MAX_ITEMS]]
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converting sequences with list(value) copies all elements before truncation, negating MAX_ITEMS and causing large-memory copies; iterate lazily instead.

Prompt for AI agents
Address the following comment on stackone_ai/implicit_feedback/utils.py at line 39:

<comment>Converting sequences with list(value) copies all elements before truncation, negating MAX_ITEMS and causing large-memory copies; iterate lazily instead.</comment>

<file context>
@@ -0,0 +1,49 @@
+        return sanitized
+
+    if isinstance(value, Sequence) and not isinstance(value, (bytes, bytearray, str)):
+        sanitized_sequence = [sanitize_value(item, depth=depth + 1) for item in list(value)[:MAX_ITEMS]]
+        if len(value) &gt; MAX_ITEMS:
+            sanitized_sequence.append(&quot;&lt;truncated&gt;&quot;)
</file context>

✅ Addressed in 81c14bc


if isinstance(value, Mapping):
sanitized: dict[str, Any] = {}
for key, item in list(value.items())[:MAX_ITEMS]:
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building list(value.items()) materializes the entire mapping before truncation, so large payloads are still fully copied; iterate lazily (e.g., via itertools.islice) to honor MAX_ITEMS.

Prompt for AI agents
Address the following comment on stackone_ai/implicit_feedback/utils.py at line 28:

<comment>Building list(value.items()) materializes the entire mapping before truncation, so large payloads are still fully copied; iterate lazily (e.g., via itertools.islice) to honor MAX_ITEMS.</comment>

<file context>
@@ -0,0 +1,49 @@
+
+    if isinstance(value, Mapping):
+        sanitized: dict[str, Any] = {}
+        for key, item in list(value.items())[:MAX_ITEMS]:
+            key_str = str(key)
+            if key_str.lower() in SENSITIVE_KEYS:
</file context>

✅ Addressed in 81c14bc

def _detect_quick_refinement(
self, history: Sequence[ToolCallRecord], current: ToolCallRecord
) -> tuple[bool, float | None]:
if not current.session_id or not history:
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guard prevents quick refinement detection whenever session_id is missing, even though the tracker keeps history for user-level sessions via the user_id fallback. Drop the session_id check so quick refinements are still surfaced for those records.

Prompt for AI agents
Address the following comment on stackone_ai/implicit_feedback/analyzer.py at line 43:

<comment>This guard prevents quick refinement detection whenever session_id is missing, even though the tracker keeps history for user-level sessions via the user_id fallback. Drop the session_id check so quick refinements are still surfaced for those records.</comment>

<file context>
@@ -0,0 +1,83 @@
+    def _detect_quick_refinement(
+        self, history: Sequence[ToolCallRecord], current: ToolCallRecord
+    ) -&gt; tuple[bool, float | None]:
+        if not current.session_id or not history:
+            return False, None
+
</file context>

✅ Addressed in 81c14bc


def execute(self, arguments: str | JsonDict | None = None) -> JsonDict:
def execute(
self, arguments: str | JsonDict | None = None, *, options: JsonDict | None = None
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding options to MetaExecuteTool.execute without forwarding it causes any supplied feedback metadata to be discarded instead of reaching the underlying tool or implicit feedback logging. Please propagate options through execute_tool and into tool.execute.

Prompt for AI agents
Address the following comment on stackone_ai/meta_tools.py at line 197:

<comment>Adding options to MetaExecuteTool.execute without forwarding it causes any supplied feedback metadata to be discarded instead of reaching the underlying tool or implicit feedback logging. Please propagate options through execute_tool and into tool.execute.</comment>

<file context>
@@ -193,7 +193,9 @@ def __init__(self) -&gt; None:
 
-        def execute(self, arguments: str | JsonDict | None = None) -&gt; JsonDict:
+        def execute(
+            self, arguments: str | JsonDict | None = None, *, options: JsonDict | None = None
+        ) -&gt; JsonDict:
             return execute_filter(arguments)
</file context>
Fix with Cubic

merged_params = dict(params)
feedback_options = dict(options or {})
for key in cls._FEEDBACK_OPTION_KEYS:
if key in merged_params and key not in feedback_options:
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback keys must be removed from kwargs even when options already defines them; otherwise they still get forwarded to the API request and can break the call.

Prompt for AI agents
Address the following comment on stackone_ai/models.py at line 128:

<comment>Feedback keys must be removed from kwargs even when options already defines them; otherwise they still get forwarded to the API request and can break the call.</comment>

<file context>
@@ -110,6 +120,15 @@ def __init__(
+        merged_params = dict(params)
+        feedback_options = dict(options or {})
+        for key in cls._FEEDBACK_OPTION_KEYS:
+            if key in merged_params and key not in feedback_options:
+                feedback_options[key] = merged_params.pop(key)
+        return merged_params, feedback_options
</file context>

✅ Addressed in d7dbf5c

if isinstance(run, Mapping):
run_id = run.get("id") or run.get("run_id") or run_id

if not run_id:
Copy link

@cubic-dev-ai cubic-dev-ai bot Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_run returns None, so run_id is always falsy and this early return prevents publishing any implicit feedback. Please assign a stable run_id yourself (e.g., include record.call_id in the payload and reuse it) before dispatching feedback.

Prompt for AI agents
Address the following comment on stackone_ai/implicit_feedback/manager.py at line 209:

<comment>create_run returns None, so run_id is always falsy and this early return prevents publishing any implicit feedback. Please assign a stable run_id yourself (e.g., include record.call_id in the payload and reuse it) before dispatching feedback.</comment>

<file context>
@@ -0,0 +1,226 @@
+        if isinstance(run, Mapping):
+            run_id = run.get(&quot;id&quot;) or run.get(&quot;run_id&quot;) or run_id
+
+        if not run_id:
+            return
+
</file context>

✅ Addressed in 81c14bc

- Add feedback tool that sends data to API endpoint
- No LangSmith integration - just API calls
- Simple validation with Pydantic
- Auto-included in StackOneToolSet
- 6 focused tests covering validation, execution, and integration
- Add feedback tool that sends data to API endpoint
- No LangSmith integration - just API calls
- Simple validation with Pydantic
- Auto-included in StackOneToolSet
- 6 focused tests covering validation, execution, and integration
- Remove implicit feedback imports from __init__.py
- Fix base class call method to not pass options parameter
- All 6 feedback tests now passing
- Remove implicit feedback imports and unreachable code
- Fix execute method signature to match base class
- All mypy checks now pass
- Remove unused start_time, call_params, end_time variables
- All linting checks now pass (Ruff + Mypy)
@NicolasBelissent
Copy link
Contributor Author

@claude review this pr

- Allow account_id to be string or list of strings
- Send same feedback to each account individually
- Return combined results with success/failure counts
- Add comprehensive tests for validation and execution
- All 9 tests passing, linting clean
- Adjust formatting of tool.execute calls in test_feedback.py for consistency
- Enhance code clarity by aligning dictionary entries
- All tests remain passing
- Split large validation test into focused, single-purpose tests
- Consolidate execution tests to avoid duplication
- Combine multiple account success/error scenarios into one test
- Remove redundant integration test class
- All 10 tests still passing, better organization
Copy link

@willleeney willleeney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@NicolasBelissent NicolasBelissent merged commit 9179918 into main Oct 17, 2025
5 checks passed
@NicolasBelissent NicolasBelissent deleted the feat/implicit-feedback-tool branch October 17, 2025 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants