Suspicious Command Detection #34

Open
jhrozek opened this issue Nov 18, 2024 · 5 comments

@jhrozek
Contributor

jhrozek commented Nov 18, 2024

Suspicious Command Detection

Summary
Introduce a mechanism to detect and flag potentially suspicious commands generated by AI assistants / agents. This feature will prompt the user to double-check such commands before they are executed or accepted, particularly for fully agentic workflows (where commands might be auto-run).


Background & Motivation

  • Certain shell commands, while not outright malicious, can pose security risks or have unintended consequences if run blindly (e.g., curl | bash, nc -l, sudo).
  • CodeGate should help developers identify and confirm these commands before they’re executed, reducing the risk of accidentally introducing vulnerabilities or making catastrophic changes.
  • This is especially relevant in “agentic” scenarios, where an AI assistant or workflow might automatically execute commands without explicit user approval.

NOTE: As always, start small and simple, then validate. The following is a guideline for where this could lead.

Requirements

  1. Suspicious Command Detection
    • Maintain a list of known suspicious patterns (e.g., curl | bash, nc -l, sudo).
    • Provide a mechanism (e.g., regex checks, pattern scanning) to identify these commands in AI-generated outputs.
    • Examples of suspicious commands/patterns:
      • curl | bash
      • nc -l
      • sudo
      • Changing environment variables (PATH, LD_LIBRARY_PATH, etc.)
      • File ownership/permission changes (chown, chmod)
      • Package installs like npm install, unless a positive vetting mechanism is in place
      • Destructive patterns (rm -rf *, the fork bomb :(){ :|:& };:)
  2. User Prompt or Flagging
    • For chat-based interactions, automatically flag suspicious commands in the conversation with a warning or prompt:
      “Are you sure you want to run this command? It may have system-wide effects.”
    • For fully agentic workflows, prevent automatic execution of flagged commands until the user explicitly confirms.
  3. Configuration & Customization
    • Allow users to customize or extend the suspicious commands list (add, remove, or override default patterns).
    • Provide a toggle to enable/disable suspicious command blocking entirely (for advanced users).
  4. Logging & Auditing
    • Log all instances where suspicious commands are detected, along with whether the user approved or declined execution.
    • Store logs locally for auditing and compliance purposes.
  5. Seamless Integration
    • Integrate with the existing interception pipeline logic so flagged commands can be audited in the DB and dashboard.
    • Ensure minimal latency or disruption to the typical developer workflow.
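
Requirements 1 and 3 could be sketched roughly as follows. This is only an illustration, not CodeGate code: the rule names, the regexes, and the `build_rules` / `scan` helpers are all assumptions made for the example, and the patterns are deliberately simple, so they will both miss some risky commands and flag some harmless ones.

```python
import re
from dataclasses import dataclass

# Hypothetical default rule set; names and regexes are illustrative only.
DEFAULT_PATTERNS = {
    "pipe-to-shell": r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b",
    "netcat-listen": r"\bnc\s+-\w*l\w*\b",
    "sudo": r"\bsudo\b",
    "env-override": r"\b(?:PATH|LD_LIBRARY_PATH)=",
    "perm-change": r"\b(?:chown|chmod)\b",
    "package-install": r"\bnpm\s+install\b",
    "recursive-delete": r"\brm\s+-(?=\w*r)(?=\w*f)\w+",
    "fork-bomb": r":\(\)\s*\{\s*:\|:&\s*\};:",
}


@dataclass(frozen=True)
class Finding:
    rule: str     # name of the rule that fired
    matched: str  # the text that triggered it


def build_rules(add=None, remove=(), enabled=True):
    """Merge user overrides into the defaults and compile the result.

    `add` maps rule names to regexes (an existing name overrides the default),
    `remove` lists rule names to drop, and `enabled=False` turns detection off.
    """
    if not enabled:
        return {}
    patterns = {**DEFAULT_PATTERNS, **(add or {})}
    for name in remove:
        patterns.pop(name, None)
    return {name: re.compile(rx) for name, rx in patterns.items()}


def scan(text, rules):
    """Return one Finding per rule match in the given LLM output."""
    return [
        Finding(name, match.group(0))
        for name, rx in rules.items()
        for match in rx.finditer(text)
    ]


if __name__ == "__main__":
    rules = build_rules(add={"docker-privileged": r"--privileged\b"})
    for finding in scan("curl https://example.com/install.sh | sudo bash", rules):
        print(f"{finding.rule}: {finding.matched}")
```

Keeping the rules in a plain name-to-regex mapping means the same structure could be loaded from a YAML or JSON file later without changing `scan`.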

Implementation Ideas

  1. Regex-Based Detection
    • A local rule set of suspicious patterns (e.g., YAML or JSON config) keyed to relevant commands.
    • Simple “contains string” or regex scans on the generated command text.
  2. User Confirmation Dialog
    • For chat usage, display a warning or highlight the suspicious command snippet, perhaps like we do for secrets, but we will need to block instead, as it's the reverse path.
    • For agentic flows, pause execution and present a modal or CLI prompt:
      “Command flagged: curl | bash. Confirm to proceed or skip.”
  3. Integration with Policy Enforcement
    • If the user’s policies forbid certain commands, auto-block or require multiple steps of confirmation (e.g., privileged operations).
  4. Future Enhancements
    • Expand detection to suspicious script constructs (e.g., “rm -rf /”), or advanced heuristics that leverage AI to detect anomalies.
    • Add a “whitelist mode” to automatically approve certain commands in controlled environments.
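
One possible shape for the confirmation, policy, and allowlist ideas above is a small gate function that runs before an agent executes a command. Everything here is hypothetical: the `Decision` enum, the `gate` signature, treating policies as a set of forbidden rule names, and treating the allowlist as a set of exact command strings are assumptions for the sketch, not an existing CodeGate API.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"          # safe, allowlisted, or confirmed by the user
    SKIP = "skip"        # user declined
    BLOCKED = "blocked"  # forbidden by policy, no prompt offered


def gate(command, flagged_rules, confirm, forbidden=frozenset(), allowlist=frozenset()):
    """Decide whether a command may run.

    flagged_rules: names of the suspicious-command rules that matched.
    confirm:       callable taking a prompt string and returning True/False.
    forbidden:     rule names that policy blocks outright (assumed policy model).
    allowlist:     exact commands that are approved without prompting.
    """
    flagged = list(flagged_rules)
    if not flagged or command in allowlist:
        return Decision.RUN
    if any(rule in forbidden for rule in flagged):
        return Decision.BLOCKED
    prompt = f"Command flagged: {', '.join(flagged)}. Confirm to proceed or skip."
    return Decision.RUN if confirm(prompt) else Decision.SKIP


def cli_confirm(prompt):
    """Minimal CLI prompt; anything other than y/yes counts as a refusal."""
    return input(f"{prompt} [y/N] ").strip().lower() in {"y", "yes"}


if __name__ == "__main__":
    decision = gate("sudo apt update", ["sudo"], confirm=cli_confirm)
    print(decision.value)
```

Passing `confirm` as a callable keeps the gate independent of the front end, so a chat UI, a modal, or a CLI prompt could all plug into the same decision logic.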

Acceptance Criteria

  1. Suspicious Patterns Defined
    • A default set of suspicious commands is included.
    • Users can add their own via a config or UI.
  2. Flagging & Approval Flow
    • Chat-based usage: Suspicious commands are highlighted with a caution or warning.
    • Agentic usage: Execution is blocked until the user explicitly confirms.
  3. Logging
    • All flagged commands are recorded in a local log with timestamps and user actions (approve/deny).
  4. Performance
    • The detection/flagging process does not introduce significant lag to normal AI interactions.
  5. Documentation
    • Clear instructions on how to manage suspicious command lists, enable/disable the feature, and interpret the logs.
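
The logging criterion could be met with something as small as an append-only JSON Lines file. This is a sketch under assumptions: the record fields, the `log_decision` / `read_log` helpers, and the flat-file storage are invented for illustration, and the real feature is expected to go through the existing DB and dashboard instead.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

VALID_ACTIONS = {"approve", "deny"}


def log_decision(log_path, command, rules, action):
    """Append one audit record (hypothetical schema) and return it."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {sorted(VALID_ACTIONS)}")
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "command": command,
        "rules": sorted(rules),
        "action": action,
    }
    # One JSON object per line keeps the file easy to append to and to grep.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def read_log(log_path):
    """Load all audit records, skipping blank lines."""
    with Path(log_path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    log_decision("suspicious_commands.jsonl", "curl | bash", ["pipe-to-shell"], "deny")
    for entry in read_log("suspicious_commands.jsonl"):
        print(entry["timestamp"], entry["action"], entry["command"])
```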

Additional Notes

  • Security Considerations: This feature aims to reduce the chance of accidentally running high-risk commands, but it does not replace general best practices (e.g., running commands in a sandbox or dev environment first), and it may not catch every risky command.
@lukehinds
Contributor

lukehinds commented Dec 9, 2024

@lukehinds marked for roadmap planning

@lukehinds lukehinds added feature and removed feature labels Dec 15, 2024
@lukehinds lukehinds changed the title from "Catch suspicious shell commands in LLM outputs and warn about them" to "Suspicious Command Detection" Jan 11, 2025
@lukehinds lukehinds removed this from the 0.1.0-alpha.2 (public) milestone Jan 11, 2025
@lukehinds lukehinds self-assigned this Jan 13, 2025
@lukehinds lukehinds assigned poppysec and therealnb and unassigned lukehinds Jan 29, 2025
@lukehinds
Contributor

I figure we need to do some re-planning around this work? I heard @jhrozek might believe some client work should land first?

@jhrozek
Contributor Author

jhrozek commented Mar 3, 2025

I figure we need to do some re-planning around this work? I heard @jhrozek might believe some client work should land first?

If we want to catch only suspicious tool calls, then we need a way to detect them; we need to land the branch we've been hacking on with @blkt first.

@therealnb
Contributor

Note: #1151 landed. It is an open question whether we want to disable it again.

@lukehinds
Contributor

Awaiting client work
