@billxinli billxinli commented Dec 1, 2025

Summary

This PR adds telemetry functionality to the Socket CLI to track usage patterns, performance metrics, and errors. The implementation includes instrumentation across CLI commands, subprocess executions, and API interactions.

Telemetry Infrastructure

  • Organization-scoped tracking: Every event requires an organization context; nothing is tracked without one
  • Event batching: Configurable batch sizes with periodic flushing (500ms intervals)
  • Graceful degradation: Telemetry failures never block CLI execution
  • Session tracking: Unique session IDs per CLI invocation
  • Privacy-first: Comprehensive PII sanitization (tokens, file paths, package names)
  • Queue size limiting: Max 1,000 queued events to prevent unbounded memory growth during API outages
  • Timeout protection: 2-second max flush time prevents hanging on exit
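
The batching and queue-limit bullets above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual service: the class name `BoundedEventQueue`, its methods, and the drop-newest policy when the queue is full are all assumptions made for the example.

```typescript
// Hypothetical sketch; names and the drop-newest policy are assumptions.
interface QueuedEvent {
  event_type: string
  timestamp: number
}

class BoundedEventQueue {
  private events: QueuedEvent[] = []
  private dropped = 0
  private readonly batchSize: number
  private readonly maxQueueSize: number

  constructor(batchSize: number, maxQueueSize: number) {
    this.batchSize = batchSize
    this.maxQueueSize = maxQueueSize
  }

  /** Queues an event; returns false if the queue was full and the event was dropped. */
  enqueue(event: QueuedEvent): boolean {
    if (this.events.length >= this.maxQueueSize) {
      this.dropped += 1
      return false
    }
    this.events.push(event)
    return true
  }

  /** True once enough events are queued to fill one batch. */
  shouldFlush(): boolean {
    return this.events.length >= this.batchSize
  }

  /** Removes and returns up to batchSize events, oldest first. */
  takeBatch(): QueuedEvent[] {
    return this.events.splice(0, this.batchSize)
  }

  get size(): number {
    return this.events.length
  }

  get droppedCount(): number {
    return this.dropped
  }
}
```

With the configured values this would be constructed as `new BoundedEventQueue(10, 1_000)`; a periodic timer would call `takeBatch()` every 500ms regardless of `shouldFlush()`.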

Event Types Tracked

  • CLI lifecycle: cli_start, cli_complete, cli_error
  • Subprocess execution: subprocess_start, subprocess_complete, subprocess_error
  • API interactions: api_request, api_response, api_error
  • Custom events: Generic event tracking with metadata support
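
As a rough illustration of how these event names might be typed, the sketch below models them as a string union and refuses to build an event without an organization. The field names (`session_id`, `org_slug`, `metadata`) and the `createEvent` helper are assumptions for the example, not taken from the PR.

```typescript
// Hypothetical types; the PR's real definitions live in utils/telemetry/* and may differ.
type TelemetryEventType =
  | 'cli_start'
  | 'cli_complete'
  | 'cli_error'
  | 'subprocess_start'
  | 'subprocess_complete'
  | 'subprocess_error'
  | 'api_request'
  | 'api_response'
  | 'api_error'
// Custom events would widen this union to arbitrary strings.

interface TelemetryContext {
  sessionId: string
  orgSlug: string
}

interface TelemetryEvent {
  event_type: TelemetryEventType
  session_id: string
  org_slug: string
  timestamp: string
  metadata: Record<string, unknown>
}

/** Builds an event, or returns undefined when there is no organization context. */
function createEvent(
  eventType: TelemetryEventType,
  context: TelemetryContext | undefined,
  metadata: Record<string, unknown> = {},
): TelemetryEvent | undefined {
  if (!context || context.orgSlug === '') {
    return undefined
  }
  return {
    event_type: eventType,
    session_id: context.sessionId,
    org_slug: context.orgSlug,
    timestamp: new Date().toISOString(),
    metadata,
  }
}
```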

PII Sanitization

  • API tokens: Redacts sktsec_* tokens and hex tokens
  • File paths: Replaces home directory with ~
  • Package names: Strips package arguments after wrapper CLIs
  • Sensitive flags: Redacts values after --api-token, --token, -t

Example Sanitization

Input:  ['node', 'socket', 'npm', 'install', '@my/private-pkg', '--token', 'sktsec_abc123']
Output: ['npm', 'install']  // Package name and token removed
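
A sanitizer that reproduces the example above might look like the following. It is a sketch under stated assumptions: the function name `sanitizeArgv`, the `[REDACTED]` placeholder, the rule of keeping only the wrapper and its subcommand, and the explicit `homeDir` parameter are all hypothetical, and the hex-token pattern mentioned in the list is omitted for brevity.

```typescript
// Hypothetical sketch; not the PR's actual implementation.
const WRAPPER_CLIS = new Set(['npm', 'npx', 'pnpm', 'yarn'])
const SENSITIVE_FLAGS = new Set(['--api-token', '--token', '-t'])
const TOKEN_PATTERN = /sktsec_[A-Za-z0-9_-]+/g
const REDACTED = '[REDACTED]'

function sanitizeArgv(argv: readonly string[], homeDir: string): string[] {
  // Drop the runtime and script entries ('node', 'socket').
  const args = argv.slice(2)
  const wrapper = args[0]

  // For wrapper CLIs, keep only the wrapper and its subcommand so that
  // package names and any trailing flags never leave the machine.
  if (wrapper !== undefined && WRAPPER_CLIS.has(wrapper)) {
    const sub = args[1]
    // npx takes a package name directly, so nothing after it is kept.
    if (wrapper !== 'npx' && sub !== undefined && !sub.startsWith('-')) {
      return [wrapper, sub]
    }
    return [wrapper]
  }

  const out: string[] = []
  for (let i = 0; i < args.length; i++) {
    const arg = args[i]
    if (arg === undefined) {
      continue
    }
    if (SENSITIVE_FLAGS.has(arg)) {
      out.push(arg)
      if (i + 1 < args.length) {
        // Replace the flag's value and skip over it.
        out.push(REDACTED)
        i += 1
      }
      continue
    }
    // Replace the home directory with ~ and redact inline tokens.
    const withoutHome = homeDir ? arg.split(homeDir).join('~') : arg
    out.push(withoutHome.replace(TOKEN_PATTERN, REDACTED))
  }
  return out
}
```

In a real CLI the second argument would typically come from `os.homedir()`; it is a parameter here so the function stays pure and easy to test.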

Telemetry Configuration

  const TELEMETRY_SERVICE_CONFIG = {
    batch_size: 10,           // Events per batch
    flush_interval: 500,      // 0.5 second periodic flush
    flush_timeout: 2_000,     // 2 second max flush duration
    max_queue_size: 1_000,    // Memory leak protection
  }
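
The `flush_timeout` setting suggests the final flush is raced against a timer so a slow or unreachable API cannot hold up process exit. One way to sketch that, assuming a hypothetical `flushWithTimeout` helper and a caller-supplied `flush` function:

```typescript
// Hypothetical helper; the name and return values are assumptions.
type FlushResult = 'flushed' | 'timed_out' | 'failed'

async function flushWithTimeout(
  flush: () => Promise<void>,
  timeoutMs: number,
): Promise<FlushResult> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<FlushResult>(resolve => {
    timer = setTimeout(() => resolve('timed_out'), timeoutMs)
  })
  try {
    // Whichever settles first wins; a late flush result is ignored.
    return await Promise.race([
      flush().then((): FlushResult => 'flushed'),
      timeout,
    ])
  } catch {
    // Telemetry failures are swallowed so they never block the CLI.
    return 'failed'
  } finally {
    // Clear the timer so it does not keep the event loop alive.
    clearTimeout(timer)
  }
}
```

With the configuration above this would be called as `await flushWithTimeout(sendPendingEvents, 2_000)`, where `sendPendingEvents` stands in for whatever actually posts the queued batch.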

Breaking Changes

None. Telemetry is opt-in via organization configuration and fails gracefully.


Note

Introduces org-scoped telemetry across the CLI, SDK, and package manager wrappers with sanitization, batching/flush, global error handling, and comprehensive tests; also updates ecosystems and bumps the SDK.

  • Telemetry Infrastructure:
    • Add utils/telemetry/* (integration, service, types) for org-scoped events, argv/error sanitization, session IDs, batching, periodic flush, and timeouts.
  • CLI:
    • Instrument src/cli.mts to track cli_start, cli_complete, cli_error; ensure finalizeTelemetry(); add handlers for uncaught exceptions and unhandled rejections.
  • Package Manager Wrappers (npm, npx, pnpm, yarn):
    • Track subprocess start/exit in cmd-*.mts and flush telemetry before process exit.
  • SDK Integration:
    • In utils/sdk.mts, add request/response hooks to emit api_request, api_response, api_error (skipping telemetry endpoints) with optional debug logging.
  • Ecosystem Updates:
    • Extend ALL_ECOSYSTEMS (e.g., alpm, qpkg, vscode) and comment out strict type check due to temporary registry/SDK mismatch.
  • Tests:
    • Add unit tests for CLI, SDK hooks, telemetry integration, and service (src/test/*.mts, utils/telemetry/*.test.mts).
  • Dependencies:
    • Bump @socketsecurity/sdk to 1.4.95 (lockfile updated).

Written by Cursor Bugbot for commit 1f673f5.

Comment on lines +34 to +36
// Temporarily commented out due to dependency version mismatch.
// SDK has "alpm" but registry's EcosystemString doesn't yet.
// type MissingInEcosystemString = Exclude<PURL_Type, EcosystemString>
Author

The SDK was synced to the latest version of the OAS spec.

@billxinli billxinli requested a review from jdalton December 1, 2025 16:59
hooks: {
  onRequest: (info: RequestInfo) => {
    // Skip tracking for telemetry submission endpoints to prevent infinite loop.
    const isTelemetryEndpoint = info.url.includes('/telemetry')
Author

The SDK calls the hooks on some endpoints and not others. I am not sure whether this is the intended pattern. (Should only some endpoints and functions get hooks? If so, that could simplify this awkward logic.)
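
For context, the guard under discussion can be isolated into a small factory like the one below. This is only a sketch: `HookRequestInfo`, `createRequestHook`, and the `track` callback are hypothetical names, and the substring check simply mirrors the `info.url.includes('/telemetry')` test in the diff.

```typescript
// Hypothetical shapes; the SDK's real hook types may differ.
interface HookRequestInfo {
  method: string
  url: string
}

type TrackFn = (eventType: 'api_request', metadata: Record<string, unknown>) => void

function isTelemetryEndpoint(url: string): boolean {
  // Same substring check as the diff above.
  return url.includes('/telemetry')
}

function createRequestHook(track: TrackFn): (info: HookRequestInfo) => void {
  return (info: HookRequestInfo): void => {
    // Tracking the telemetry submission itself would enqueue another event,
    // which would trigger another submission, and so on.
    if (isTelemetryEndpoint(info.url)) {
      return
    }
    track('api_request', { method: info.method, url: info.url })
  }
}
```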

@billxinli billxinli marked this pull request as ready for review December 9, 2025 20:13
socket-security-staging bot commented Dec 9, 2025

Review the following changes in direct dependencies.

Diff: Updated
Package: npm/@socketsecurity/sdk@1.4.94 ⏵ 1.4.95
Scores: Supply Chain Security 100, Vulnerability 100, Quality 100, Maintenance 100, License 100

socket-security bot commented Dec 9, 2025

Review the following changes in direct dependencies.

Diff: Updated
Package: npm/@socketsecurity/sdk@1.4.94 ⏵ 1.4.95
Scores: Supply Chain Security 100, Vulnerability 100, Quality 100, Maintenance 100, License 100

@cursor cursor bot left a comment

Comment @cursor review or bugbot run to trigger another review on this PR

@billxinli
Author

@cursor review

@cursor cursor bot left a comment

  await trackSubprocessError(command, startTime, error, exitCode)
} else if (exitCode === 0) {
  await trackSubprocessComplete(command, startTime, exitCode)
}

Bug: Signal-terminated subprocesses not tracked in telemetry

The trackSubprocessExit function only tracks events when exitCode !== null && exitCode !== 0 (error) or exitCode === 0 (complete). When a subprocess is killed by a signal, Node.js sets code to null and signalName to the signal. This case falls through without tracking any telemetry event, leaving a gap in subprocess tracking for signal-terminated processes.
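
One possible way to close this gap is to classify the exit first and then track each outcome explicitly. The sketch below is hypothetical (`classifySubprocessExit` and the `ExitOutcome` shape are not from the PR); it relies only on the documented behavior of the Node.js child process `'exit'` event, which passes a `code` and a `signal`, one of which is non-null.

```typescript
// Hypothetical helper; how signal exits should be reported is a design choice.
type ExitOutcome =
  | { kind: 'complete'; exitCode: 0 }
  | { kind: 'error'; exitCode: number }
  | { kind: 'signal'; signal: string }
  | { kind: 'unknown' }

function classifySubprocessExit(
  code: number | null,
  signal: string | null,
): ExitOutcome {
  if (code === 0) {
    return { kind: 'complete', exitCode: 0 }
  }
  if (code !== null) {
    return { kind: 'error', exitCode: code }
  }
  if (signal !== null) {
    // Killed by a signal: code is null, so this branch is what the
    // original if/else chain falls through without tracking.
    return { kind: 'signal', signal }
  }
  return { kind: 'unknown' }
}
```

A caller could then map `'signal'` to `subprocess_error` with the signal name in the metadata, so that every subprocess start has a matching end event.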

  // eslint-disable-next-line n/no-process-exit
  process.exit(1)
})

Bug: Async handlers for process events may not complete

Using async handlers with process.on('uncaughtException') and process.on('unhandledRejection') is problematic because Node.js does not await the promise that an async handler returns. If the async telemetry calls (trackCliError, finalizeTelemetry) fail or take time, the process may exit before they complete. Additionally, any error thrown inside these async handlers is not caught, which can cause secondary unhandled rejections.

Author

This is where we await the telemetry flush before exiting with code 1.
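
A handler shape that addresses both points in this thread (awaiting the flush, and not leaking errors from the handler itself) could look like the sketch below. The factory `createFatalHandler` and its injected dependencies are hypothetical; in the CLI they would presumably wrap trackCliError, finalizeTelemetry, and process.exit.

```typescript
// Hypothetical sketch; dependency names are assumptions for illustration.
interface FatalHandlerDeps {
  trackError: (error: unknown) => Promise<void>
  finalize: () => Promise<void>
  exit: (code: number) => void
}

function createFatalHandler(deps: FatalHandlerDeps): (error: unknown) => void {
  // The returned handler is synchronous, so the event emitter never receives
  // a promise that could reject unobserved.
  return (error: unknown): void => {
    void (async () => {
      try {
        await deps.trackError(error)
      } catch {
        // A telemetry failure must not mask the original error.
      }
      try {
        await deps.finalize()
      } catch {
        // Same here: the flush is best effort.
      } finally {
        // Exit only after tracking and flushing have settled.
        deps.exit(1)
      }
    })()
  }
}

// Possible wiring in the CLI entry point (not executed here):
//   const onFatal = createFatalHandler({
//     trackError: e => trackCliError(e),
//     finalize: () => finalizeTelemetry(),
//     exit: code => process.exit(code),
//   })
//   process.on('uncaughtException', onFatal)
//   process.on('unhandledRejection', onFatal)
```

Because the flush is already bounded by the 2-second timeout, the exit is delayed by at most that long even when the API is unreachable.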
