Feature/rule engine redesign #685

YakirOren · 2026-01-14T07:35:27Z

Summary by CodeRabbit

New Features
- Multi-context monitoring across Kubernetes, Host, and Standalone environments.
- Host and Standalone monitoring options added (configurable flags).
- Context-aware rule evaluation so rules run based on event source.
- Improved host discovery via a virtual-host representation to enable host-scoped checks.
Improvements
- Enhanced cloud metadata lookup with provider fallbacks (DigitalOcean, GCP, Azure).
Documentation
- Added design doc detailing the multi-context rule engine.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Copilot

Pull request overview

This PR implements a comprehensive rule engine redesign to support multiple execution contexts beyond Kubernetes: host-level monitoring and standalone container monitoring. The changes introduce a context detection system that automatically identifies the source of events (K8s, Host, or Standalone) and routes them appropriately through the rule engine.

Changes:

Introduces a context detection framework with registry and detector system to classify events by source (K8s, Host, Standalone)
Adds configuration options for host and standalone monitoring with context-specific rule filtering
Enables host monitoring by creating a virtual container representing the host's mount namespace
Extends cloud metadata fetching with fallback support for DigitalOcean, GCP, and Azure metadata services

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
pkg/contextdetection/types.go	Defines core types for context detection (EventSourceContext, ContextInfo, ContextDetector interfaces)
pkg/contextdetection/registry.go	Implements mount namespace registry for mapping containers to their execution contexts
pkg/contextdetection/detectors/*.go	Implements detector chain (K8s, Host, Standalone) with priority-based context detection
pkg/rulemanager/rule_manager.go	Integrates context detection into rule evaluation pipeline with context-aware rule filtering
pkg/rulemanager/rulepolicy.go	Adds RuleAppliesToContext function for tag-based context filtering
pkg/rulemanager/ruleadapters/creator.go	Enriches rule failures with context-specific fields (hostname, container details)
pkg/rulemanager/types/failure.go	Extends GenericRuleFailure with SourceContext field
pkg/ebpf/events/enriched_event.go	Adds SourceContext and MountNamespaceID fields to track event origins
pkg/utils/datasource_event.go	Adds nil-safe handling for field accessors and fallback path resolution
pkg/containerwatcher/v2/container_watcher_collection.go	Creates virtual host container for host-level event processing
pkg/containerwatcher/v2/containercallback.go	Skips shared data setup for virtual host container
pkg/containerwatcher/v2/tracers/*.go	Adds "operator.LocalManager.host" parameter to all tracers for host monitoring
pkg/cloudmetadata/metadata.go	Implements cloud provider fallback detection for DigitalOcean, GCP, and Azure
pkg/config/config.go	Adds HostMonitoringEnabled and StandaloneMonitoringEnabled configuration flags
configuration/config.json	Sets default values for host and standalone monitoring flags
cmd/main.go	Initializes MntnsRegistry and passes to RuleManager

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pkg/utils/datasource_event.go

pkg/rulemanager/rule_manager.go

pkg/contextdetection/detectors/host.go

pkg/rulemanager/rulepolicy.go

pkg/cloudmetadata/metadata.go

pkg/rulemanager/ruleadapters/creator.go

pkg/containerwatcher/v2/container_watcher_collection.go

pkg/rulemanager/containercallbacks.go

Copilot

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated 14 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pkg/contextdetection/detectors/manager.go

pkg/contextdetection/registry.go

pkg/rulemanager/rulepolicy.go

pkg/cloudmetadata/metadata.go

pkg/contextdetection/detectors/host.go

pkg/utils/datasource_event.go

Copilot

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 9 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

pkg/contextdetection/detectors/standalone.go

pkg/contextdetection/registry.go

pkg/rulemanager/rule_manager.go

pkg/rulemanager/ruleadapters/creator.go

pkg/cloudmetadata/metadata.go

pkg/containerwatcher/v2/container_watcher_collection.go

pkg/rulemanager/rulepolicy.go

pkg/rulemanager/containercallbacks.go

coderabbitai · 2026-01-16T07:11:33Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds a multi-context detection and enrichment system (Kubernetes, Host, Standalone), a mount-namespace registry, context-aware rule evaluation and gating, host-as-container virtual host monitoring, tracer local-manager parameters, cloud metadata IMDS fallback, and related config flags and event accessor resilience.

Changes

Cohort / File(s)	Summary
Context Detection Framework `pkg/contextdetection/types.go`, `pkg/contextdetection/registry.go`, `pkg/contextdetection/detectors/*.go`	New ContextInfo/ContextDetector interfaces and EventSourceContext constants; MntnsRegistry implementation; K8s, Host, Standalone detectors; DetectorManager orchestrating prioritized detection.
Event Enrichment & Rule Manager `pkg/ebpf/events/enriched_event.go`, `pkg/rulemanager/rule_manager.go`, `pkg/rulemanager/containercallbacks.go`, `pkg/rulemanager/ruleadapters/creator.go`, `pkg/rulemanager/rulepolicy.go`, `pkg/rulemanager/types/failure.go`	EnrichedEvent gains SourceContext and MountNamespaceID; RuleManager now accepts mntnsRegistry and detectorManager, enriches events with context, applies context-aware rule filtering/gating, persists SourceContext to rule failures, and maps Host/Standalone metadata into runtime alerts.
Container & Host Monitoring `pkg/containerwatcher/v2/container_watcher_collection.go`, `pkg/containerwatcher/v2/containercallback.go`	Added GetHostAsContainer() to synthesize a host container and register it when HostMonitoringEnabled; container callback skips shared-data setup for virtual host (PID==1); publishes add event to trigger detection.
Tracer Local Manager Support `pkg/containerwatcher/v2/tracers/...`	All tracers now pass parameter `"operator.LocalManager.host":"true"` to RunGadget to tolerate nil container-collection under local manager. Minor behavioral guard change in OpenTracer callback.
Cloud Metadata Fallback `pkg/cloudmetadata/metadata.go`	Adds HTTP helpers, metadata timeout, and provider-specific fetchers (DigitalOcean, GCP, Azure) as IMDS fallback chain with improved error wrapping/logging.
Configuration & Documentation `configuration/config.json`, `pkg/config/config.go`, `docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md`	Adds `HostMonitoringEnabled` and `StandaloneMonitoringEnabled` flags (default false) and extensive design doc for multi-context redesign.
Initialization & Utilities `cmd/main.go`, `pkg/utils/datasource_event.go`	cmd/main.go imports contextdetection and creates/passes mntnsRegistry into CreateRuleManager (signature updated). Datasource event utilities add nullFieldAccessor and missing-field handling to avoid panics on absent fields.

Sequence Diagram

sequenceDiagram
    participant CW as ContainerWatcher
    participant DM as DetectorManager
    participant MR as MntnsRegistry
    participant RM as RuleManager
    participant RE as RuleEvaluator

    CW->>DM: DetectContext(container)
    DM->>DM: K8sDetector.Detect()
    alt K8s found
        DM-->>CW: K8sContextInfo
    else Host found
        DM-->>CW: HostContextInfo
    else Standalone found
        DM-->>CW: StandaloneContextInfo
    end

    CW->>MR: Register(mntnsID, ContextInfo)
    MR-->>CW: OK

    Note over CW,RM: Event reported with MountNamespaceID
    CW->>RM: ReportEnrichedEvent(event)
    RM->>MR: Lookup(event.MountNamespaceID)
    MR-->>RM: ContextInfo
    RM->>RM: enrichEventWithContext()
    RM->>RE: Evaluate(event, context)
    RE->>RE: RuleAppliesToContext() & monitoring enabled check
    RE-->>RM: Matched rule results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰
I sniffed the mounts and hopped around,
K8s, Host, Standalone — contexts found,
Registries mapped each namespace song,
Rules now listen where events belong,
A happy rabbit hops—enriched and proud!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Feature/rule engine redesign' accurately summarizes the main change, which is a comprehensive redesign of the rule engine to support multi-context execution (Kubernetes, Host, Standalone) with context detection and mount namespace registry capabilities.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md`:
- Around line 29-31: The Go code blocks contain hard tab characters (e.g., lines
defining Kubernetes/Host/Standalone as EventSourceContext and other blocks at
the ranges noted); replace all hard tabs inside those fenced code blocks with
spaces so markdownlint MD010 is satisfied. Locate the Go snippets that define
EventSourceContext constants and similar blocks (symbols like
EventSourceContext, Kubernetes, Host, Standalone and other constant/grouped
lines at the noted ranges) and convert indentation and separators to use spaces
instead of \t throughout each code fence.

In `@pkg/containerwatcher/v2/container_watcher_collection.go`:
- Around line 109-128: The AddContainer return value is not checked: after
calling cw.containerCollection.AddContainer(virtualHostContainer) you must
handle failure by checking its error, logging an explicit error (use
logger.L().Error with helpers.Error(err) and include virtualHostContainer
identifiers like Mntns and Runtime.ContainerPID), and not invoking
cw.containerCallback or logging the success Info when AddContainer failed; if
the enclosing function returns an error propagate/return that error, otherwise
disable/skipping host-monitoring steps (e.g., clear cw.cfg.HostMonitoringEnabled
or set a failure flag) so the system does not silently degrade. Ensure you
modify the block around GetHostAsContainer, cw.containerCollection.AddContainer,
cw.containerCallback and the logger calls accordingly.

♻️ Duplicate comments (4)

pkg/cloudmetadata/metadata.go (2)

137-140: DigitalOcean validation heuristic may produce false positives.

The string containment checks could match unrelated content. A more robust approach would be to attempt fetching a required endpoint (e.g., id) directly and validate based on success.

166-169: Empty check is fragile and needs updating if fields are added.

Consider adding a helper method like IsEmpty() on CloudMetadata or using reflection for maintainability.

pkg/contextdetection/registry.go (1)

31-34: Unused hostMntns field and missing SetHostMntns method.

This was already flagged in a previous review. The hostMntns field is declared but never used, and the referenced SetHostMntns method in the comment doesn't exist.

pkg/rulemanager/rule_manager.go (1)

167-174: Rule filtering for non-K8s contexts is incomplete (acknowledged TODO).

As noted in past reviews, non-K8s contexts load all rules via CreateAllRules() without filtering. This could impact performance in environments with many rules. The TODO acknowledges this limitation.

Consider adding a debug/warning log when using unfiltered rules to help operators understand potential performance implications.

🧹 Nitpick comments (13)

pkg/cloudmetadata/metadata.go (1)
92-117: Consider reusing http.Client for connection pooling.

Creating a new http.Client on each call to fetchHTTPMetadata prevents connection reuse and adds overhead. http.Client is safe for concurrent use and should be reused.
♻️ Suggested refactor

Define a package-level client:
 const (
 	azureApiVersion = "2021-12-13"
 	metadataTimeout = 2 * time.Second
 )

+var metadataHTTPClient = &http.Client{
+	Timeout: metadataTimeout,
+}
Then use it in the function:
 func fetchHTTPMetadata(ctx context.Context, url string, headers map[string]string) (string, error) {
-	client := &http.Client{
-		Timeout: metadataTimeout,
-	}
 	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
 	if err != nil {
 		return "", err
 	}
 	for k, v := range headers {
 		req.Header.Set(k, v)
 	}
-	resp, err := client.Do(req)
+	resp, err := metadataHTTPClient.Do(req)
pkg/containerwatcher/v2/containercallback.go (1)
63-67: Make virtual-host detection explicit and verify the PID sentinel.
ContainerPID == 1 is a magic number; if any legitimate container can report PID 1 on a supported runtime/hostPID setup, shared data won’t be set and downstream enrichment may break. Please verify the sentinel across runtimes and consider centralizing it (const/helper or explicit host-context flag) for clarity.
♻️ Suggested refactor
+const virtualHostPID = 1
...
- if notif.Container.Runtime.ContainerPID == 1 {
+ if notif.Container.Runtime.ContainerPID == virtualHostPID {
pkg/containerwatcher/v2/tracers/iouring.go (1)
85-88: LGTM!

Consistent with the PR-wide pattern for enabling local-manager aware gadget execution.

Consider extracting the common parameter key and value into a package-level constant or helper function to reduce duplication across all tracers:
♻️ Optional: Extract common params
// In a shared location (e.g., tracers/common.go)
const LocalManagerHostParam = "operator.LocalManager.host"

func DefaultGadgetParams() map[string]string {
    return map[string]string{
        LocalManagerHostParam: "true",
    }
}
This would centralize the parameter definition and make future updates easier across all tracers.
pkg/containerwatcher/v2/tracers/open.go (1)

72-75: Refactor config initialization pattern to avoid implicit ordering dependency.

The cfg field is assigned in IsEnabled() (line 104) but used in Start() (line 73), creating a fragile ordering dependency. While TracerManager ensures IsEnabled() is called before Start(), this pattern is inconsistent with all other tracers and breaks if Start() is called directly. Store the config during tracer initialization in NewOpenTracer() instead of in IsEnabled().

pkg/contextdetection/detectors/k8s.go (2)

10-15: Workload and WorkloadUID fields are declared but never populated.

The K8sContextInfo struct declares Workload and WorkloadUID fields, but the Detect() method (lines 42-45) only populates Namespace and PodName. If these fields are intended for future use, consider adding a TODO comment. Otherwise, remove them to avoid confusion.

25-31: The name field is unused.

The name field is initialized in NewK8sDetector() but never accessed. Consider removing it or utilizing it for logging/debugging purposes.

pkg/contextdetection/registry.go (1)

73-80: Minor: Same check-then-act pattern in Unregister.

Similar to Register, the Has() + Delete() pattern isn't atomic. However, this is less impactful here since deleting a non-existent key is typically idempotent. The main effect is that the debug log might be skipped in a race scenario.
pkg/rulemanager/ruleadapters/creator.go (2)
337-343: Consider logging or handling the hostname error.

When os.Hostname() fails, the error is silently ignored and k8sDetails.NodeName remains empty. While this may be acceptable for non-critical metadata, consider logging the error at debug level for troubleshooting purposes.
Proposed improvement
 	case contextdetection.Host:
 		ruleFailure.SetSourceContext(contextdetection.Host)
 		// For Host context, use NodeName to store the hostname
 		hostname, err := os.Hostname()
-		if err == nil {
+		if err != nil {
+			logger.L().Debug("setContextSpecificFields - failed to get hostname", helpers.Error(err))
+		} else {
 			k8sDetails.NodeName = hostname
 		}
345-361: Redundant field population for Standalone context.

For Standalone context, this code sets Image and ContainerName from triggerEvent (lines 354-360), but setRuntimeAlertK8sDetails (called earlier at line 88) already populates these same fields from triggerEvent at lines 264-286. The checks for empty strings prevent overwriting, but these assignments will never execute since the fields are already set.

This redundancy could cause confusion about the source of truth for these fields. Consider either:

Removing the redundant assignments here, or

Adding a comment explaining this is intentional fallback logic
pkg/contextdetection/detectors/manager.go (1)
21-34: Detector ordering is hardcoded despite Priority() methods.

The detectors have Priority() methods (K8s=0, Host=1, Standalone=2), but the ordering in NewDetectorManager is hardcoded rather than sorted by priority. This creates potential inconsistency if priorities change.

Consider either:

Sorting detectors by Priority() after initialization, or

Removing unused Priority() methods from detectors
Option 1: Sort by priority
 func NewDetectorManager(registry contextdetection.Registry) *DetectorManager {
 	dm := &DetectorManager{
 		registry: registry,
 		detectors: []contextdetection.ContextDetector{
 			NewK8sDetector(),
 			NewHostDetector(),
 			NewStandaloneDetector(),
 		},
 	}
+	
+	// Sort detectors by priority (lower = higher priority)
+	sort.Slice(dm.detectors, func(i, j int) bool {
+		return dm.detectors[i].Priority() < dm.detectors[j].Priority()
+	})

 	return dm
 }
pkg/contextdetection/detectors/standalone.go (1)
17-25: WorkloadID logic can be simplified.

The current logic is correct but verbose. The past review comment suggested a cleaner approach:
Simplified WorkloadID implementation
 func (s *StandaloneContextInfo) WorkloadID() string {
-	if s.ContainerID == "" {
-		return s.ContainerName
-	}
-	if s.ContainerName != "" {
-		return s.ContainerName + ":" + s.ContainerID
-	}
-	return s.ContainerID
+	switch {
+	case s.ContainerName == "":
+		return s.ContainerID
+	case s.ContainerID == "":
+		return s.ContainerName
+	default:
+		return s.ContainerName + ":" + s.ContainerID
+	}
 }
pkg/rulemanager/rule_manager.go (1)
187-190: Silent early exit when monitoring is disabled.

When monitoring is disabled for a context, the function returns silently without logging. Consider adding a debug log to help operators understand why events are being dropped.
Add debug logging for disabled monitoring
 	// Early exit if monitoring is disabled for this context - skip rule evaluation
 	if !rm.isMonitoringEnabledForContext(enrichedEvent.SourceContext) {
+		logger.L().Debug("RuleManager - monitoring disabled for context, skipping rule evaluation",
+			helpers.String("context", string(enrichedEvent.SourceContext.Context())))
 		return
 	}
pkg/utils/datasource_event.go (1)
86-171: Return an error from Set on missing fields.

Set currently returns nil, making writes to absent fields appear successful. All other error-returning methods on nullFieldAccessor (e.g., PutUint8, PutString, Rename) return errFieldNotFound, so Set should too for consistency and to avoid silent failures.
♻️ Proposed change
-func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return nil }
+func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return errFieldNotFound }

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2681b78 and 20ace02.

📒 Files selected for processing (38)

cmd/main.go
configuration/config.json
docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md
pkg/cloudmetadata/metadata.go
pkg/config/config.go
pkg/containerwatcher/v2/container_watcher_collection.go
pkg/containerwatcher/v2/containercallback.go
pkg/containerwatcher/v2/tracers/bpf.go
pkg/containerwatcher/v2/tracers/capabilities.go
pkg/containerwatcher/v2/tracers/dns.go
pkg/containerwatcher/v2/tracers/exec.go
pkg/containerwatcher/v2/tracers/exit.go
pkg/containerwatcher/v2/tracers/fork.go
pkg/containerwatcher/v2/tracers/hardlink.go
pkg/containerwatcher/v2/tracers/http.go
pkg/containerwatcher/v2/tracers/iouring.go
pkg/containerwatcher/v2/tracers/kmod.go
pkg/containerwatcher/v2/tracers/network.go
pkg/containerwatcher/v2/tracers/open.go
pkg/containerwatcher/v2/tracers/ptrace.go
pkg/containerwatcher/v2/tracers/randomx.go
pkg/containerwatcher/v2/tracers/ssh.go
pkg/containerwatcher/v2/tracers/symlink.go
pkg/containerwatcher/v2/tracers/syscall.go
pkg/containerwatcher/v2/tracers/unshare.go
pkg/contextdetection/detectors/host.go
pkg/contextdetection/detectors/k8s.go
pkg/contextdetection/detectors/manager.go
pkg/contextdetection/detectors/standalone.go
pkg/contextdetection/registry.go
pkg/contextdetection/types.go
pkg/ebpf/events/enriched_event.go
pkg/rulemanager/containercallbacks.go
pkg/rulemanager/rule_manager.go
pkg/rulemanager/ruleadapters/creator.go
pkg/rulemanager/rulepolicy.go
pkg/rulemanager/types/failure.go
pkg/utils/datasource_event.go

🧰 Additional context used

🧬 Code graph analysis (12)

pkg/rulemanager/rulepolicy.go (2)

pkg/rulemanager/types/v1/types.go (1)

Rule (20-35)

pkg/contextdetection/types.go (3)

ContextInfo (15-18)

EventSourceContext (7-7)

Kubernetes (10-10)

pkg/rulemanager/types/failure.go (1)

pkg/contextdetection/types.go (1)

EventSourceContext (7-7)

pkg/contextdetection/detectors/standalone.go (1)

pkg/contextdetection/types.go (3)

EventSourceContext (7-7)

Standalone (12-12)

ContextInfo (15-18)

pkg/ebpf/events/enriched_event.go (2)

pkg/utils/events.go (1)

K8sEvent (21-29)

pkg/contextdetection/types.go (1)

ContextInfo (15-18)

pkg/rulemanager/ruleadapters/creator.go (5)

pkg/objectcache/applicationprofilecache_interface.go (1)

ApplicationProfileCache (9-14)

pkg/objectcache/networkneighborhoodcache_interface.go (1)

NetworkNeighborhoodCache (8-12)

pkg/rulemanager/types/failure.go (1)

GenericRuleFailure (15-28)

pkg/ebpf/events/enriched_event.go (1)

EnrichedEvent (21-36)

pkg/contextdetection/types.go (3)

Kubernetes (10-10)

Host (11-11)

Standalone (12-12)

pkg/containerwatcher/v2/container_watcher_collection.go (1)

pkg/objectcache/shared_container_data.go (1)

Container (27-27)

pkg/contextdetection/registry.go (1)

pkg/contextdetection/types.go (1)

ContextInfo (15-18)

pkg/rulemanager/containercallbacks.go (1)

pkg/contextdetection/detectors/standalone.go (1)

StandaloneContextInfo (8-11)

pkg/contextdetection/detectors/k8s.go (1)

pkg/contextdetection/types.go (3)

EventSourceContext (7-7)

Kubernetes (10-10)

ContextInfo (15-18)

pkg/rulemanager/rule_manager.go (9)

pkg/contextdetection/registry.go (1)

Registry (17-28)

pkg/contextdetection/detectors/manager.go (2)

DetectorManager (16-19)

NewDetectorManager (23-34)

pkg/rulemanager/ruleadapters/creator.go (1)

NewRuleFailureCreator (49-57)

pkg/rulemanager/rulepolicy.go (1)

RuleAppliesToContext (38-59)

pkg/contextdetection/types.go (4)

Kubernetes (10-10)

ContextInfo (15-18)

Host (11-11)

Standalone (12-12)

pkg/utils/k8s.go (1)

CreateK8sPodID (20-22)

pkg/rulemanager/types/v1/types.go (1)

Rule (20-35)

pkg/ebpf/events/enriched_event.go (1)

EnrichedEvent (21-36)

pkg/utils/events.go (1)

EnrichEvent (31-49)

pkg/contextdetection/detectors/host.go (1)

pkg/contextdetection/types.go (3)

EventSourceContext (7-7)

Host (11-11)

ContextInfo (15-18)

pkg/utils/datasource_event.go (1)

pkg/utils/events.go (1)

EventType (190-190)

🪛 markdownlint-cli2 (0.18.1)

docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md

29-29: Hard tabs
Column: 1

(MD010, no-hard-tabs)

30-30: Hard tabs
Column: 1

(MD010, no-hard-tabs)

31-31: Hard tabs
Column: 1

(MD010, no-hard-tabs)

44-44: Hard tabs
Column: 1

(MD010, no-hard-tabs)

45-45: Hard tabs
Column: 1

(MD010, no-hard-tabs)

59-59: Hard tabs
Column: 1

(MD010, no-hard-tabs)

60-60: Hard tabs
Column: 1

(MD010, no-hard-tabs)

61-61: Hard tabs
Column: 1

(MD010, no-hard-tabs)

71-71: Hard tabs
Column: 1

(MD010, no-hard-tabs)

72-72: Hard tabs
Column: 1

(MD010, no-hard-tabs)

74-74: Hard tabs
Column: 1

(MD010, no-hard-tabs)

75-75: Hard tabs
Column: 1

(MD010, no-hard-tabs)

76-76: Hard tabs
Column: 1

(MD010, no-hard-tabs)

77-77: Hard tabs
Column: 1

(MD010, no-hard-tabs)

78-78: Hard tabs
Column: 1

(MD010, no-hard-tabs)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-and-push-image
GitHub Check: pr-created / test / Create cross-platform build

🔇 Additional comments (52)

pkg/cloudmetadata/metadata.go (5)

19-22: LGTM!

The constants are well-defined. The Azure API version has been updated to 2021-12-13, and the 2-second timeout is appropriate for metadata service calls.

61-90: LGTM!

The fallback mechanism is well-structured and extensible. The error wrapping with %w properly preserves the original error context for debugging.

119-125: LGTM!

The helper correctly extracts the last path segment and handles the empty string edge case.

174-197: LGTM!

The GCP metadata fetcher correctly uses the required Metadata-Flavor: Google header and validates instance presence via machine-type. The use of getLastPathPart properly normalizes the full resource paths.

199-224: LGTM!

The Azure metadata fetcher correctly uses the required Metadata: true header and the format=text parameter. The API version 2021-12-13 is appropriately recent.

pkg/containerwatcher/v2/tracers/symlink.go (1)

71-74: LGTM!

The addition of the operator.LocalManager.host parameter aligns with the PR-wide pattern to enable local-manager aware gadget execution. The inline comment clearly documents the purpose.

pkg/containerwatcher/v2/tracers/syscall.go (1)

68-72: LGTM!

Clean integration of the operator.LocalManager.host parameter into the existing params map while preserving the eBPF map fetch configuration.

pkg/containerwatcher/v2/tracers/kmod.go (1)

71-74: LGTM!

Consistent with the PR-wide pattern for enabling local-manager aware gadget execution.

pkg/containerwatcher/v2/tracers/hardlink.go (1)

71-74: LGTM!

Consistent with the PR-wide pattern for enabling local-manager aware gadget execution.

pkg/containerwatcher/v2/tracers/exec.go (1)

71-75: LGTM!

The addition of the operator.LocalManager.host parameter alongside the existing operator.oci.ebpf.paths parameter is consistent with the broader PR pattern to enable local-manager-aware gadget execution. The inline comment clearly documents the purpose.

pkg/containerwatcher/v2/tracers/network.go (1)

85-89: LGTM!

The params map correctly combines the existing operator.oci.annotate configuration with the new operator.LocalManager.host flag, maintaining backward compatibility while enabling local-manager-aware execution.

pkg/containerwatcher/v2/tracers/bpf.go (1)

71-74: LGTM!

The addition of the operator.LocalManager.host parameter is consistent with the other tracer implementations in this PR, enabling local-manager-aware gadget execution.

pkg/containerwatcher/v2/tracers/randomx.go (1)

68-71: LGTM!

The addition of the operator.LocalManager.host parameter is consistent with the other tracer implementations in this PR.

pkg/containerwatcher/v2/tracers/capabilities.go (1)

70-75: Good addition of local-manager parameter.

This keeps RunGadget params non-nil and aligns with local-manager handling expectations.

pkg/containerwatcher/v2/tracers/http.go (1)

85-88: LGTM: params map added for local-manager host.

pkg/containerwatcher/v2/tracers/ssh.go (1)

72-75: Looks good — local-manager param added.

pkg/containerwatcher/v2/tracers/exit.go (1)

67-70: LGTM: local-manager params added.

pkg/containerwatcher/v2/tracers/dns.go (1)

76-79: Good: keeps existing params and adds local-manager host.

pkg/containerwatcher/v2/tracers/ptrace.go (1)

67-70: LGTM!

The params map addition follows the consistent pattern established across other tracers in this PR, enabling local-manager-aware gadget execution without errors when container-collection is nil.

pkg/containerwatcher/v2/tracers/fork.go (1)

67-70: LGTM!

Consistent with the local-manager host parameter pattern applied across tracers.

pkg/containerwatcher/v2/tracers/unshare.go (1)

71-74: LGTM!

Consistent with the local-manager host parameter pattern applied across tracers.

pkg/containerwatcher/v2/container_watcher_collection.go (1)

175-194: LGTM!

The function is clean and well-documented. It correctly retrieves the mount namespace for PID 1 and creates a minimal synthetic container representation suitable for host-level event processing.

pkg/contextdetection/types.go (1)

1-23: LGTM!

Clean and minimal interface definitions that establish a well-structured foundation for the context detection system. The ContextDetector interface with Priority() enables flexible, priority-based detection ordering, and the ContextInfo abstraction cleanly separates context data from detection logic.

pkg/contextdetection/detectors/k8s.go (2)

33-48: LGTM!

The Detect method has proper nil checks and validates that both PodName and Namespace are non-empty before returning a valid context. The defensive programming approach is appropriate.

50-52: LGTM!

Priority 0 correctly establishes K8s detection as the highest priority detector.

pkg/contextdetection/registry.go (2)

67-71: LGTM!

The Lookup method is a clean delegation to the underlying SafeMap.Load().

51-58: Race condition between Has() and Set() in Register(), and between Has() and Delete() in Unregister().

The check-then-act patterns are not atomic. In concurrent scenarios, two goroutines could both pass the Has() check and proceed to Set() (or Delete()), potentially causing data races or silent overwrites. While SafeMap operations are individually thread-safe, the compound operations are not.

Unfortunately, goradd/maps.SafeMap does not provide atomic check-and-set methods like LoadOrStore. To fix this, either:

Protect the compound operations with an external sync.Mutex, or

Accept this limitation if registration/unregistration only occurs during initialization when concurrency is not a concern (and document this assumption)

configuration/config.json (1)

18-19: LGTM!

The new configuration flags hostMonitoringEnabled and standaloneMonitoringEnabled are correctly added with boolean false defaults, matching the corresponding Go struct fields.

pkg/config/config.go (2)

69-70: LGTM!

The new configuration fields are properly declared with correct mapstructure tags that match the JSON configuration keys.

184-185: Config flags are properly utilized in the codebase.

Both HostMonitoringEnabled and StandaloneMonitoringEnabled are consumed as expected:

HostMonitoringEnabled gates behavior in pkg/containerwatcher/v2/container_watcher_collection.go:110

Both flags are accessed in pkg/rulemanager/rule_manager.go:287-289

The defaults are correctly set and the flags are actively used to control behavior.

cmd/main.go (2)

32-32: LGTM!

The contextdetection package import is correctly added to support the new mount namespace registry functionality.

294-297: LGTM! Proper integration of the mount namespace registry.

The mntnsRegistry is correctly instantiated within the runtime detection block (guarded by cfg.EnableRuntimeDetection) and properly passed to CreateRuleManager, where it is immediately used to initialize the DetectorManager. This ensures the registry is only created when runtime detection is enabled, while the else branch correctly creates mocks instead.

pkg/ebpf/events/enriched_event.go (1)

7-7: Context metadata fields are well-integrated.

The new SourceContext and MountNamespaceID fields align with the enrichment flow, and their zero values are safe until populated.

Also applies to: 22-35

pkg/rulemanager/containercallbacks.go (2)

66-84: Context detection + registry registration looks solid.

The fallback to standalone and conditional registry registration keep enrichment resilient without blocking container tracking.

100-102: Registry cleanup on removal is a good safeguard.

Unregistering the mount namespace avoids stale context mappings after container teardown.

pkg/rulemanager/rulepolicy.go (1)

37-58: Context tag filtering logic is clear and backward compatible.

Defaulting to Kubernetes when no context tags are present matches the intended behavior and keeps existing rules safe.

pkg/rulemanager/types/failure.go (1)

27-28: No action required: RuleFailure interface expansion is complete.

GenericRuleFailure is the only concrete implementation of the RuleFailure interface in the codebase, and it already includes both GetSourceContext() (line 172) and SetSourceContext() (line 176) methods. The mock implementations (mockExporter, MockExporter, ExporterMock) in test files only accept RuleFailure as a parameter type in SendRuleAlert methods; they do not implement the interface themselves. No additional updates are needed.

pkg/rulemanager/ruleadapters/creator.go (2)

322-327: Nil context defaults to Kubernetes - LGTM.

The backward-compatible default to Kubernetes context when SourceContext is nil is appropriate for maintaining existing behavior.

114-118: Nil check pattern is consistent with other setters.

The early return pattern for nil triggerEvent aligns with the same pattern in setCloudServices, setBaseRuntimeAlert, and setRuntimeAlertK8sDetails. This addresses the nil pointer dereference concern raised in past reviews.

pkg/contextdetection/detectors/manager.go (1)

39-67: DetectContext implementation is sound.

The method correctly:

Validates nil input with appropriate error

Iterates detectors in priority order

Returns first matching context

Falls back to Standalone with warning log

The design appropriately handles the case where no detector explicitly matches.

pkg/contextdetection/detectors/host.go (2)

30-36: Hostname error silently ignored but has fallback.

If os.Hostname() fails, hostName will be empty, but WorkloadID() (lines 18-23) falls back to "host" when HostName is empty. This is acceptable behavior, though you may want to log the error for debugging.

38-53: Host detection logic correctly identifies virtual host container.

The logic properly:

Excludes containers with K8s metadata (lines 43-45)

Identifies host context via ContainerPid() == 1 (line 47)

This addresses the past review concern about misclassifying standalone containers as host context, since standalone containers will have regular PIDs, not PID 1.

pkg/contextdetection/detectors/standalone.go (1)

35-46: Standalone detector correctly acts as catch-all.

The Detect method unconditionally returns true for non-nil containers, which is the intended behavior as the fallback detector after K8s and Host detectors fail to match. This design is consistent with the DetectorManager fallback logic.

pkg/rulemanager/rule_manager.go (4)

144-165: Context determination logic is sound.

The logic correctly determines the context type and retrieves workload details:

K8s context: uses pod/namespace to look up WLID

Host/Standalone: uses SourceContext.WorkloadID()

The nil check at line 158 is defensive but technically redundant since lines 156 branch on !isK8sContext which already implies SourceContext != nil (from line 144).

259-276: Context enrichment logic is correct.

The enrichEventWithContext method properly:

Extracts mount namespace ID via type assertion

Sets MountNamespaceID on the event

Looks up context from registry and populates SourceContext

Includes appropriate debug logging

278-296: Monitoring config check handles all contexts correctly.

The isMonitoringEnabledForContext method properly:

Defaults to true for nil context (backward compatible)

Checks HostMonitoringEnabled for Host context

Checks StandaloneMonitoringEnabled for Standalone context

Always enables Kubernetes monitoring

The default case returning true (line 294) is a safe fallback for any future context types.

197-205: Context-aware profile dependency check is appropriate.

The logic correctly skips profile dependency checks for non-K8s contexts since application profiles are K8s-specific. The condition isK8sContext && !profileExists && rule.ProfileDependency == armotypes.Required properly gates this check.

pkg/utils/datasource_event.go (5)

3-21: No review notes for the import additions.

173-187: Nice guard for missing fields in the accessor cache.

This prevents nil dereference when a field is absent while preserving caching behavior.

256-265: LGTM for accessor-based comm lookup.

430-437: Good fallback to fname when fpath is empty.

576-590: Accessor-based PID lookup looks good.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-16T07:18:53Z

docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md

+	Kubernetes EventSourceContext = "kubernetes"
+	Host       EventSourceContext = "host"
+	Standalone EventSourceContext = "standalone"


⚠️ Potential issue | 🟡 Minor

Fix markdownlint MD010 hard tabs in code blocks.

Markdownlint flags hard tabs in the Go snippets. Replace tabs with spaces to keep doc linting clean.

🧹 Example fix

- Kubernetes EventSourceContext = "kubernetes" + Kubernetes EventSourceContext = "kubernetes"

Also applies to: 44-45, 59-61, 71-78

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

29-29: Hard tabs
Column: 1

(MD010, no-hard-tabs)

30-30: Hard tabs
Column: 1

(MD010, no-hard-tabs)

31-31: Hard tabs
Column: 1

(MD010, no-hard-tabs)

🤖 Prompt for AI Agents

In `@docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md` around lines 29 - 31, The Go code blocks contain hard tab characters (e.g., lines defining Kubernetes/Host/Standalone as EventSourceContext and other blocks at the ranges noted); replace all hard tabs inside those fenced code blocks with spaces so markdownlint MD010 is satisfied. Locate the Go snippets that define EventSourceContext constants and similar blocks (symbols like EventSourceContext, Kubernetes, Host, Standalone and other constant/grouped lines at the noted ranges) and convert indentation and separators to use spaces instead of \t throughout each code fence.

coderabbitai · 2026-01-16T07:18:53Z

pkg/containerwatcher/v2/container_watcher_collection.go

+	// Create virtual host container if host monitoring enabled
+	if cw.cfg.HostMonitoringEnabled {
+		virtualHostContainer, err := GetHostAsContainer()
+		if err != nil {
+			logger.L().Warning("ContainerManager - failed to create virtual host container",
+				helpers.Error(err))
+		} else {
+			cw.containerCollection.AddContainer(virtualHostContainer)
+
+			// Manually trigger callbacks to ensure context detection runs
+			cw.containerCallback(containercollection.PubSubEvent{
+				Type:      containercollection.EventTypeAddContainer,
+				Container: virtualHostContainer,
+			})
+
+			logger.L().Info("ContainerManager - virtual host container created",
+				helpers.String("mntns", fmt.Sprintf("%d", virtualHostContainer.Mntns)),
+				helpers.String("pid", fmt.Sprintf("%d", virtualHostContainer.Runtime.ContainerPID)))
+		}
+	}


⚠️ Potential issue | 🟠 Major

Unchecked error from AddContainer and silent degradation.

Two issues:

Line 116: The return value from cw.containerCollection.AddContainer(virtualHostContainer) is not checked. If adding the container fails, the code proceeds to trigger callbacks for a container that wasn't actually added.

As noted in a previous review, when host monitoring is explicitly enabled via configuration but fails, only a warning is logged. This silent degradation may leave operators unaware that their requested feature is not functioning.

🔧 Proposed fix

// Create virtual host container if host monitoring enabled if cw.cfg.HostMonitoringEnabled { virtualHostContainer, err := GetHostAsContainer() if err != nil { - logger.L().Warning("ContainerManager - failed to create virtual host container", + logger.L().Error("ContainerManager - failed to create virtual host container", helpers.Error(err)) - } else { - cw.containerCollection.AddContainer(virtualHostContainer) + return fmt.Errorf("creating virtual host container: %w", err) + } + + if err := cw.containerCollection.AddContainer(virtualHostContainer); err != nil { + logger.L().Error("ContainerManager - failed to add virtual host container to collection", + helpers.Error(err)) + return fmt.Errorf("adding virtual host container to collection: %w", err) + } - // Manually trigger callbacks to ensure context detection runs - cw.containerCallback(containercollection.PubSubEvent{ - Type: containercollection.EventTypeAddContainer, - Container: virtualHostContainer, - }) + // Manually trigger callbacks to ensure context detection runs + cw.containerCallback(containercollection.PubSubEvent{ + Type: containercollection.EventTypeAddContainer, + Container: virtualHostContainer, + }) - logger.L().Info("ContainerManager - virtual host container created", - helpers.String("mntns", fmt.Sprintf("%d", virtualHostContainer.Mntns)), - helpers.String("pid", fmt.Sprintf("%d", virtualHostContainer.Runtime.ContainerPID))) - } + logger.L().Info("ContainerManager - virtual host container created", + helpers.String("mntns", fmt.Sprintf("%d", virtualHostContainer.Mntns)), + helpers.String("pid", fmt.Sprintf("%d", virtualHostContainer.Runtime.ContainerPID))) }

🤖 Prompt for AI Agents

In `@pkg/containerwatcher/v2/container_watcher_collection.go` around lines 109 - 128, The AddContainer return value is not checked: after calling cw.containerCollection.AddContainer(virtualHostContainer) you must handle failure by checking its error, logging an explicit error (use logger.L().Error with helpers.Error(err) and include virtualHostContainer identifiers like Mntns and Runtime.ContainerPID), and not invoking cw.containerCallback or logging the success Info when AddContainer failed; if the enclosing function returns an error propagate/return that error, otherwise disable/skipping host-monitoring steps (e.g., clear cw.cfg.HostMonitoringEnabled or set a failure flag) so the system does not silently degrade. Ensure you modify the block around GetHostAsContainer, cw.containerCollection.AddContainer, cw.containerCallback and the logger calls accordingly.

# Conflicts: # pkg/rulemanager/types/failure.go

Update SourceContext values to use consistent terminology: - "kubernetes" for Kubernetes context - "host" for Host context - "standalone" for Standalone/Docker container context Aligns with contextdetection.Standalone enum naming and improves clarity that this context supports any container runtime, not just Docker.

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

# Conflicts: # pkg/config/config.go

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

…ategy and timeout settings Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@pkg/cloudmetadata/metadata.go`:
- Around line 81-89: The current return always wraps only the original IMDS
error (err) even when fallbacks fail for other reasons; update the loop over
fallbacks (the fallbacks slice and fb.fetch calls) to capture the last non-nil
fallback error (e.g., lastFallbackErr) or aggregate errors as you iterate, then
when returning use fmt.Errorf to include both the IMDS error and the fallback
failure context (e.g., "failed to get cloud metadata from IMDS or fallbacks:
imds: %v; fallback: %w" or similar), so the final error preserves the IMDS error
plus the final fallback error from fb.fetch for easier troubleshooting.
- Around line 93-96: The http.Client in fetchHTTPMetadata currently uses the
default transport which may proxy metadata requests; create and assign a custom
transport with Proxy set to nil (e.g., transport := &http.Transport{Proxy: nil})
and set client.Transport = transport so metadata traffic bypasses any
HTTP_PROXY/NO_PROXY environment settings; keep the existing Timeout on the
http.Client and preserve any other transport fields if needed.

In `@pkg/utils/datasource_event.go`:
- Line 94: The nullFieldAccessor.Set method currently returns nil which hides
attempts to write missing fields; change the implementation of
nullFieldAccessor.Set to return the same sentinel error used by the Put* methods
(errFieldNotFound) so callers receive a consistent error when writing to a
non-existent field; update the Set function (symbol: nullFieldAccessor.Set) to
return errFieldNotFound instead of nil so behavior matches the Put* methods.

♻️ Duplicate comments (3)

pkg/utils/datasource_event.go (1)

430-442: Previous nil-check concern is now addressed.

The fallback to fname when fpath is empty is now safe with the missingFieldAccessor implementation. Both accessor calls will return empty strings (rather than panicking) if the fields don't exist, and the fallback logic correctly tries fname as an alternative.

pkg/contextdetection/registry.go (1)

31-34: Unused hostMntns field.

The hostMntns field is declared but never used. The comment on line 40 references a SetHostMntns method that doesn't exist. Either implement the intended functionality or remove the unused field.

docs/RULE_ENGINE_MULTI_CONTEXT_REDESIGN.md (1)

29-31: Hard tabs in code blocks trigger markdownlint MD010.

The Go code snippets use hard tabs for indentation which violates markdownlint rules. Replace tabs with spaces for consistency.

🧹 Nitpick comments (8)

pkg/containerwatcher/v2/tracers/hardlink.go (1)
71-74: LGTM!

Consistent with the PR-wide pattern for local manager support.

Consider extracting shared params to reduce duplication.

The same params map is duplicated across all tracers. You could define a package-level constant or helper function to reduce repetition:
// In a shared location (e.g., tracers/common.go)
var defaultGadgetParams = map[string]string{
    "operator.LocalManager.host": "true",
}
This is a minor improvement and can be deferred if preferred.
pkg/containerwatcher/v2/containercallback.go (1)
63-67: Consider adding symmetric handling in RemoveContainer path.

The early return for ContainerPID == 1 correctly skips K8s-dependent shared data setup for the virtual host container. However, the RemoveContainer path at Line 78 will still call DeleteSharedContainerData for virtual host containers, attempting to delete data that was never set.

While this is likely benign (a no-op delete), adding a symmetric check would improve clarity and consistency:
♻️ Suggested improvement
 	case containercollection.EventTypeRemoveContainer:
 		logger.L().Debug("ContainerWatcher.containerCallback - remove container event received",
 			helpers.String("container ID", notif.Container.Runtime.ContainerID),
 			helpers.String("k8s workload", k8sContainerID),
 			helpers.String("ContainerImageDigest", notif.Container.Runtime.ContainerImageDigest),
 			helpers.String("ContainerImageName", notif.Container.Runtime.ContainerImageName))
 		cw.metrics.ReportContainerStop()
+		// Skip shared data cleanup for virtual host container (was never set)
+		if notif.Container.Runtime.ContainerPID == 1 {
+			return
+		}
 		cw.objectCache.K8sObjectCache().DeleteSharedContainerData(notif.Container.Runtime.ContainerID)
pkg/contextdetection/detectors/host.go (1)
30-36: Consider logging when hostname retrieval fails.

The error from os.Hostname() is silently discarded. While the fallback to empty string with WorkloadID() returning "host" is reasonable, logging the error would aid debugging in environments where hostname resolution fails unexpectedly.
♻️ Suggested improvement
 func NewHostDetector() *HostDetector {
-	hostName, _ := os.Hostname()
+	hostName, err := os.Hostname()
+	if err != nil {
+		logger.L().Warning("HostDetector - failed to get hostname, using fallback",
+			helpers.Error(err))
+	}
 	return &HostDetector{
 		name:     "HostDetector",
 		hostName: hostName,
 	}
 }
pkg/contextdetection/registry.go (1)
51-58: Potential race condition between Has() and Set().

The check-then-act pattern (Has() followed by Set()) has a TOCTOU (time-of-check to time-of-use) window. Two concurrent goroutines could both pass the Has() check and proceed to Set(), resulting in a missed warning log and potential data overwrite.

Consider using an atomic operation if the SafeMap supports it, or accept that duplicate registrations may occasionally slip through the warning.
♻️ Option: Use SetIfAbsent pattern if available

If maps.SafeMap supports LoadOrStore or similar atomic operations, use that instead:
// Pseudo-code - depends on SafeMap API
if existed := r.entries.SetIfAbsent(mntns, info); existed {
    logger.L().Warning("MntnsRegistry - mount namespace already registered, skipping", ...)
    return nil
}
Otherwise, document that the warning log is best-effort and duplicates may occasionally be accepted.
pkg/contextdetection/detectors/standalone.go (1)
17-25: WorkloadID logic is functionally correct but could be cleaner.

The current logic works but has some redundancy. After the check on line 18, ContainerID is guaranteed non-empty, so the condition on line 21 (ContainerName != "") could be the primary check.
♻️ Optional simplification
 func (s *StandaloneContextInfo) WorkloadID() string {
-	if s.ContainerID == "" {
-		return s.ContainerName
-	}
-	if s.ContainerName != "" {
-		return s.ContainerName + ":" + s.ContainerID
+	if s.ContainerName == "" {
+		return s.ContainerID
 	}
-	return s.ContainerID
+	if s.ContainerID == "" {
+		return s.ContainerName
+	}
+	return s.ContainerName + ":" + s.ContainerID
 }
pkg/contextdetection/detectors/manager.go (2)

16-19: Unused registry field in DetectorManager.

The registry field is stored but never used within DetectorManager. The actual registration is handled by the caller in containercallbacks.go. Consider removing this field if it's not needed, or document why it's stored for future use.

59-66: Unreachable fallback code given current detector implementations.

Since StandaloneDetector.Detect() always returns true for non-nil containers, the warning and fallback on lines 60-66 are currently unreachable. The for loop will always find a match via the StandaloneDetector.

This defensive code isn't harmful, but the warning message could be misleading during debugging. Consider either:

Removing this fallback since StandaloneDetector guarantees a match, or

Keeping it as a safety net but updating the comment to explain it's a defensive measure

pkg/rulemanager/rule_manager.go (1)

167-174: TODO: Rule filtering for non-K8s contexts remains unimplemented.

For non-K8s contexts, CreateAllRules() loads all rules without context-based filtering. While RuleAppliesToContext (line 198) filters per-rule based on "context:" tags (and defaults most rules to Kubernetes-only), pre-filtering at retrieval time would avoid loading unnecessary rules.

Consider adding a method like CreateRulesByContext(context) to the rule creator interface that filters rules during retrieval. Alternatively, leverage the existing CreateRulesByTags() method to filter by context tags where rules define them. This becomes important as the rule set grows.

coderabbitai · 2026-01-19T08:37:00Z

pkg/cloudmetadata/metadata.go

+	for _, fb := range fallbacks {
+		if meta, ferr := fb.fetch(ctx); ferr == nil && meta != nil {
+			logger.L().Info(fmt.Sprintf("retrieved cloud metadata from %s metadata service", fb.name))
+			return meta, nil
+		}
+	}
+
+	// Wrap the underlying error with additional context so logs make it clearer why metadata is missing.
+	return nil, fmt.Errorf("failed to get cloud metadata from IMDS or fallbacks: %w", err)


⚠️ Potential issue | 🟡 Minor

Preserve fallback failure context in the returned error.

Right now the final error always wraps the original IMDS failure, even if fallbacks fail for different reasons. Capturing the last fallback error (or aggregating) will make troubleshooting much clearer.

🛠️ Proposed fix (keep the IMDS error but include fallback failure)

@@ - for _, fb := range fallbacks { - if meta, ferr := fb.fetch(ctx); ferr == nil && meta != nil { - logger.L().Info(fmt.Sprintf("retrieved cloud metadata from %s metadata service", fb.name)) - return meta, nil - } - } - - // Wrap the underlying error with additional context so logs make it clearer why metadata is missing. - return nil, fmt.Errorf("failed to get cloud metadata from IMDS or fallbacks: %w", err) + var lastErr error + for _, fb := range fallbacks { + if meta, ferr := fb.fetch(ctx); ferr == nil && meta != nil { + logger.L().Info(fmt.Sprintf("retrieved cloud metadata from %s metadata service", fb.name)) + return meta, nil + } else if ferr != nil { + lastErr = fmt.Errorf("%s: %w", fb.name, ferr) + } + } + + if lastErr != nil { + return nil, fmt.Errorf("failed to get cloud metadata from IMDS (%v) or fallbacks (%w)", err, lastErr) + } + return nil, fmt.Errorf("failed to get cloud metadata from IMDS or fallbacks: %w", err)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for _, fb := range fallbacks {

if meta, ferr := fb.fetch(ctx); ferr == nil && meta != nil {

logger.L().Info(fmt.Sprintf("retrieved cloud metadata from %s metadata service", fb.name))

return meta, nil

}

}

// Wrap the underlying error with additional context so logs make it clearer why metadata is missing.

return nil, fmt.Errorf("failed to get cloud metadata from IMDS or fallbacks: %w", err)

var lastErr error

for _, fb := range fallbacks {

if meta, ferr := fb.fetch(ctx); ferr == nil && meta != nil {

logger.L().Info(fmt.Sprintf("retrieved cloud metadata from %s metadata service", fb.name))

return meta, nil

} else if ferr != nil {

lastErr = fmt.Errorf("%s: %w", fb.name, ferr)

}

}

if lastErr != nil {

return nil, fmt.Errorf("failed to get cloud metadata from IMDS (%v) or fallbacks (%w)", err, lastErr)

}

return nil, fmt.Errorf("failed to get cloud metadata from IMDS or fallbacks: %w", err)

🤖 Prompt for AI Agents

In `@pkg/cloudmetadata/metadata.go` around lines 81 - 89, The current return always wraps only the original IMDS error (err) even when fallbacks fail for other reasons; update the loop over fallbacks (the fallbacks slice and fb.fetch calls) to capture the last non-nil fallback error (e.g., lastFallbackErr) or aggregate errors as you iterate, then when returning use fmt.Errorf to include both the IMDS error and the fallback failure context (e.g., "failed to get cloud metadata from IMDS or fallbacks: imds: %v; fallback: %w" or similar), so the final error preserves the IMDS error plus the final fallback error from fb.fetch for easier troubleshooting.

coderabbitai · 2026-01-19T08:37:00Z

pkg/cloudmetadata/metadata.go

+func fetchHTTPMetadata(ctx context.Context, url string, headers map[string]string) (string, error) {
+	client := &http.Client{
+		Timeout: metadataTimeout,
+	}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Read the metadata.go file to understand the full context cat -n pkg/cloudmetadata/metadata.go | head -120

Repository: kubescape/node-agent

Length of output: 4780

🌐 Web query:

Go http.Client default proxy behavior HTTP_PROXY environment variable

💡 Result:

Short answer

The default http.Client uses http.DefaultTransport, whose Transport.Proxy is set to http.ProxyFromEnvironment — so by default Go will use HTTP_PROXY / HTTPS_PROXY and NO_PROXY (also lowercase) from the environment. [1][2]

Key details

Proxy selection: requests use the proxy var matching the request scheme (HTTPS_PROXY takes precedence for https); NO_PROXY excludes hosts. [1][2]

Special cases: ProxyFromEnvironment treats localhost/loopback as non-proxied by default. [2]

Security: do not trust HTTP_PROXY in CGI environments without mitigation (the httpoxy issue — incoming Proxy header can set HTTP_PROXY). Sanitize environment in CGI contexts. [3]

If you want code to rely on environment proxies explicitly, set Transport.Proxy = http.ProxyFromEnvironment on your client’s Transport. [2]

Sources

net/http documentation — ProxyFromEnvironment. [1]

net/http DefaultTransport (Proxy: ProxyFromEnvironment) and golang.org/x/net/http/httpproxy docs. [2]

httpoxy / Go vulnerability guidance (GO-2022-0761). [3]

🌐 Web query:

Go http.Transport Proxy nil disable proxies metadata

💡 Result:

Yes — setting Transport.Proxy to nil disables proxies. By default net/http.Transport.Proxy is http.ProxyFromEnvironment; if Proxy is nil or returns a nil *url.URL, no proxy is used. Example:

Default (uses environment):
tr := &http.Transport{} // ProxyFromEnvironment is used by default when nil on a default Client

Explicitly disable proxies:
tr := &http.Transport{ Proxy: nil }

Or explicitly return nil:
tr := &http.Transport{ Proxy: func(*http.Request) (*url.URL, error) { return nil, nil } }

References: net/http docs and Transport source. [1][2]

Disable proxies for metadata traffic.

http.Client without a custom transport defaults to http.ProxyFromEnvironment, which respects HTTP_PROXY and NO_PROXY environment variables. Metadata service endpoints (169.254.169.254 for AWS, 169.254.170.2 for Azure, etc.) are not automatically in NO_PROXY by default. If a proxy is configured without explicitly excluding metadata hosts, metadata traffic—including credentials—may be proxied through an untrusted proxy, risking credential leakage or service failure. Set transport.Proxy = nil to disable proxying for these requests.

🔒 Proposed fix (disable proxies for metadata calls)

@@ - client := &http.Client{ - Timeout: metadataTimeout, - } + transport := http.DefaultTransport.(*http.Transport).Clone() + transport.Proxy = nil + client := &http.Client{ + Timeout: metadataTimeout, + Transport: transport, + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func fetchHTTPMetadata(ctx context.Context, url string, headers map[string]string) (string, error) {

client := &http.Client{

Timeout: metadataTimeout,

}

func fetchHTTPMetadata(ctx context.Context, url string, headers map[string]string) (string, error) {

transport := http.DefaultTransport.(*http.Transport).Clone()

transport.Proxy = nil

client := &http.Client{

Timeout: metadataTimeout,

Transport: transport,

}

🤖 Prompt for AI Agents

In `@pkg/cloudmetadata/metadata.go` around lines 93 - 96, The http.Client in fetchHTTPMetadata currently uses the default transport which may proxy metadata requests; create and assign a custom transport with Proxy set to nil (e.g., transport := &http.Transport{Proxy: nil}) and set client.Transport = transport so metadata traffic bypasses any HTTP_PROXY/NO_PROXY environment settings; keep the existing Timeout on the http.Client and preserve any other transport fields if needed.

coderabbitai · 2026-01-19T08:37:00Z

pkg/utils/datasource_event.go

+func (n *nullFieldAccessor) FullName() string                             { return "" }
+func (n *nullFieldAccessor) Size() uint32                                 { return 0 }
+func (n *nullFieldAccessor) Get(data datasource.Data) []byte              { return nil }
+func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return nil }


⚠️ Potential issue | 🟡 Minor

Inconsistent error return in Set method.

The Set method returns nil while all Put* methods (lines 157-169) return errFieldNotFound. This inconsistency could mask errors when callers attempt to write to a missing field.

Proposed fix

-func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return nil } +func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return errFieldNotFound }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return nil }

func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return errFieldNotFound }

🤖 Prompt for AI Agents

In `@pkg/utils/datasource_event.go` at line 94, The nullFieldAccessor.Set method currently returns nil which hides attempts to write missing fields; change the implementation of nullFieldAccessor.Set to return the same sentinel error used by the Put* methods (errFieldNotFound) so callers receive a consistent error when writing to a non-existent field; update the Set function (symbol: nullFieldAccessor.Set) to return errFieldNotFound instead of nil so behavior matches the Put* methods.

matthyx

well done @YakirOren

matthyx mentioned this pull request Jan 15, 2026

Fixes needed for host-agent #682

Closed

matthyx requested a review from Copilot January 15, 2026 09:30

Copilot started reviewing on behalf of matthyx January 15, 2026 09:30 View session

Copilot AI reviewed Jan 15, 2026

View reviewed changes

matthyx requested a review from Copilot January 15, 2026 11:30

Copilot started reviewing on behalf of matthyx January 15, 2026 11:31 View session

Copilot AI reviewed Jan 15, 2026

View reviewed changes

matthyx mentioned this pull request Jan 15, 2026

Rule Engine Multi-Context Redesign #683

Closed

matthyx requested a review from Copilot January 15, 2026 14:54

Copilot started reviewing on behalf of matthyx January 15, 2026 14:55 View session

Copilot AI reviewed Jan 15, 2026

View reviewed changes

matthyx force-pushed the feature/rule-engine-redesign branch from 308f941 to 20ace02 Compare January 16, 2026 07:11

coderabbitai bot reviewed Jan 16, 2026

View reviewed changes

matthyx added this to KS PRs tracking Jan 16, 2026

YakirOren and others added 15 commits January 19, 2026 09:26

rule engine redesign

02ba92f

support both ExecutionContexts field and context tags

90ddcd0

move context conversion logic into RuleAppliesToContext

30aac39

remove redundant comments

73a680e

alert structure

956b837

# Conflicts: # pkg/rulemanager/types/failure.go

use eventsourcecontext type

12892ae

use context type

2a2cea2

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

add configuration flags for host and standalone monitoring

473d1f0

# Conflicts: # pkg/config/config.go

rename func

d45021f

create Registry interface for mount namespace management

3bcaf9d

add virtual host container

17f034a

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

add changes from #682

bd1fc95

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

implement review comments

248d952

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

refactor: enhance cloud metadata retrieval with improved fallback str…

8137200

…ategy and timeout settings Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

YakirOren and others added 2 commits January 19, 2026 09:28

change to use the Registry interface

3787d23

Signed-off-by: Yakir Oren <yakiroren@gmail.com>

add design document for multi-context rule engine

f483f66

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

matthyx force-pushed the feature/rule-engine-redesign branch from 20ace02 to f483f66 Compare January 19, 2026 08:28

matthyx added the release Create release label Jan 19, 2026

coderabbitai bot reviewed Jan 19, 2026

View reviewed changes

matthyx approved these changes Jan 19, 2026

View reviewed changes

matthyx merged commit 66ead06 into main Jan 19, 2026
26 of 27 checks passed

matthyx deleted the feature/rule-engine-redesign branch January 19, 2026 08:58

matthyx moved this to To Archive in KS PRs tracking Jan 19, 2026

	func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return nil }
	func (n *nullFieldAccessor) Set(data datasource.Data, value []byte) error { return errFieldNotFound }

Feature/rule engine redesign #685

Feature/rule engine redesign #685

Uh oh!

Conversation

YakirOren commented Jan 14, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 19, 2026

Choose a reason for hiding this comment

Uh oh!

matthyx left a comment

Choose a reason for hiding this comment

YakirOren commented Jan 14, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 16, 2026 •

edited

Loading