Description
Context
We have a small number of Lambda Functions whose purpose is to instrument third-party applications, sending metrics to Datadog on their behalf. Some of these applications do not have DD integrations, while others do not support custom metrics.
These Functions are tagged in AWS with their own metadata, e.g. team, business_unit, etc. They are not intended to emit that metadata as part of the metrics they produce -- e.g. a Function instrumenting ServiceX would be expected to produce metrics whose tags correspond only to ServiceX.
In practice, we see that any AWS Tags on a Function show up as tags on timestamped custom metrics emitted via the Datadog Lambda Layer's extension_thread_stats, a ThreadStatsWriter instantiated here.
- There IS NO DOCUMENTATION for how this Function Tag collection/injection occurs in the serverless agent
- The DD_EXCLUDE_EC2_TAGS variable does NOT affect the behavior.
- The specifics of the serverless agent (extension) are unimportant to our use case, as we're emitting timestamped metrics, which follow the codepath for extension_thread_stats linked above. (Details of the emit/flush code paths happen to be in "Lossiness submitting timestamped custom metrics" #514.)
Expected Behavior
- MyFunction has AWS Tags foo=bar and baz=quux
- MyFunction emits custom metrics via the Datadog Lambda Layer's lambda_metric
- Function Tags are added to my metric tags (in a documented and configurable manner)
- Function Tags can be overridden with custom tags
# expected tags: ["foo:bar", "baz:custom_val"]
lambda_metric(
    "metric.key",
    1.0,
    tags=["baz:custom_val"],
    timestamp=int(time.time()),
)
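The override semantics described above can be sketched in plain Python. This is only an illustration of the behavior being requested, not the Layer's or the extension's actual implementation; merge_tags is a hypothetical helper that does not exist in datadog_lambda.

```python
def merge_tags(function_tags, custom_tags):
    """Merge "key:value" tag lists so that custom tags win on key conflicts.

    Hypothetical helper illustrating the expected behavior; it is not part
    of the Datadog Lambda Layer.
    """
    merged = {}
    for tag in list(function_tags) + list(custom_tags):
        key = tag.partition(":")[0]
        # Later entries (the custom tags) replace earlier ones with the same key.
        merged[key] = tag
    return list(merged.values())


function_tags = ["foo:bar", "baz:quux"]  # AWS Tags on MyFunction
custom_tags = ["baz:custom_val"]         # tags passed to lambda_metric
print(merge_tags(function_tags, custom_tags))
```

With the example tags from this section, the merged list keeps foo:bar from the Function and takes baz:custom_val from the code.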
Actual Behavior
- MyFunction has AWS Tags foo=bar and baz=quux
- MyFunction emits custom metrics, specifying a tag baz=custom_val as above
- The emitted metrics will have multiple tag values: baz=custom_val,quux
- The emitted metric points can be queried via either value of the tag
A detailed example follows in the reproduction below.
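The observed result is consistent with the Function Tags being concatenated with the custom tags rather than merged by key. That is an assumption about the cause, not something confirmed from the agent's code. A minimal model of why the metric then becomes queryable by either value (values_by_key is a hypothetical helper):

```python
from collections import defaultdict


def values_by_key(tags):
    """Group "key:value" tags by key, collecting every value seen per key."""
    grouped = defaultdict(set)
    for tag in tags:
        key, _, value = tag.partition(":")
        grouped[key].add(value)
    return {key: sorted(values) for key, values in grouped.items()}


# Assumed shape of the tag list on the emitted point: Function Tags plus custom tags.
observed = ["foo:bar", "baz:quux", "baz:custom_val"]
print(values_by_key(observed))
```

Here baz ends up with both custom_val and quux, matching the multi-valued tag described above.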
Steps to Reproduce the Problem
For the sake of clarity, I've sanitized out the business logic, and used a simple test:
- The Function has an AWS Tag test_tag_key=from_tags
- Note: AWS Tags are universal to a Function, and cannot be scoped to specific FunctionVersions
- The Function code emits a metric, attempting to override test_tag_key with the value from_code
- The Function code also tags its metric points with a short form of the execution context ID.
- This is a terrible idea in production (due to cardinality)
- It's used to clearly illustrate persistence of the problem across multiple Lambda microVMs
- The Function was first executed without specifying test_tag_key, to establish that something injects the Function Tag's value
- Two subsequent runs specified tags=["test_tag_key:from_code", ...]
- A whitespace change to the Function code produced a new execution context (microVM), demonstrating that the problem persists across cold starts
Function Handler (collapsed)
import os
import time

from datadog_lambda.metric import lambda_metric
from datadog_lambda.wrapper import datadog_lambda_wrapper


@datadog_lambda_wrapper
def main(event, context, *args, **kwargs):
    """
    # Layer provides DD Python Libs
    arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Python310:104
    # Extension provides serverless agent (unused, due to timestamps)
    arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:67
    """
    # 2023/01/01/[$LATEST]45efb027ec0049cda3de89c8837f509c -> take first 6 of the execution context UUID
    short_ctx = os.environ.get("AWS_LAMBDA_LOG_STREAM_NAME", "UNSET").rsplit("]", 1)[-1][0:6]
    tag_key = "test_tag_key"

    # ----
    # First run: do not specify "test_tag_key" -- this establishes "baseline" behavior
    # We expect the DD metrics to have the tags ["test_tag_key:from_tags", "ctx:abc123"]
    metric_tags = [f"ctx:{short_ctx}"]

    # Subsequent runs: override "test_tag_key" -- expected tags ["test_tag_key:from_code", "ctx:abc123"]
    # (comment this assignment out to reproduce the first run)
    metric_tags = [f"ctx:{short_ctx}", f"{tag_key}:from_code"]
    # ----

    lambda_metric(
        "metric.key",
        1.0,
        tags=metric_tags,
        timestamp=int(time.time()),
    )
    return short_ctx
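For readers checking the ctx tag values, the short context ID derivation used in the handler can be tried outside Lambda. This sketch only repeats the string slicing from the handler; short_context_id is a hypothetical name, and the sample log stream name is the one from the handler's comment.

```python
def short_context_id(log_stream_name, length=6):
    """Return the first `length` characters after the last "]" in the name."""
    return log_stream_name.rsplit("]", 1)[-1][:length]


# Sample value taken from the comment in the handler above.
print(short_context_id("2023/01/01/[$LATEST]45efb027ec0049cda3de89c8837f509c"))  # 45efb0
# Fallback used by the handler when the environment variable is missing.
print(short_context_id("UNSET"))  # UNSET
```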
Specifications
- Datadog Lambda Layer version: v104 (arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Python310:104)
- Datadog Extension version (unused, due to timestamps): v67 (arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:67)
- Python version: Discovered on Python3.10
Stacktrace
If only.