
fix: call patch_all before importing handler code. #598

Open · wants to merge 1 commit into main
Conversation

purple4reina (Contributor)

What does this PR do?

Move the call to ddtrace.patch_all so that it is always made before the handler is imported.

Motivation

Customer reported issue (see https://datadoghq.atlassian.net/browse/SLES-2262) of not seeing spans or distributed tracing from their confluent_kafka calls. Here is a highly simplified version of their lambda handler:

from confluent_kafka import Producer

producer = Producer({'bootstrap.servers': 'mybroker1,mybroker2'})

def handle(event, context):
    producer.produce('mytopic', 'hello world!')
    producer.flush()

datadog-lambda calls ddtrace.patch_all() after the customer handler is imported. To see this, look at handler.py and wrapper.py. Here, patch_all() is currently called when initializing the DatadogWrapper in wrapper.py, which only happens after all customer code has been imported in handler.py. To demonstrate, here's a commented, abridged version of our handler.py file:

from importlib import import_module

import os
from time import time_ns

from datadog_lambda.tracing import emit_telemetry_on_exception_outside_of_handler
from datadog_lambda.wrapper import datadog_lambda_wrapper			# <--- wrapper imported
from datadog_lambda.module_name import modify_module_name

... other stuff ...

try:
    handler_load_start_time_ns = time_ns()
    handler_module = import_module(modified_mod_name)				# <--- handler cold imported
    handler_func = getattr(handler_module, handler_name)
except Exception as e:
    emit_telemetry_on_exception_outside_of_handler(
        e,
        modified_mod_name,
        handler_load_start_time_ns,
    )
    raise

handler = datadog_lambda_wrapper(handler_func)						# <--- patch_all called

Calling patch_all() after the handler code is imported means the producer never has any instrumentation applied. We can see this by inspecting the producer's type.

print(producer.__class__.__name__)  # prints "Producer" but should be "TracedProducer"

💭 So wait a minute 💭, why is this only a problem now? This call to patch_all() was added over 5 years ago, so why has no one reported this until now!?

This has to do with how ddtrace does its patching: the effect is specific to each patched contrib module and to how the customer uses it.

Interestingly, if you inspect the producer type in a different way, you see a different result:

import confluent_kafka
from confluent_kafka import Producer

print(confluent_kafka.Producer.__name__)  # prints "TracedProducer"
print(Producer.__name__)                  # prints "Producer"

Why is this? Because confluent_kafka.Producer looks the class up on the module at access time, and patching replaced that module attribute with the traced class; the name Producer, however, was bound to the original, un-traced class when it was imported, before patching ran.
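
To make the mechanism concrete, here is a minimal, runnable sketch (using a stand-in module rather than ddtrace or confluent_kafka, so all names here are illustrative) of why a name bound at import time is unaffected by later patching of the module attribute:

import types

# Stand-in for confluent_kafka before any patching happens.
fake_kafka = types.ModuleType("fake_kafka")

class Producer:
    pass

fake_kafka.Producer = Producer

# Equivalent of "from fake_kafka import Producer": binds a local name to the
# original class object.
Producer_local = fake_kafka.Producer

# Later, "patching" swaps the attribute on the module, much like patch_all() does.
class TracedProducer(Producer):
    pass

fake_kafka.Producer = TracedProducer

print(fake_kafka.Producer.__name__)  # "TracedProducer" -- attribute lookup sees the patched class
print(Producer_local.__name__)       # "Producer"       -- the earlier binding still points at the original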

Testing Guidelines

Additional Notes

⚠️ Important note ⚠️ This PR has the added consequence that customers will now see spans created by patched module calls made at the global level (i.e., on cold start). Previously these spans were not created at all, because patch_all() hadn't been called until after the handler code was fully imported. For example, this code will now produce a span for the requests HTTP call made at the global level during cold start.

import requests

resp = requests.get('https://example.com')
print(resp.status_code)

def handler(event, context):
    pass

The only problem is (and here's the ⚠️ warning) that these newly created spans will always be orphaned. This is because of the way we manage cold start tracing: during cold start we are unable to determine the trace ID, because we have not yet started the root trace span, nor have we received any inbound distributed tracing headers.

It should be possible to correctly parent these new orphaned spans. However, that is outside the scope of this PR because it will be a difficult and significant undertaking. We can cross that bridge when we get there.

Types of Changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)

purple4reina requested a review from a team as a code owner · May 14, 2025 19:17
@@ -45,6 +45,10 @@
extract_http_status_code_tag,
)

# Patch third-party libraries for tracing, must be done before importing any
# handler code.
patch_all()
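
In effect, here's a hedged sketch (abridged, mirroring the handler.py excerpt above rather than the actual diff) of the resulting import order, assuming patch_all() now runs when datadog_lambda.wrapper is first imported:

# datadog_lambda/wrapper.py (abridged sketch): patch_all() now runs at module import time
from ddtrace import patch_all

patch_all()  # third-party libraries are patched before any customer code is imported

# datadog_lambda/handler.py (abridged sketch, same names as the excerpt above)
from importlib import import_module
from datadog_lambda.wrapper import datadog_lambda_wrapper   # <--- importing the wrapper triggers patch_all
from datadog_lambda.module_name import modify_module_name

handler_module = import_module(modified_mod_name)            # <--- handler imported after patching
handler_func = getattr(handler_module, handler_name)
handler = datadog_lambda_wrapper(handler_func)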
Collaborator

We are planning to deprecate ddtrace.patch_all(...) and instead encourage folks to use import ddtrace.auto (which executes a similar set of operations as ddtrace-run).

I don't think we can make this change in this PR due to the overhead of loading all ddtrace products and entrypoints, but it's something we should investigate.
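
For reference, the suggested alternative is a single import whose side effect enables automatic instrumentation (a sketch of the suggestion only; it is not adopted in this PR because of the overhead noted above):

# Hedged sketch: replaces the explicit patch_all() call with ddtrace's auto-instrumentation import.
import ddtrace.auto  # noqa: F401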

Contributor Author

Thanks Munir. It's definitely something on our radar. Last I benchmarked the difference between patch_all and ddtrace.auto, it was significant.
