Skip to content

feat: Enable sqs -> lambda support for DSM #604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

michael-zhao459
Copy link
Collaborator

@michael-zhao459 michael-zhao459 commented May 29, 2025

What does this PR do?

Allow DSM to support sqs -> lambda

Motivation

Lambda support requested by users of DSM

Testing Guidelines

Wrote a unit test inside the test_wrapper.py code that ensures context is properly being set
throughout the entire pipeline

Additional Notes

In the test I wrote a patched version of get_datastreams_context, see DataDog/dd-trace-py#13526 for the change, forced to write patched function as no new release of tracer code yet. Will remove once there is a release of tracer code.

Types of Changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)

@michael-zhao459 michael-zhao459 marked this pull request as ready for review May 30, 2025 18:14
@michael-zhao459 michael-zhao459 requested review from a team as code owners May 30, 2025 18:14
Copy link
Contributor

@purple4reina purple4reina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-zhao459 the way your code works looks just fine. But as written right now, it's pretty inefficient, recalculates work already done, and needs some refactoring. Here's what I'm thinking.

We have a feature already implemented called "inferred spans", which similarly looks at the event type and does some stuff with the event payload based on the type. I'm thinking we'll want to create a mechanism similar to this.

Take a look in datadog_lambda/tracing.py at the create_inferred_span method and where it's called in datadog_lambda/wrapper.py. I'm thinking we'll want to add a method called process_dsm (or something like that) and call it from about the same spot we call create_inferred_span.

I can walk you through all of this during our pairing time.

@@ -563,6 +563,204 @@ def return_type_test(event, context):
self.assertEqual(result, test_result)
self.assertFalse(MockPrintExc.called)

@patch.dict(os.environ, {"DD_DATA_STREAMS_ENABLED": "true"})
def test_datadog_lambda_wrapper_dsm_sqs_context_pathway_verification(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can vastly simplify these tests. This might warrant another pairing session, but I'll let you take a stab at it on your own first. Feel free to schedule something with me if you wanna go over it together.

There are two different files which you changed, and therefore two different test files that will need updating: tests/test_wrapper.py and a new file tests/test_dsm.py.

test_wrapper.py

For the test_wrapper.py file, we simply need to test that, based on the env vars DD_DATA_STREAMS_ENABLED and DD_TRACE_ENABLED we either do or do not call set_dsm_context with the proper args. We'll push all of the verification of what happens inside of set_dsm_context to the test_dsm.py file.

I'm a bit fan of pytest, which isn't yet imported and used in this file, which allows you to reuse the same code over and over again to create "parametrized" tests. You can accomplish this same thing using unittest (as this file already uses), though it would mean creating 4 different test methods.

# test_wrapper.py

import pytest

_test_set_dsm_context = (
		("true", "true", True),
		("true", "false", False),
		("false", "true", False),
		("false", "false", False),
)

@pytest.mark.parametrize("trace_enabled,dsm_enabled,should_call", _test_set_dsm_context)
def test_set_dsm_context(trace_enabled, dsm_enabled, should_call, monkeypatch):

	# use monkeypatch to set env vars DD_TRACE_ENABLED and DD_DATA_STREAMS_ENABLED
	
	# use monkeypatch to create a mock for `set_dsm_context`, you can also use mock.patch

	@wrapper.datadog_lambda_wrapper
	def lambda_handler(event, context):
		return "ok"

	result = lambda_handler(sqs_event, get_mock_context())
	assert result == "ok"

	if should_call: 
		# not sure of the api here, so this is just made up
		assert set_dsm_context_patch.called_with == (sqs_event, EventSource(EventSourceType.SQS))
	else:
		# again, not sure about api
		assert set_dsm_context.not_called

test_dsm.py

In the test_dsm.py file, this is where you'll assert to make sure that the set_dsm_context works as expected. You'll want to include several tests:

  1. Sending an event source of anything other than SQS will do nothing
  2. Sending an event with no Records will do nothing
  3. For each Record in the event, dsm does the setting of context as expected

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme take a stab at these today, I will definitely schedule a meeting if I get stuck on any of these parts

@michael-zhao459 michael-zhao459 force-pushed the michael.zhao/lambda-support branch from 81c76b6 to dda744e Compare June 4, 2025 18:09
self.mock_set_dsm_context.assert_not_called()

del os.environ["DD_DATA_STREAMS_ENABLED"]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look great! Super easy to read and follow.


def test_non_sqs_event_source_does_nothing(self):
"""Test that non-SQS event sources don't trigger DSM context setting"""
event = {"Records": [{"body": "test"}]}
Copy link
Contributor

@purple4reina purple4reina Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Just to make this a tad less confusing, this event object sure looks like an sqs event to me. The test of course passes bc we look at the event source, not the event for its type. That said, can we make this event something like {}, just to make sure that it also looks not sqs?


for event in events_with_no_records:
_dsm_set_sqs_context(event)
self.mock_data_streams_processor.assert_not_called()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice.

}

mock_event_source = MagicMock()
mock_event_source.equals.return_value = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking toward the future, where you'll be adding support for sns and kinesis, etc. Once those are in place, you're not gonna want event_source.equals(whatever) to always return True.

Basically, this code works just great right now. But your future self is gonna have to change it, so why not make it work right the first time.

Instead, it's pretty easy to create a workable event source object. Instead of a mock, what about something like:

event_source = _EventSource(EventTypes.SQS)

set_dsm_context(sqs_event, event_source)

Make that change to remove all the mock_event_source objects in all these tests.

@michael-zhao459
Copy link
Collaborator Author

/merge

@dd-devflow
Copy link

dd-devflow bot commented Jun 5, 2025

View all feedbacks in Devflow UI.

2025-06-05 18:14:53 UTC ℹ️ Start processing command /merge


2025-06-05 18:15:03 UTC ℹ️ MergeQueue: waiting for PR to be ready

This merge request is not mergeable yet, because of pending checks/missing approvals. It will be added to the queue as soon as checks pass and/or get approvals.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2025-06-05 18:21:26 UTC ⚠️ MergeQueue: This merge request was unqueued

[email protected] unqueued this merge request

@michael-zhao459
Copy link
Collaborator Author

/merge -c

@dd-devflow
Copy link

dd-devflow bot commented Jun 5, 2025

View all feedbacks in Devflow UI.

2025-06-05 18:21:19 UTC ℹ️ Start processing command /merge -c

@purple4reina
Copy link
Contributor

Running integration tests in this PR: #608

@purple4reina
Copy link
Contributor

All tests passed on the other PR. Merging.

@purple4reina purple4reina merged commit 6beb65d into DataDog:main Jun 5, 2025
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants