Enable compressed payloads on ThreadStats object #86


Merged: 1 commit, Mar 17, 2021

Conversation

@csssuf (Contributor) commented Sep 17, 2020

What does this PR do?

Enable submitting compressed payloads to the Datadog API.

Motivation

We use lambdas to scrape metrics which we can't easily collect by other means. These metric lambdas can collect large numbers of metrics quickly, and occasionally submit too many metrics within the same flush interval for the Datadog API to handle. This is similar to the issue in DataDog/datadogpy#465. The solution implemented to fix that issue was compression of payloads submitted to the DD API.
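For context, the size win from compressing a metrics payload can be sketched with the standard library alone. The field names below are illustrative, not the exact Datadog wire format, and the encoding datadogpy actually uses is an implementation detail; this only shows why compression relieves pressure on the API when many series are flushed at once:

```python
import json
import zlib

# Hypothetical payload resembling one large flush interval: many
# series, each with a timestamped point and a small set of tags.
series = [
    {
        "metric": "my.metric",
        "points": [[1600000000 + i, float(i)]],
        "tags": ["tag1:value1", f"host:host-{i % 50}"],
    }
    for i in range(5000)
]
payload = json.dumps({"series": series}).encode("utf-8")

# Deflate-compress the payload; highly repetitive JSON like this
# typically shrinks by a large factor.
compressed = zlib.compress(payload)

print(f"raw: {len(payload)} bytes, compressed: {len(compressed)} bytes")
```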

Testing Guidelines

./scripts/run_tests.sh

Additional Notes

n/a

Types of Changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog
  • This PR passes the integration tests (ask a Datadog member to run the tests)

@csssuf csssuf requested a review from a team as a code owner September 17, 2020 20:56
@agocs agocs changed the base branch from master to main December 17, 2020 02:17
@tianchu (Collaborator) commented Mar 16, 2021

Hi @csssuf, I'm closing the PR due to the lack of activity here. Feel free to reopen if this problem still persists and you would like to move forward. I also want to mention that there is now a new way, using the Lambda extension, to submit metrics directly from your Lambda functions without using the forwarder: https://docs.datadoghq.com/serverless/datadog_lambda_library/extension/.

@tianchu tianchu closed this Mar 16, 2021
@jeid64 commented Mar 16, 2021

Please reopen this PR; it lacks activity because the Datadog maintainers didn't respond. @tianchu, the PR is ready to merge.

@tianchu tianchu reopened this Mar 16, 2021
@tianchu (Collaborator) commented Mar 16, 2021

Thanks for the confirmation that this PR is still relevant! We will prioritize some testing before merging. Sorry for the long wait 🙇

@csssuf csssuf force-pushed the enable-compressed-payloads branch from 38723f1 to 4fe6909 Compare March 16, 2021 17:28
@csssuf csssuf force-pushed the enable-compressed-payloads branch from 4fe6909 to 513172e Compare March 16, 2021 17:29
@tianchu (Collaborator) commented Mar 17, 2021

I was able to reproduce the issue by submitting a very large number of data points within 10s, and enabling compress_payload did fix it. I'm going to merge this PR and release it in the next version. Thanks for your contribution!

However, I do want to point out that although compress_payload helps work around the issue, submitting a very large payload adds noticeable overhead to the function duration. Not sure if this is the case for you, but if you are looping through a huge list and submitting a data point for each item, it would be far more efficient to aggregate those data points in memory and then submit one aggregated value at the end of the loop.

# Assumes lambda_metric from the Datadog Lambda library.
from datadog_lambda.metric import lambda_metric

# Inefficient: submits one data point per loop iteration.
for i in range(1000000):
    lambda_metric(
        metric_name='my.metric',
        value=i + 1,
        tags=['tag1:value1']
    )

# Efficient: aggregate in memory, then submit a single data point.
# (The accumulator is named `total` to avoid shadowing the builtin `sum`.)
total = 0
for i in range(1000000):
    total += i + 1
lambda_metric(
    metric_name='my.metric',
    value=total,
    tags=['tag1:value1']
)

@csssuf (Contributor, Author) commented Mar 17, 2021

🎉 thanks!

> Not sure if this is the case, but if you are looping through a huge list and submitting a data point for each item, it would be way more efficient if you aggregate those data points in the memory and then submit one aggregated value at the end of the loop.

Unfortunately, since we're submitting several distinct metrics, each with a large number of unique tag sets, aggregating that way wouldn't help us: we already submit only a single value per unique (metric, tag set) pair. These Lambdas are also used only for scraping and submitting metrics, so a long runtime isn't a major concern for us.
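When metrics do vary by tag set, the same aggregation idea still applies per (metric, tag set) pair. A minimal sketch under stated assumptions: `flush_aggregated` and the injected `submit` callable are hypothetical names, with `submit` standing in for `lambda_metric` from the comment above so the logic is testable without the Datadog library:

```python
from collections import defaultdict

def flush_aggregated(points, submit):
    """Aggregate raw (metric, value, tags) data points per unique
    (metric, tag set) pair, then submit one value for each pair."""
    totals = defaultdict(float)
    for metric_name, value, tags in points:
        # Sort tags into a tuple so equivalent tag sets hash together.
        totals[(metric_name, tuple(sorted(tags)))] += value
    for (metric_name, tags), total in totals.items():
        submit(metric_name=metric_name, value=total, tags=list(tags))

# Usage with a stub submitter standing in for lambda_metric:
submitted = []
raw_points = [
    ("my.metric", 1.0, ["tag1:value1"]),
    ("my.metric", 2.0, ["tag1:value1"]),
    ("my.metric", 5.0, ["tag1:value2"]),
]
flush_aggregated(raw_points, lambda **kw: submitted.append(kw))
# One submission per unique (metric, tag set) pair, not per raw point.
```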

@tianchu tianchu merged commit 951972a into DataDog:main Mar 17, 2021
@csssuf csssuf deleted the enable-compressed-payloads branch March 17, 2021 16:27