
Idempotency in progress timeout #1281


Closed
scottgerring opened this issue Jul 12, 2023 · 1 comment
Labels: bug, priority:1, triage

Comments

@scottgerring
Contributor

scottgerring commented Jul 12, 2023

Two expirations are stored on the DDB table:

  • expiration - when the cached result of the operation expires. User-tunable (seconds since epoch)
  • in_progress_expiration - when an in-progress record is treated as stale because the Lambda execution timed out. Set by Powertools from the remainingExecutionTime (milliseconds since epoch)
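
To make the two units concrete, here is a minimal sketch of how such timestamps could be computed. The class and method names are hypothetical, and it assumes the in-progress value is the current time plus the remaining execution time; it is not the library's actual code.

```java
import java.time.Duration;
import java.time.Instant;

public class ExpiryTimestamps {
    /** Cache expiry, in seconds since the epoch (hypothetical helper). */
    static long expiration(Instant now, Duration ttl) {
        return now.plus(ttl).getEpochSecond();
    }

    /** In-progress expiry, in milliseconds since the epoch (hypothetical helper). */
    static long inProgressExpiration(Instant now, long remainingTimeMillis) {
        return now.plusMillis(remainingTimeMillis).toEpochMilli();
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // The two values differ by roughly three orders of magnitude.
        System.out.println("expiration (s): " + expiration(now, Duration.ofHours(1)));
        System.out.println("in_progress_expiration (ms): " + inProgressExpiration(now, 30_000));
    }
}
```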

When a request comes in, we try to conditionally upsert the DDB record: if the in_progress_expiration time is before now(), we can overwrite it; otherwise we can't. The bug is here:

new AbstractMap.SimpleEntry<>(":now", AttributeValue.builder().n(String.valueOf(now.getEpochSecond())).build()),

.conditionExpression("attribute_not_exists(#id) OR #expiry < :now OR (attribute_exists(#in_progress_expiry) AND #in_progress_expiry < :now AND #status = :inprogress)")

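We set :now once, in seconds since epoch, but use it in two comparisons: against #expiry, where the units should be seconds, and against #in_progress_expiry, where the units should be milliseconds. A millisecond timestamp is roughly 1000 times larger than the same instant in seconds, so #in_progress_expiry < :now never evaluates to true and a timed-out in-progress record is never overwritten through that clause.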
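
The effect of the mismatch can be reproduced with plain arithmetic. The sketch below mirrors only the in-progress comparison from the condition expression; the class and method names are hypothetical and no DynamoDB call is made.

```java
import java.time.Instant;

public class UnitMismatchDemo {
    /** Mirrors the clause "#in_progress_expiry < :now" as a plain comparison. */
    static boolean inProgressExpired(long inProgressExpiryMillis, long nowValue) {
        return inProgressExpiryMillis < nowValue;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-07-12T00:00:00Z");
        // A record whose in-progress window closed one minute ago, in milliseconds.
        long inProgressExpiry = now.minusSeconds(60).toEpochMilli();

        // Buggy: :now is in seconds, so the stale record is not detected.
        System.out.println(inProgressExpired(inProgressExpiry, now.getEpochSecond())); // false
        // Matching units: the stale record is detected.
        System.out.println(inProgressExpired(inProgressExpiry, now.toEpochMilli()));   // true
    }
}
```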

The upshot of this is that if a handler times out within an idempotent operation, we'll start seeing exceptions thrown for subsequent invocations.

What were you trying to accomplish?

Expected Behavior

Subsequent executions after a function timeout should update the DDB table and complete.

Current Behavior

Possible Solution

  1. Provide nowMs and now in the DDB conditional update
  2. Increase test coverage
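
One possible shape for the first item is sketched below. It uses plain strings and a Map instead of the AWS SDK's AttributeValue so that it runs standalone, and the placeholder name :now_millis is an assumption made for illustration, not necessarily what the released fix uses.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class ConditionalPutSketch {
    // ":now_millis" is a hypothetical placeholder name chosen for this sketch.
    static final String CONDITION =
        "attribute_not_exists(#id) OR #expiry < :now OR "
        + "(attribute_exists(#in_progress_expiry) AND #in_progress_expiry < :now_millis "
        + "AND #status = :inprogress)";

    /** Builds both "now" values as numeric strings, one per unit. */
    static Map<String, String> nowValues(Instant now) {
        Map<String, String> values = new LinkedHashMap<>();
        // Seconds since epoch, compared against #expiry.
        values.put(":now", String.valueOf(now.getEpochSecond()));
        // Milliseconds since epoch, compared against #in_progress_expiry.
        values.put(":now_millis", String.valueOf(now.toEpochMilli()));
        return values;
    }

    public static void main(String[] args) {
        System.out.println(CONDITION);
        System.out.println(nowValues(Instant.now()));
    }
}
```

Deriving both values from a single Instant keeps the two comparisons consistent with each other while giving each its own unit.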
scottgerring added the bug and triage labels on Jul 12, 2023
scottgerring self-assigned this on Jul 12, 2023
jeromevdl added the priority:1 label on Jul 17, 2023
@github-actions
Contributor

This is now released in version 1.16.1!

Projects
Status: Shipped