This repository was archived by the owner on Aug 8, 2025. It is now read-only.

@jonascheng
Contributor

Since our team plans to use the AWS Full Jitter algorithm, I extended this library to support it.

The changes include:

  • New aws_expo function
  • Revised README.md to explain AWS Full Jitter with an example
  • Added test cases

@bgreen-litl
Member

Thanks! I'd definitely like to support the backoff + jitter algorithms recommended in the AWS blog post you linked. I've been looking at your PR and I have some initial thoughts.

Your implementation looks good and is probably the most straightforward mapping of their algorithm to the current backoff API. Unfortunately, I think you're correct that you can't use the current jitter keyword arg that's available on the on_exception and on_predicate decorators to implement this. Since the point was for the jitter function to be customizable, this seems like a bit of a design flaw on my part. So, I have a proposal for a small change in the jitter function signature that I think will meet your use case without having to sidestep what was supposed to be the supported way to jitter. I think with this change the jittering could be specified by the current jitter keyword arg in the decorator functions.

The idea is to make the jitter function signature such that it accepts (optionally for backward compatibility) a value argument which is the amount of time to sleep prior to considering any jitter. With this change, we can define jitter functions which correspond to the algorithms defined in the AWS blog post:

def full_jitter(value):
    return -random.uniform(0, value)

def equal_jitter(value):
    return -random.uniform(0, value / 2)

Notice that those are returning a negative jitter value which will subtract from the full non-jittered sleep amount returned from the generator. I believe the end result is equivalent.
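As a quick sanity check of that equivalence, here is a minimal sketch. The `jittered` helper is hypothetical and is not part of backoff; it only adds the negative delta to the un-jittered wait. Under that assumption, full jitter should land in [0, value] and equal jitter in [value / 2, value]:

```python
import random


def full_jitter(value):
    # Delta form from the comment above: a negative offset in [-value, 0].
    return -random.uniform(0, value)


def equal_jitter(value):
    # Delta form: a negative offset in [-value / 2, 0].
    return -random.uniform(0, value / 2)


def jittered(value, jitter):
    # Hypothetical helper (not a backoff function): add the delta
    # to the un-jittered wait produced by the generator.
    return value + jitter(value)


random.seed(0)
value = 8.0
full_samples = [jittered(value, full_jitter) for _ in range(1000)]
equal_samples = [jittered(value, equal_jitter) for _ in range(1000)]

# Both checks print True: the samples stay inside the expected ranges.
print(all(0 <= s <= value for s in full_samples))
print(all(value / 2 <= s <= value for s in equal_samples))
```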

Now your example from the README can be implemented as:

@backoff.on_exception(backoff.expo,
                      requests.exceptions.RequestException,
                      jitter=backoff.full_jitter,
                      max_tries=8)
def get_url(url):
    return requests.get(url)

There is one remaining difference between the blog post (and your aws_expo function) and the backoff.expo generator: the semantics of the base argument. In backoff.expo, base refers to the mathematical exponential base. For example, in the expression 2^7, 2 is the base. I think that in the AWS blog implementation, they hard-code base 2 and multiply by an additional parameter, which, confusingly for us, they call "base". That's what it looks like to me anyway. Does that sound right to you?

If so, I think we can implement their algorithm exactly if we allow one additional keyword arg to backoff.expo. (I'm not sure what it should be called, but it can't be called "base".) The implementation then would look like this:

def expo(base=2, what_aws_calls_base=1, max_value=None):
    n = 0
    while True:
        a = what_aws_calls_base * base ** n
        if max_value is None or a < max_value:
            yield a
            n += 1
        else:
            yield max_value

And then your example becomes:

@backoff.on_exception(backoff.expo,
                      requests.exceptions.RequestException,
                      jitter=backoff.full_jitter,
                      what_aws_calls_base=0.5,
                      max_tries=8)
def get_url(url):
    return requests.get(url)

Of course, what_aws_calls_base needs a better name, but I think that's cleaner than sidestepping the current jitter function and having to disable it with lambda x: 0 and all that. Again, I think your strategy makes sense given the current API, so what I'm proposing is about correcting the design flaw that the jitter function can't take the duration value into account.
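To make the effect of the extra multiplier concrete, here is a small standalone sketch of the proposed generator. The name `factor` is only a placeholder chosen for this illustration, standing in for what_aws_calls_base; it is not a name settled on in this thread:

```python
import itertools


def expo(base=2, factor=1, max_value=None):
    # `factor` is a placeholder name for what the AWS post calls "base".
    n = 0
    while True:
        a = factor * base ** n
        if max_value is None or a < max_value:
            yield a
            n += 1
        else:
            # Once the cap is reached, keep yielding it.
            yield max_value


waits = list(itertools.islice(expo(factor=0.5, max_value=10), 7))
print(waits)  # [0.5, 1.0, 2.0, 4.0, 8.0, 10, 10]
```

With factor=0.5 the un-jittered waits start at half a second and double until they hit the cap, after which the cap repeats.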

I am interested to hear your thoughts, and again, thank you for the PR.

@jonascheng
Contributor Author

Sorry for the late response; I was interrupted by my daily work.

Thanks for your suggestion to make the code more elegant. I'm glad that you'd like me to improve it.

If I understand correctly, the new code would look like the snippet below, where jitter could be one of random.random, full_jitter, or equal_jitter.

v = next(wait)
seconds = v + jitter(v)

However, random.random() does not take an additional parameter, so how could I make this work as expected?

One simple idea is to wrap random.random inside a new jitter function:

def random_jitter(value):
    return random.random()

Do you have a better suggestion? Thanks!

@bgreen-litl
Member

Now I'm the one who has been away for a couple of days and is catching up.

After some more thought, I was thinking it may be slicker to change the contract of the jitter function such that it is expected to return the jittered value rather than the delta. It just seems logical that if you're taking the full value as an argument, you'd return the altered version of it. In other words, the code would look like:

seconds = jitter(next(wait))

The functions corresponding to the AWS jitter functions then become:

def full_jitter(value):
    return random.uniform(0, value)

def equal_jitter(value):
    return random.uniform(value / 2, value)

However, we need to maintain backward compatibility, so whatever changes we make have to still work with any already implemented 0-argument jitter function. I was looking at using the Python inspect module to perform the necessary introspection, but it turns out that doesn't work with built-in functions such as (the default) random.random. The best remaining way to do it may simply be to call it with the arg and catch TypeError:

seconds = next(wait)
try:
    seconds = jitter(seconds)
except TypeError:
    # support deprecated nullary jitter function signature
    # which returns a delta rather than a jittered value
    seconds += jitter()
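A runnable sketch of that fallback is shown below. The `apply_jitter` helper and `legacy_jitter` function are hypothetical names used only for illustration, not backoff functions. One caveat worth keeping in mind: a TypeError raised from inside a one-argument jitter function would also land in the except branch.

```python
import random


def full_jitter(value):
    # New-style jitter: takes the wait value, returns the jittered value.
    return random.uniform(0, value)


def legacy_jitter():
    # Old-style nullary jitter: returns a delta to add to the wait.
    return 0.25


def apply_jitter(value, jitter):
    # Hypothetical helper mirroring the try/except above.
    try:
        return jitter(value)
    except TypeError:
        # Nullary signature: treat the result as a delta.
        return value + jitter()


random.seed(0)
legacy_result = apply_jitter(4, legacy_jitter)
full_result = apply_jitter(4, full_jitter)
builtin_result = apply_jitter(1, random.random)

print(legacy_result)               # 4.25
print(0 <= full_result <= 4)       # True
print(1 <= builtin_result < 2)     # True
```

Calling random.random with an argument raises TypeError, so the built-in default falls through to the delta branch, just as a user-defined 0-argument function would.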

I also think it might be a good idea to make full_jitter the default as it seems from the AWS post that it is likely to work a lot better than the current default (random.random) milliseconds jitter.

The one hesitation I have with this idea is that I'm still not sure it's possible to implement the 'Decorrelated Jitter' version of the algorithm, which in some circumstances might be more desirable than full_jitter. If it is possible to support it, then I'd like to include it as one of the default options along with full_jitter. Any thoughts here?
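For context on why this variant is awkward to fit: in the AWS post, each decorrelated wait is drawn from a range that depends on the previous jittered wait, so a stateless jitter(value) function has nowhere to keep that history. One possible workaround, sketched here as an assumption rather than anything proposed in this PR, is to hold the state in a custom wait generator and use it with jitter disabled. The `decorrelated_jitter` name and its `base` and `cap` parameters are hypothetical:

```python
import itertools
import random


def decorrelated_jitter(base=1, cap=60):
    # Hypothetical wait generator, not part of backoff.
    # Follows the formula from the AWS post:
    #   sleep = min(cap, random_between(base, sleep * 3))
    sleep = base
    while True:
        sleep = min(cap, random.uniform(base, sleep * 3))
        yield sleep


random.seed(0)
waits = list(itertools.islice(decorrelated_jitter(base=0.5, cap=10), 20))

# Every wait stays between the base and the cap.
print(all(0.5 <= w <= 10 for w in waits))  # True
```

Because the randomness lives in the generator itself, no additional jitter function would be applied on top of it in this sketch.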

Thanks again for prompting all this.

@jonascheng
Contributor Author

I had the same idea as your suggestion, and I revised the code again.

So far, I don't have an idea for how to implement 'Decorrelated Jitter' either.

Would you please review the PR and let me know if it's acceptable to you?

Thanks, and I look forward to your comments.

README.md Outdated
Member

Two things:

  1. Currently jitter isn't documented in the README at all (although it is explained in the docstrings for the on_exception and on_predicate functions). Now that full_jitter will be the default, I think this little bit of docs should really be called Jitter and be a subsection of the main Examples heading. I don't want to overly nitpick the docs and prevent this PR from getting in though. If you prefer to temporarily remove this section for now that's fine too. It's something I can revise as we go on in subsequent commits.
  2. I didn't mention this before because I didn't want to complicate the review, but currently the markdown docs are actually defined in the docstring for backoff.py. Using the docs Makefile target (make docs) generates the README.md file. The reason is I originally wanted the same documentation available via help(backoff) in interactive python. However, I think the docs have pretty clearly outgrown this now and there is more that we want in the README than we need in the docstring. I will work on changing this, so it's probably fine to continue to add the new docs directly to the README, and I'll work on simplifying the docstring so that it doesn't try to cover everything.

@bgreen-litl
Member

Thanks for this work. I really appreciate it.

I put some comments in-line. There are a few things I would want to change before the PR goes in, but they're not all blocking. Some of the comments, especially the doc-related ones, can be iterated on later based on your preference.

… that the wait generator (which can be custom) throws a TypeError for some reason, the error might be masked.
@jonascheng
Contributor Author

I removed README.md from the PR and moved next(wait) outside the try block.

backoff_tests.py Outdated
Member

I don't think this needs to change... can now be jitter=None I think. The idea is to take the jitter out of the equation for this test.

@bgreen-litl
Member

I think we're looking pretty good now. I added some more comments to the tests though. I apologize for not noticing those sooner. Basically, in most (all?) cases where we were passing jitter=lambda: 0 the intent was to disable jitter altogether for the sake of the test. These should now be jitter=None. The exception to this is the cases where the test function is explicitly testing a particular jittering mechanism.

@jonascheng
Contributor Author

Got it. The purpose of those changes was to verify that my changes don't cause side effects.
I reverted them, and again, I'm sorry for committing unnecessary changes.

Only 5 test cases specify a particular jitter function:

  • test_on_exception_success_random_jitter for random jitter
  • test_on_exception_success_full_jitter for full jitter
  • test_on_exception_success_equal_jitter for equal jitter
  • test_on_exception_success_0_arg_jitter, and test_on_predicate_success_0_arg_jitter for backward compatibility.

@bgreen-litl
Member

Hi, I have manually merged your changes into a new release-1.2 branch. I'll do some documentation work there and make sure I am 100% sure of the API before merging back to master and doing a backoff 1.2 release. Thanks for your help and for pointing out that AWS post for me.

@jonascheng
Contributor Author

You're welcome. I'd also like to thank you for sharing this effort-saving library with our team. :)

@bgreen-litl
Member

@jonascheng just a heads-up that I've released backoff 1.2 with your changes. Full jitter is now the default algorithm. backoff==1.2 should now be pip installable. Thanks again.

@jonascheng
Contributor Author

Good to know and many thanks! 👍
