-
Notifications
You must be signed in to change notification settings - Fork 645
WIP: Rate limit the publish crate endpoint #1676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A start to rate limiting the publish crate endpoint. Submitting to get early feedback on whether this is the right direction to go. --- **Why not use middleware?** I originally wanted something declarative, and thought middleware would be the ideal place for rate limiting. It looks like our framework does not have per-route middleware, and routing does not happen until after all of the middleware has run. I didn't see a clean way (other than parsing the URL a second time and maintaining two parallel lists of URL routes) to make this work. **Can we still do declarative rate limiting?** We can. In `src/router.rs`, I ended up with code like ``` api_router.push("/crates/new", c(krate::publish::publish).rl(rate_limit::RateLimitCategory::Publish)); ``` However, the initial use case is to rate limit new crates differently from new versions, and that will be imperative anyway, so I didn't pursue this any further. If this is valuable for other endpoints that don't require in-controller logic, we can add it later. --- Left to do: - [ ] Diesel migration to create a table - [ ] Complete the RateLimiterPostgres implementation - [ ] Select which limiter to use based on the environment - [ ] Write tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll review more in depth when I have time, but a few notes from a brief skim:
- This is more generic than we need. nginx has very solid built in rate limiting, and I don't see us moving away from it for anything other than publishing new crates right now. I don't think we need to replace it.
- Since this is only going to be for publishing new crates, I think we're fine storing in the database, rather than in memory
- I'd rather rate limit on user id instead of IP for now, since that's harder to rotate. I'm envisioning something more complex long term, but to start just user id is fine. (It looks like you're doing that)
- We will want to support per-user overrides for max_amount
- So the table that stores the buckets won't actually have
max_amount
on it at all. Either a user override exists in a separate table, or it doesn't and we use the global value. We want to store the user overrides separately since the buckets will be cleared out periodically.
- So the table that stores the buckets won't actually have
- The code will be way simpler if we don't try to abstract this around handling more than we need to. We don't need a trait for this, or most of the settings. Just a struct that has
refill_time
on it, and a function likefn take_token(&self, &User, &PgConnection) -> CargoResult<()>
which refills/updates, subtracts a token, or returns an error if they've hit the rate limit
/// How often we refill | ||
pub refill_time: Duration, | ||
/// The number of tokens that are added during a refill. | ||
pub refill_amount: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to make this configurable. It'll always be 1 for the foreseeable future.
@@ -259,6 +259,31 @@ impl fmt::Display for Unauthorized { | |||
} | |||
} | |||
|
|||
#[derive(Debug, Clone, Copy)] | |||
pub struct TooManyRequests; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have a retry after field.
Thanks for the review! Also, I won’t be able to continue working on this until next week probably, so if anybody is already working on this, go right ahead! I won’t be offended. |
Thanks for working on this @bryanburgers. This is a bit of a funky feature in that it's high priority for us, and it's something that can quickly shift in what we on the operations side want. As we've discussed here and privately, I've gone ahead with this on my own since we want to land this soon. Thanks again for taking a swing at this, and I hope this doesn't discourage you from tackling some of our other open issues. Closing in favor of #1681 |
A start to rate limiting the publish crate endpoint. Submitting to get
early feedback on whether this is the right direction to go.
Why not use middleware?
I originally wanted something declarative, and thought middleware would
be the ideal place for rate limiting. It looks like our framework does
not have per-route middleware, and routing does not happen until after
all of the middleware has run. I didn't see a clean way (other than
parsing the URL a second time and maintaining two parallel lists of URL
routes) to make this work.
Can we still do declarative rate limiting?
We can. In
src/router.rs
, I ended up with code likeHowever, the initial use case is to rate limit new crates differently
from new versions, and that will be imperative anyway, so I didn't
pursue this any further.
If this is valuable for other endpoints that don't require in-controller
logic, we can add it later.
Left to do: