Skip to content

WIP: Rate limit the publish crate endpoint #1676

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

bryanburgers
Copy link
Contributor

A start to rate limiting the publish crate endpoint. Submitting to get
early feedback on whether this is the right direction to go.


Why not use middleware?

I originally wanted something declarative, and thought middleware would
be the ideal place for rate limiting. It looks like our framework does
not have per-route middleware, and routing does not happen until after
all of the middleware has run. I didn't see a clean way (other than
parsing the URL a second time and maintaining two parallel lists of URL
routes) to make this work.

Can we still do declarative rate limiting?

We can. In src/router.rs, I ended up with code like

api_router.push("/crates/new",
  c(krate::publish::publish).rl(rate_limit::RateLimitCategory::Publish));

However, the initial use case is to rate limit new crates differently
from new versions, and that will be imperative anyway, so I didn't
pursue this any further.

If this is valuable for other endpoints that don't require in-controller
logic, we can add it later.


Left to do:

  • Diesel migration to create a table
  • Complete the RateLimiterPostgres implementation
  • Select which limiter to use based on the environment
  • Write tests

A start to rate limiting the publish crate endpoint. Submitting to get
early feedback on whether this is the right direction to go.

---

**Why not use middleware?**

I originally wanted something declarative, and thought middleware would
be the ideal place for rate limiting. It looks like our framework does
not have per-route middleware, and routing does not happen until after
all of the middleware has run. I didn't see a clean way (other than
parsing the URL a second time and maintaining two parallel lists of URL
routes) to make this work.

**Can we still do declarative rate limiting?**

We can. In `src/router.rs`, I ended up with code like

```
api_router.push("/crates/new",
  c(krate::publish::publish).rl(rate_limit::RateLimitCategory::Publish));
```

However, the initial use case is to rate limit new crates differently
from new versions, and that will be imperative anyway, so I didn't
pursue this any further.

If this is valuable for other endpoints that don't require in-controller
logic, we can add it later.

---

Left to do:

- [ ] Diesel migration to create a table
- [ ] Complete the RateLimiterPostgres implementation
- [ ] Select which limiter to use based on the environment
- [ ] Write tests
@bryanburgers
Copy link
Contributor Author

@sgrif @jtgeibel I took a stab at an in-app rate limiter. Before I get too deep, is this the right direction?

Copy link
Contributor

@sgrif sgrif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll review more in depth when I have time, but a few notes from a brief skim:

  • This is more generic than we need. nginx has very solid built in rate limiting, and I don't see us moving away from it for anything other than publishing new crates right now. I don't think we need to replace it.
  • Since this is only going to be for publishing new crates, I think we're fine storing in the database, rather than in memory
  • I'd rather rate limit on user id instead of IP for now, since that's harder to rotate. I'm envisioning something more complex long term, but to start just user id is fine. (It looks like you're doing that)
  • We will want to support per-user overrides for max_amount
    • So the table that stores the buckets won't actually have max_amount on it at all. Either a user override exists in a separate table, or it doesn't and we use the global value. We want to store the user overrides separately since the buckets will be cleared out periodically.
  • The code will be way simpler if we don't try to abstract this around handling more than we need to. We don't need a trait for this, or most of the settings. Just a struct that has refill_time on it, and a function like fn take_token(&self, &User, &PgConnection) -> CargoResult<()> which refills/updates, subtracts a token, or returns an error if they've hit the rate limit

/// How often we refill
pub refill_time: Duration,
/// The number of tokens that are added during a refill.
pub refill_amount: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to make this configurable. It'll always be 1 for the foreseeable future.

@@ -259,6 +259,31 @@ impl fmt::Display for Unauthorized {
}
}

#[derive(Debug, Clone, Copy)]
pub struct TooManyRequests;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a retry after field.

@bryanburgers
Copy link
Contributor Author

Thanks for the review! Also, I won’t be able to continue working on this until next week probably, so if anybody is already working on this, go right ahead! I won’t be offended.

@sgrif
Copy link
Contributor

sgrif commented Mar 19, 2019

Thanks for working on this @bryanburgers. This is a bit of a funky feature in that it's high priority for us, and it's something that can quickly shift in what we on the operations side want. As we've discussed here and privately, I've gone ahead with this on my own since we want to land this soon. Thanks again for taking a swing at this, and I hope this doesn't discourage you from tackling some of our other open issues.

Closing in favor of #1681

@sgrif sgrif closed this Mar 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants