Require a User-Agent header, take 2 #1696

Merged
bors merged 1 commit into rust-lang:master from sg-require-ua-take-2
May 2, 2019

Conversation

@sgrif sgrif commented Mar 25, 2019

We deployed this back in October, and quickly realized that old versions
of Cargo (Rust 1.15 and earlier) didn't set a User-Agent header.
Notably, the version of Cargo used for bootstrapping the compiler back
then was this old, so we accidentally broke the build for rustc.

We still see build traffic for these old versions, and we likely will
never be willing to break `cargo build` for old versions of Cargo.
However, as we discussed back in early November, we're fine with doing
this for all other endpoints.

This change will break versions of Cargo prior to Rust 1.16 for all
operations that talk to crates.io, except for `cargo build`. We no
longer receive any traffic for publish, yank, unyank, or owners from
these old versions. (`cargo search` hits an endpoint that is also hit by
bots, so I can't say for sure if it's coming from cargo or random
crawlers, which is kinda the point of this requirement in the first
place)
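
A minimal sketch of the policy described above, not the code from this pull request. It assumes the `cargo build` exemption can be keyed on the download path shape seen elsewhere in this thread (`/api/v1/crates/<name>/<version>/download`); the function name `blocks_request` is hypothetical.

```rust
/// Returns true when a request should be rejected for lacking a User-Agent.
///
/// `user_agent` is the header value as received, or `None` if the header
/// was not sent. A missing header and an empty one are treated the same.
pub fn blocks_request(path: &str, user_agent: Option<&str>) -> bool {
    let has_user_agent = user_agent.map_or(false, |ua| !ua.trim().is_empty());

    // Assumption: downloads (the endpoint `cargo build` hits) stay exempt so
    // that old versions of Cargo can still build.
    let is_download =
        path.starts_with("/api/v1/crates/") && path.ends_with("/download");

    !has_user_agent && !is_download
}
```

Under these assumptions, `blocks_request("/api/v1/crates", None)` is true, while the same call with a download path is false regardless of the header.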

@sgrif sgrif force-pushed the sg-require-ua-take-2 branch from 6286042 to f92b898 Compare March 25, 2019 23:19
@bors
bors commented Mar 28, 2019

☔ The latest upstream changes (presumably #1699) made this pull request unmergeable. Please resolve the merge conflicts.

let (_app, anon) = TestApp::init().empty();

let mut req = anon.request_builder(Method::Get, "/api/v1/crates");
req.header("User-Agent", "");

Is this testing when a User-Agent header exists, but is the empty string? If so, do we need a test where the header doesn't exist as well?

Unfortunately, conduit-test doesn't give us a way to unset a header. We're setting the user agent to a non-empty string when the requests are first constructed so we don't have to set it for every test besides these two. We could refactor things to apply the default user agent later, so we have a place to construct a request without it here, but I've left it this way for two reasons:

  • Explicitly setting req.header("User-Agent", "") makes it much more clear what this is actually testing
  • It'd be very odd if we were treating a header with no value differently than a header not being sent at all (the only case I'm aware of where there's a semantic difference is when sending an HTTP/1.1 -> HTTP/2 upgrade request, which specifies that an HTTP2-Settings header must be present)

With that in mind, do you still think it's worth the additional test?

Nope, I don't. 👍 LGTM.

If testing without the header was easy to do, I'd like to see it. But I'm with you that it would be very odd if the framework was treating a non-existent header differently than an empty header.

I just brought it up because I've been bitten in different places (not HTTP headers) where non-existent != empty string.
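
One way to make the "non-existent vs. empty" question moot is to normalize both cases to the same value before checking. The sketch below is illustrative only: it uses a plain `HashMap` as a stand-in for the request's headers rather than the conduit types, and `non_empty_header` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Looks up a header case-insensitively and returns its trimmed value,
/// or `None` if the header is absent or has no content.
pub fn non_empty_header<'a>(
    headers: &'a HashMap<String, String>,
    name: &str,
) -> Option<&'a str> {
    headers
        .iter()
        .find(|(key, _)| key.eq_ignore_ascii_case(name))
        .map(|(_, value)| value.trim())
        // An empty value collapses to `None`, same as a missing header.
        .filter(|value| !value.is_empty())
}
```

With this shape, a test that sets `User-Agent` to `""` exercises the same branch as a request that omits the header entirely.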

@sgrif sgrif force-pushed the sg-require-ua-take-2 branch from f92b898 to 44aeb47 Compare April 2, 2019 16:41
@bryanburgers
Pulled down and tested locally. Everything seems to be working when using curl -H 'User-Agent:'.

bryan ~/projects/cratesio/ pr/1696 $ curl -i --head localhost:8888/api/v1/crates/bryan -H "User-Agent: test/1"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: cargo_session=nFPRnOh/tCaZyaOecttwdaQd8auK817pCMXxVWQ0a+o=; HttpOnly; Path=/
Content-Length: 8621

bryan ~/projects/cratesio/ pr/1696 $ curl -i --head localhost:8888/api/v1/crates/bryan -H "User-Agent:"
HTTP/1.1 403 Forbidden
Content-Length: 631
Set-Cookie: cargo_session=nFPRnOh/tCaZyaOecttwdaQd8auK817pCMXxVWQ0a+o=; HttpOnly; Path=/

bryan ~/projects/cratesio/ pr/1696 $ curl -i --head localhost:8888/api/v1/crates/bryan/0.1.0/download -H "User-Agent: test/1"
HTTP/1.1 302 Found
Location: /crates/bryan/bryan-0.1.0.crate
Set-Cookie: cargo_session=nFPRnOh/tCaZyaOecttwdaQd8auK817pCMXxVWQ0a+o=; HttpOnly; Path=/

bryan ~/projects/cratesio/ pr/1696 $ curl -i --head localhost:8888/api/v1/crates/bryan/0.1.0/download -H "User-Agent:"
HTTP/1.1 302 Found
Location: /crates/bryan/bryan-0.1.0.crate
Set-Cookie: cargo_session=nFPRnOh/tCaZyaOecttwdaQd8auK817pCMXxVWQ0a+o=; HttpOnly; Path=/

Note that this is testing the non-existence of the header. From curl's man-page:

-H, --header


(HTTP) Extra header to include in the request when sending HTTP to a server. You may specify any number of extra headers. Note that if you should add a custom header that has the same name as one of the internal ones curl would use, your externally set header will be used instead of the internal one. This allows you to make even trickier stuff than curl would normally do. You should not replace internally set headers without knowing perfectly well what you're doing. Remove an internal header by giving a replacement without content on the right side of the colon, as in: -H "Host:". If you send the custom header with no-value then its header must be terminated with a semicolon, such as -H "X-Custom-Header;" to send "X-Custom-Header:".
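
The three forms described in the man-page text can be modeled as below. This is a hypothetical illustration of the quoted rule, not curl's implementation; the names `HeaderArg` and `classify_header_arg` are made up, and treating a whitespace-only value as removal is an assumption.

```rust
/// What a `-H` argument asks for, per the man-page text quoted above.
#[derive(Debug, PartialEq)]
pub enum HeaderArg {
    /// `-H "Name:"` removes an internally set header.
    Remove(String),
    /// `-H "Name;"` sends the header with no value.
    SendEmpty(String),
    /// `-H "Name: value"` sets or replaces the header.
    Set(String, String),
}

pub fn classify_header_arg(arg: &str) -> Option<HeaderArg> {
    if let Some((name, value)) = arg.split_once(':') {
        let value = value.trim();
        if value.is_empty() {
            // Nothing to the right of the colon: drop the header.
            Some(HeaderArg::Remove(name.to_string()))
        } else {
            Some(HeaderArg::Set(name.to_string(), value.to_string()))
        }
    } else if let Some(name) = arg.strip_suffix(';') {
        Some(HeaderArg::SendEmpty(name.to_string()))
    } else {
        None
    }
}
```

Read this way, `-H "User-Agent:"` in the transcript above is the removal form, which is why those requests exercise the missing-header case rather than the empty-header case.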

@jtgeibel
jtgeibel commented May 2, 2019

@bors r+

@bors
bors commented May 2, 2019

📌 Commit 44aeb47 has been approved by jtgeibel

bors added a commit that referenced this pull request May 2, 2019
Require a User-Agent header, take 2

@bors
bors commented May 2, 2019

⌛ Testing commit 44aeb47 with merge ae04832...

@bors
bors commented May 2, 2019

☀️ Test successful - checks-travis
Approved by: jtgeibel
Pushing ae04832 to master...

@bors bors merged commit 44aeb47 into rust-lang:master May 2, 2019
@sgrif sgrif deleted the sg-require-ua-take-2 branch May 14, 2019 16:22
yan12125 added a commit to yan12125/nvchecker that referenced this pull request Jul 5, 2019
yan12125 added a commit to yan12125/nvchecker that referenced this pull request Jul 6, 2019
yan12125 added a commit to yan12125/nvchecker that referenced this pull request Jul 6, 2019