-
Notifications
You must be signed in to change notification settings - Fork 522
Handle absolute, asterisk, and authority-form request targets #1470
Conversation
Will follow up with comparison benchmarks when they finish running. |
PERFFFF |
// TODO Validate that target is a correct host[:port] string. | ||
// Reject as 400 if not. This is just a quick scan for invalid characters | ||
// but doesn't check that the target fully matches the URI spec. | ||
for (var i = 0; i < target.Length; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fear repeated scanning of strings can be bad for performance numbers. I think we'll end up moving more validation into the parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time we talked about this, we agreed we we're okay with sub-optimal performance on edge cases. This code shouldn't be hit unless we have a bad client or malformed requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last time we talked about this
How long ago was that?
I'll wait for the benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for (var i = 0; i < target.Length; i++) | ||
{ | ||
var ch = (char)target[i]; | ||
if (!char.IsLetterOrDigit(ch) && ch != ':' && ch != '.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using IsLetterOrDigit, can you use a subset of this:
We should move this a HttpUtilities
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The "IsValidTokenChar" method includes characters not allowed in the request target for authority-form, like "*".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I'm not saying we should use it. I said a subset into HttpUtilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, thx. yes, this can move to the utils class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should rename IsValidTokenChar
to IsValidMethodChar
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and move to HttpUtlities
var ch = (char)target[i]; | ||
if (!char.IsLetterOrDigit(ch) && ch != ':' && ch != '.') | ||
{ | ||
if (Log.IsEnabled(LogLevel.Information)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we checking IsEnabled all over the code before exceptions now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the extra work to avoid 😉 push it to right of throw and combine the two
private BadHttpRequestException GetRequestLineRejection(Span<byte> line)
{
const int MaxRequestLineError = 32;
if (Log.IsEnabled(LogLevel.Information))
{
var line = line.GetAsciiStringEscaped(MaxRequestLineError);
return BadHttpRequestException.GetException(RequestRejectionReason.InvalidRequestLine, line);
}
else
{
return BadHttpRequestException.GetException(RequestRejectionReason.InvalidRequestLine);
}
}
private void ThrowRequestLineRejection(Span<byte> line)
{
throw GetRequestLineRejection(line)
}
Then
if (!char.IsLetterOrDigit(ch) && ch != ':' && ch != '.')
{
ThrowRequestLineRejection(line);
}
Keep the main body tight with only code that's mostly always executed; push everything else out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you have to pass a bucket of params to do it 😝
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -162,6 +185,10 @@ public KestrelHttpParser(IKestrelTrace log) | |||
RejectRequestLine(span); | |||
} | |||
} | |||
else if (ch == 0) | |||
{ | |||
RejectRequestLine(span); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we checking this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail early when possible. \0 is never acceptable on the request line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is we need to do that in a bunch of places and this probably doesn't handle all of those cases. Not just 0 but a bunch of other characters.
PR
dev
|
That doesn't look good. Profile and see where the time is spent |
Looking into it. Correctness comes before perf, but I'll profile more to see what I can gain back |
Uri.Create. |
|
||
RawTarget = target.GetAsciiStringNonNullCharacters(); | ||
|
||
if (!Uri.TryCreate(RawTarget, UriKind.Absolute, out var uri)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to use Uri.TryCreate
? Maybe we should roll our own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uri
is a garbage hole for performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is about correctness, not perf. For context, we talked about this in the first iteration of absolute-URI support. We're aware this is slow, but it's not on the hot path. In fact, it's not really on any commonly used code path. The HTTP spec requires we handle requests like GET http://localhost/path HTTP/1.1
, but in practice, clients never do. Instead they use origin-form, GET /path HTTP/1.1
. We debated whether to bother at all, but since our goal is edge-server capabilities, we will comply with the HTTP spec, including its edge-cases. Fast URI parsing is not a high priority.
// | | queryEnd | ||
// | | |versionStart | ||
// | | || | ||
// http://host/path?query=1 HTTP/1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pathStart
should be at the first /
in the actual path. For the special cases http://host
and http://host?query
, make a span with /
and pass that to OnStartLine
.
{ | ||
if (target.Length == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Target can't be empty here can it? Doesn't the parser check for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should. This could be a debug.assert instead.
@@ -66,6 +68,27 @@ public KestrelHttpParser(IKestrelTrace log) | |||
span = startLineBuffer.ToSpan(); | |||
} | |||
|
|||
// pathStart | |||
// | pathEnd and queryStart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need pathEnd and queryStart? I couldn't think of a scenario when they are different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if path had been decoded in place and query not moved (e.g. path could get shorter) - though this doesn't currently happen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have removed pathEnd
in #1463
Here are the new comparisons with the #1471 fix in benchmarks. dev
PR
|
5a75fd9
to
c67a838
Compare
🆙 📅 had a bunch of merge conflicts with #1463. Re-running benchmarks. (Will squash before merging) |
dev
PR
|
Wait, you made it faster? |
@@ -157,7 +162,7 @@ public KestrelHttpParser(IKestrelTrace log) | |||
RejectRequestLine(data, length); | |||
} | |||
|
|||
handler.OnStartLine(method, httpVersion, targetBuffer, pathBuffer, query, customMethod, pathEncoded); | |||
handler.OnStartLine(method, httpVersion, targetBuffer, pathBuffer, query, customMethod, new Span<byte>(data, length), pathEncoded); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare a local
All but the PlaintextAbsoluteUri benchmark which is almost cut in half. But that's going from 800k RPS of incorrect handling to 480k RPS correct handling of the request. |
@@ -17,6 +23,14 @@ private BadHttpRequestException(string message, int statusCode) | |||
|
|||
internal int StatusCode { get; } | |||
|
|||
internal void SetAdditionalHeaders(FrameResponseHeaders headers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit strange having logic in exception and calling methods on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer subclass of BadHttpRequestException
with RequiredMethod
property and downcast check in ProduceEnd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BadHttpRequestException is sealed :-/
@@ -1179,6 +1185,15 @@ public void RejectRequest(RequestRejectionReason reason, string value) | |||
throw BadHttpRequestException.GetException(reason, value); | |||
} | |||
|
|||
private void RejectRequestLine(Span<byte> span) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename span to requestLine
|
||
namespace Microsoft.AspNetCore.Server.Kestrel | ||
{ | ||
public sealed class BadHttpRequestException : IOException | ||
{ | ||
// used in rare cases | ||
private HttpMethod? _requiredMethod; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readonly
and add second ctor.
@@ -1179,6 +1185,15 @@ public void RejectRequest(RequestRejectionReason reason, string value) | |||
throw BadHttpRequestException.GetException(reason, value); | |||
} | |||
|
|||
private void RejectRequestLine(Span<byte> span) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will change with #1469.
@@ -1216,8 +1232,51 @@ protected void ReportApplicationError(Exception ex) | |||
Log.ApplicationError(ConnectionId, ex); | |||
} | |||
|
|||
public void OnStartLine(HttpMethod method, HttpVersion version, Span<byte> target, Span<byte> path, Span<byte> query, Span<byte> customMethod, bool pathEncoded) | |||
public void OnStartLine(HttpMethod method, HttpVersion version, Span<byte> target, Span<byte> path, Span<byte> query, Span<byte> customMethod, Span<byte> line, bool pathEncoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1469 introduces rejection of request targets, so you won't have to pass the entire line here anymore.
|
||
// The authority-form of request-target is only used for CONNECT | ||
// requests (https://tools.ietf.org/html/rfc7231#section-4.3.6). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: blank line
{ | ||
Unknown = -1, | ||
Http = 0, | ||
HttpSecure = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Https
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How strong is your preference for this? I'm chose the name to make it super obvious that it's http + TLS, not plain-old http.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use https everywhere else
@@ -109,6 +109,11 @@ public KestrelHttpParser(IKestrelTrace log) | |||
|
|||
pathEncoded = true; | |||
} | |||
else if (ch == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this being checked here now? Should blow up when trying to get ASCII string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail fast. The null character is never valid in the start line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are validating these during materialization of values, fail fast is less important then less checks in hot paths
@@ -15,7 +15,13 @@ public static class HttpUtilities | |||
public const string Http10Version = "HTTP/1.0"; | |||
public const string Http11Version = "HTTP/1.1"; | |||
|
|||
public const string HttpUriScheme = "http://"; | |||
public const string HttpSecureUriScheme = "https://"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HttpsUriScheme
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was intentionally trying to emphasize the "S" of https by expanding it to Secure. It's easy to miss that extra character when reviewing/writing code.
private void SetErrorResponseException(BadHttpRequestException ex) | ||
{ | ||
SetErrorResponseHeaders(ex.StatusCode); | ||
if (ex.AllowedHeader != default(StringValues)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StringValues.IsNullOrEmpty
- Works better with changes coming in PR #1470
private BadHttpRequestException(string message, int statusCode) | ||
:this(message, statusCode, null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: : this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Improves compliance with RFC 7230 on the expected handling of requests that have URI or asterisk in the request target. This means rejecting asterisk requests that are not OPTIONS and rejecting authority-form requests taht are not CONNECT. This also means the server will handle the path and query on targets with absolute URIs as request-targets.
eef051c
to
49b328d
Compare
Squashed, formatted, benchmarked, etc. Awaiting sign off. |
- Works better with changes coming in PR #1470
- Works better with changes coming in PR #1470
Improves compliance with RFC 7230 on the expected handling of requests that have URI or asterisk in the request target.
This means rejecting asterisk requests that are not OPTIONS and rejecting authority-form requests that are not CONNECT. "http 405 method not allowed" response requires you set "Allow: " with acceptable methods
This also means the server will handle the path and query on targets with absolute URIs as request-targets.
Resolves #666
Resolves #1279