Skip to content
This repository was archived by the owner on Mar 19, 2019. It is now read-only.

Do not unescape forward slashes in the request path (#146). #161

Merged
merged 1 commit into from
Dec 22, 2015

Conversation

cesarblum
Copy link
Contributor

  • Set RequestUriBuilder.UseCookedRequestUrl to false to trigger raw parsing.

  • The parsing behavior seemed wrong. It was decoding the path and re-encoding it right after. I changed this to that _requestUriString builds a decoded string. %2F is skipped in the decoding process.

  • The handling of %uXXXX doesn't seem to work. I tried adding a test for that case and http.sys hangs on it. The Wikipedia entry for percent encoding contains this note:

    There exists a non-standard encoding for Unicode characters: %uxxxx, where xxxx is a UTF-16 code unit represented as four hexadecimal digits. This behavior is not specified by any RFC and has been rejected by the W3C.
    Do not un-escape forward slashes in the request path #146

cc @Tratcher @troydai @davidfowl

@Tratcher
Copy link
Member

You can strip out handling for %uXXXX

@@ -1,4 +1,4 @@
<?xml version="1.0" encoding="utf-8"?>
<?xml version="1.0" encoding="utf-8"?>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert.

@cesarblum cesarblum force-pushed the cesarbs/dont-unescape-forward-slash branch from 0e0e9f9 to efd3b53 Compare December 21, 2015 23:01
@cesarblum
Copy link
Contributor Author

  • Added tests for requests with malformed paths (e.g. /foo%)
  • Removed Latin-1 percent-encoded character decoding
  • Removed logic related to UseCookedRequestUrl

}

return result;
return _cookedUriPath; // TODO: is this the right fallback?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it in the debugger, this looks like the right thing to do. It's the URL as unescaped by http.sys. We want our escaping to take precedence, but if it fails we have http.sys's to fall back to (that would escape %2F however).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you found a test case that will trigger this fallback?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. Any bad path seems to be caught by http.sys and we never end up in this code. But if we were to hit this situation, having the cooked Uri seems like an appropriate fallback.

@Tratcher
Copy link
Member

Have you compared the un-escaping algorithm used here to the one in kestrel?

@cesarblum
Copy link
Contributor Author

If anything, WebListener's behavior is more robust since it returns a 400 on malformed paths. When Kestrel sees % followed by something it can't decode, it only copies the bytes as is to the "decoded" URL i.e. it does not signal an error for malformed paths. Then somewhere along the stack the pending %s are escaped to %25. I didn't dig enough to find out who is doing that escaping.

@cesarblum cesarblum force-pushed the cesarbs/dont-unescape-forward-slash branch 2 times, most recently from 202f60c to c06ec7f Compare December 22, 2015 22:22
@@ -87,6 +88,8 @@ public class RequestTests
[InlineData("/basepath/", "/basepath/subpath", "/basepath", "/subpath")]
[InlineData("/base path/", "/base%20path/sub path", "/base path", "/sub path")]
[InlineData("/base葉path/", "/base%E8%91%89path/sub%E8%91%89path", "/base葉path", "/sub葉path")]
[InlineData("/basepath/", "/basepath/sub%2Fpath", "/basepath", "/sub%2Fpath")]
[InlineData("/basepath/", "/basepath/sub%", "/basepath", "/sub%")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is misleading. The % is escaped by the client and sent as %25. That doesn't happen for the other test cases because they're recognized as valid escape sequences.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's especially confusing when the next test says the same input returns 400.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

@Tratcher
Copy link
Member

:shipit:

@cesarblum cesarblum force-pushed the cesarbs/dont-unescape-forward-slash branch from d87f0d5 to d9e06f8 Compare December 22, 2015 23:05
@cesarblum cesarblum merged commit d9e06f8 into dev Dec 22, 2015
@cesarblum cesarblum deleted the cesarbs/dont-unescape-forward-slash branch December 22, 2015 23:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants