URLComponents.string should percent-encode colons in first path segment if needed #1117

Merged on Jan 13, 2025 (1 commit)

@jrflat jrflat commented Jan 10, 2025

The following code can recursively call .path(), causing an infinite loop.

import Foundation

let str = "./not a scheme:"
let baseStr = "base"
let base = URL(string: baseStr)!
let url = URL(string: str, relativeTo: base)!
print(url.path()) // Infinite loop before this fix

The issue is that self.path() calls .absoluteURL.path() if self has a base URL. Since .absoluteURL should never have a base URL, this shouldn't recurse. However, if URL(string: absoluteString) fails, as it does in this example, then .absoluteURL returns self, causing the recursion.
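The recursion can be modeled with a small stand-in type. This is only a sketch: ToyURL, absoluteStringParses, recursivePath and fixedPath are hypothetical names rather than Foundation's implementation, and the depth guard exists only so that the demo terminates.

```swift
/// Toy stand-in for URL. All names here are hypothetical.
struct ToyURL {
    var relativePath: String
    var base: String?
    /// Simulates whether URL(string: absoluteString) would succeed.
    var absoluteStringParses: Bool

    /// Mirrors the fallback described above: returns self when re-parsing fails.
    var absoluteURL: ToyURL {
        guard base != nil, absoluteStringParses else { return self }
        // Path resolution against the base is omitted in this toy model.
        return ToyURL(relativePath: relativePath, base: nil, absoluteStringParses: true)
    }

    /// Old shape: path goes through absoluteURL and calls itself again.
    /// Returns nil once the demo-only depth guard trips.
    func recursivePath(depth: Int = 0) -> String? {
        if depth > 8 { return nil } // stands in for the infinite loop
        guard base != nil else { return relativePath }
        return absoluteURL.recursivePath(depth: depth + 1)
    }

    /// New shape: read the stored path of absoluteURL directly, so no recursion.
    func fixedPath() -> String {
        base != nil ? absoluteURL.relativePath : relativePath
    }
}

let broken = ToyURL(relativePath: "not%20a%20scheme:", base: "base", absoluteStringParses: false)
print(broken.recursivePath() as Any)
print(broken.fixedPath())
```

Because the fixed shape never re-enters the same accessor, it terminates even when absoluteURL falls back to self.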

In the example above, merging the relative and base paths gives us ./not%20a%20scheme: since they are both relative. Then, we remove the dot segments as specified by the RFC 3986 algorithm (section 5.2.4), giving us not%20a%20scheme:. We assign this to urlComponents.percentEncodedPath, and once we've finished building the components, we call urlComponents.string to get the absolute string.
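The merge and dot-segment steps can be sketched as follows. This is a simplified illustration based on RFC 3986 sections 5.2.3 and 5.2.4, not Foundation's code: mergePaths and removingDotSegments are hypothetical names, and the merge ignores the case where the base has an authority and an empty path.

```swift
/// Merge per RFC 3986 5.2.3, simplified: keep the base path up to and
/// including its last "/", then append the reference path.
func mergePaths(base: String, reference: String) -> String {
    guard let lastSlash = base.lastIndex(of: "/") else { return reference }
    return String(base[...lastSlash]) + reference
}

/// remove_dot_segments per RFC 3986 5.2.4.
func removingDotSegments(from path: String) -> String {
    var input = Substring(path)
    var output: [Substring] = [] // each entry keeps its leading "/", if any
    while !input.isEmpty {
        if input.hasPrefix("../") {
            input.removeFirst(3)
        } else if input.hasPrefix("./") {
            input.removeFirst(2)
        } else if input.hasPrefix("/./") {
            input.removeFirst(2) // "/./x" becomes "/x"
        } else if input == "/." {
            input = "/"
        } else if input.hasPrefix("/../") {
            input.removeFirst(3) // "/../x" becomes "/x"
            if !output.isEmpty { output.removeLast() }
        } else if input == "/.." {
            input = "/"
            if !output.isEmpty { output.removeLast() }
        } else if input == "." || input == ".." {
            input = ""
        } else {
            // Move the first segment (with its leading "/") to the output.
            let start = input.startIndex
            let searchFrom = input.hasPrefix("/") ? input.index(after: start) : start
            let end = input[searchFrom...].firstIndex(of: "/") ?? input.endIndex
            output.append(input[start..<end])
            input = input[end...]
        }
    }
    return output.joined()
}

let merged = mergePaths(base: "base", reference: "./not%20a%20scheme:")
print(merged)                      // ./not%20a%20scheme:
print(removingDotSegments(from: merged)) // not%20a%20scheme:
```

The leading "./" is what protected the colon from being read as a scheme separator, and removing it is what exposes the problem.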

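URLComponents.string should always return a parsable URL string, or nil. In this case, the returned string is not%20a%20scheme:, which is not valid: a parser reads everything before the first colon as a scheme, and not%20a%20scheme is not a legal scheme name.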

This PR fixes the issue by:

  1. Percent-encoding colons in the first path segment when generating URLComponents.string, but only if the colons could be mistaken as a scheme separator.
  2. Calling absoluteURL.relativePath() instead of absoluteURL.path() to ensure that we never recurse.
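A minimal sketch of the first step, under stated assumptions: encodingFirstSegmentColons and its hasSchemeOrAuthority parameter are hypothetical and do not reflect Foundation's actual implementation. The sketch treats a colon as ambiguous only when the components have no scheme or authority and the path is relative. The leading-colon exception mirrors the title of the later change #1139 listed in this thread and is likewise an assumption.

```swift
import Foundation

/// Hypothetical helper: percent-encodes colons in the first segment of a
/// relative path so the segment cannot be parsed as a scheme.
func encodingFirstSegmentColons(in path: String, hasSchemeOrAuthority: Bool = false) -> String {
    // With a scheme or authority in front, the path cannot be misread.
    guard !hasSchemeOrAuthority else { return path }
    // An absolute path such as "/a:b" cannot start with a scheme.
    guard !path.hasPrefix("/") else { return path }
    // Assumption based on the title of #1139: a leading colon is left alone.
    guard !path.hasPrefix(":") else { return path }

    // The first segment ends at the first "/" or at the end of the path.
    let firstSegmentEnd = path.firstIndex(of: "/") ?? path.endIndex
    let firstSegment = path[..<firstSegmentEnd]
    guard firstSegment.contains(":") else { return path }

    return firstSegment.replacingOccurrences(of: ":", with: "%3A")
        + String(path[firstSegmentEnd...])
}

print(encodingFirstSegmentColons(in: "not%20a%20scheme:")) // not%20a%20scheme%3A
print(encodingFirstSegmentColons(in: "a/b:c"))             // a/b:c
```

Colons in later segments are left as they are because only the first segment of a relative reference can be mistaken for a scheme.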

jrflat commented Jan 10, 2025

@swift-ci please test

@jrflat jrflat requested review from itingliu and parkera January 10, 2025 23:20
// These would fail if we did not percent-encode the colon.
// .string should always produce a valid URL string, or nil.

XCTAssertNotNil(URL(string: comp.string!))

For future reference: we prefer using XCTUnwrap to unwrap the string (like you did above on line 1327). I'd also appreciate it if we could use XCTAssertEqual to assert the exact content instead of the less informative XCTAssertNotNil.
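The suggested pattern might look like the sketch below. The test class and method names are hypothetical, and the expected string assumes the colon is serialized as %3A once the fix from this PR is in place.

```swift
import Foundation
import XCTest

final class URLComponentsColonTests: XCTestCase {
    func testColonInFirstPathSegmentIsEncoded() throws {
        var comp = URLComponents()
        comp.percentEncodedPath = "not%20a%20scheme:"

        // XCTUnwrap records a test failure on nil instead of crashing like "!".
        let string = try XCTUnwrap(comp.string)

        // Assert the exact content; "%3A" is the assumed encoding of the colon.
        XCTAssertEqual(string, "not%20a%20scheme%3A")

        // The serialized string should parse back into a URL.
        _ = try XCTUnwrap(URL(string: string))
    }
}
```

Compared with XCTAssertNotNil, a failing XCTAssertEqual reports both the actual and the expected string, which makes a regression easier to diagnose.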


Ah, that makes sense. I'll clean up all the URL tests in a follow-up PR to prefer that pattern.

@jrflat jrflat merged commit 2bc9094 into swiftlang:main Jan 13, 2025
3 checks passed
jrflat added a commit to jrflat/swift-foundation that referenced this pull request Mar 5, 2025
jrflat added a commit to jrflat/swift-foundation that referenced this pull request Mar 5, 2025
parkera pushed a commit that referenced this pull request Mar 5, 2025
* (141549683) Restore behavior of URL(string: "") returning nil (#1103)

* (142076445) Allow URL.standardized to return an empty string URL (#1110)

* (142076445) Allow URL.standardized to return an empty string URL

* Add ?? self to prevent force-unwrap

* (142446243) Compatibility behaviors for Swift URL (#1113)

* (142589056) URLComponents.string should percent-encode colons in first path segment if needed (#1117)

* (142667792) URL.absoluteString crashes if baseURL starts with colon (#1119)

* (143159003) Don't encode colon if URLComponents path starts with colon (#1139)