-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Captures<'t> .at, .name return "" to represent both "matches empty string" and "no match" #14602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I'm in favor of this. I think you make a good case for an I would vastly prefer changing the return types of existing functions instead of adding |
@emk Do you want to take a crack at making the change? If not, I'm happy to do it. |
@BurntSushi Sorry, I got sidetracked running down other issues. But I was doing triage of my open Rust issues right now, and I saw this one again. Do we need to deal with this change before 1.0? And are you still in favor of changing the existing |
OK, I went ahead and implemented this as PR #19818. This is just one possible approach, of course—I'm not attached to the particular API. Anyway, I did my best to make this mergable, but it's my first patch, so I've probably missed a lot of stuff. Once again, thank you for your guidance here! |
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust. This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-) As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string: 1. Matched the empty string. 2. Did not match anything. Consider the following example. This regex has two named matched groups, `key` and `value`. `value` is optional: ```rust // Matches "foo", "foo;v=bar" and "foo;v=". regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?"); ``` We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases. Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API. This is a [breaking-change], but the fix is easy: - `refs.at(1)` becomes `refs.at(1).unwrap_or("")`. - `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust. This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-) As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string: 1. Matched the empty string. 2. Did not match anything. Consider the following example. This regex has two named matched groups, `key` and `value`. `value` is optional: ```rust // Matches "foo", "foo;v=bar" and "foo;v=". regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?"); ``` We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases. Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API. This is a [breaking-change], but the fix is easy: - `refs.at(1)` becomes `refs.at(1).unwrap_or("")`. - `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
The `ui_test` package runs test in edition 2021 mode by default. This PR switches our tests to edition 2024. The commits progressively make our test suite compatible with both edition 2021 and edition 2024, then switches the testing mode to edition 2024, which is compatible with what `cargo dev lint` also uses by default. The changes are (without functionality changes in tests): - Add `unsafe` when [calling unsafe constructs inside `unsafe fn`](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-op-in-unsafe-fn.html), to [unsafe attributes](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-attributes.html), and to [`extern` blocks](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-extern.html). - Use stricter reference patterns to accomodate with the [new match ergonomics rules](https://doc.rust-lang.org/edition-guide/rust-2024/match-ergonomics.html). - Add some `use<>` markers where required to keep the same semantics under the [new RPIT lifetime rules](https://doc.rust-lang.org/edition-guide/rust-2024/rpit-lifetime-capture.html). Some other changes ensure that non-regression tests are still enforced: - Add edition 2021 specific tests when switching rules would make some explicitly tested constructs untested, or when the test require using the older [temporary tail expression scoping rules](https://doc.rust-lang.org/edition-guide/rust-2024/temporary-tail-expr-scope.html). - In `misnamed_getters`, check that getters containing an `unsafe` block (for example a getter on an `Union` field) trigger the lint. The last commit switches the default edition for running UI tests to edition 2024. changelog: [`misnamed_getters`]: getters containing an `unsafe` block are also linted
The methods
at
andname
onCaptures<'t>
return""
to represent both failed matches and successful matches against the empty string. In the case ofat
, a programmer can usepos
to distinguish between the two situations. But in the case ofname
, there's no good, public API which allows us to test for a match.The follow regex has two named matched groups,
key
andvalue
.value
is optional:We can access
value
usingcaps.name("value")
, but there's no way for us to distinguish between the"foo"
and"foo;v="
cases.Even if we're willing to treat both meanings of
""
as "failed match", it's still a nuisance to convert them toOption
:Now, I don't know whether it would be better to modify
at
andname
to returnOption
, or add newat_opt
andname_opt
functions. But at the very least, there should be some convenient way to distinguish between the cases"foo"
and"foo;v="
.This would be relatively easy to implement, because
Captures
usesOption
internally.The text was updated successfully, but these errors were encountered: