Skip to content

Captures<'t> .at, .name return "" to represent both "matches empty string" and "no match" #14602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
emk opened this issue Jun 2, 2014 · 4 comments

Comments

@emk
Copy link
Contributor

emk commented Jun 2, 2014

The methods at and name on Captures<'t> return "" to represent both failed matches and successful matches against the empty string. In the case of at, a programmer can use pos to distinguish between the two situations. But in the case of name, there's no good, public API which allows us to test for a match.

The follow regex has two named matched groups, key and value. value is optional:

// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");

We can access value using caps.name("value"), but there's no way for us to distinguish between the "foo" and "foo;v=" cases.

Even if we're willing to treat both meanings of "" as "failed match", it's still a nuisance to convert them to Option:

fn name<'t>(caps: &Captures<'t>, name: &str) -> Option<&'t str> {
    let s = caps.name(name);
    if s == "" { None } else { Some(s) }
}

Now, I don't know whether it would be better to modify at and name to return Option, or add new at_opt and name_opt functions. But at the very least, there should be some convenient way to distinguish between the cases "foo" and "foo;v=".

This would be relatively easy to implement, because Captures uses Option internally.

@huonw huonw added the A-libs label Jun 2, 2014
@BurntSushi
Copy link
Member

I'm in favor of this. I think you make a good case for an Option type here.

I would vastly prefer changing the return types of existing functions instead of adding at_opt and name_opt. They increase the API needlessly IMO. The library is marked experimental so we don't have to worry about backwards compatibility. :-)

@BurntSushi
Copy link
Member

@emk Do you want to take a crack at making the change? If not, I'm happy to do it.

@emk
Copy link
Contributor Author

emk commented Dec 13, 2014

@BurntSushi Sorry, I got sidetracked running down other issues. But I was doing triage of my open Rust issues right now, and I saw this one again.

Do we need to deal with this change before 1.0? And are you still in favor of changing the existing name and at accessors to return Option? I think we should go ahead and do this, and I'd be happy to submit a patch.

@emk
Copy link
Contributor Author

emk commented Dec 13, 2014

OK, I went ahead and implemented this as PR #19818. This is just one possible approach, of course—I'm not attached to the particular API.

Anyway, I did my best to make this mergable, but it's my first patch, so I've probably missed a lot of stuff. Once again, thank you for your guidance here!

brson added a commit to brson/rust that referenced this issue Dec 16, 2014
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust.

This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-)

As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string:

1. Matched the empty string.
2. Did not match anything.

Consider the following example.  This regex has two named matched groups, `key` and `value`. `value` is optional:

```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```

We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases.

Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API.

This is a [breaking-change], but the fix is easy:

- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
alexcrichton added a commit to alexcrichton/rust that referenced this issue Dec 17, 2014
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust.

This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-)

As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string:

1. Matched the empty string.
2. Did not match anything.

Consider the following example.  This regex has two named matched groups, `key` and `value`. `value` is optional:

```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```

We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases.

Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API.

This is a [breaking-change], but the fix is easy:

- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
@bors bors closed this as completed in c2b0d7d Dec 17, 2014
bors pushed a commit to rust-lang-ci/rust that referenced this issue Apr 22, 2025
The `ui_test` package runs test in edition 2021 mode by default. This PR
switches our tests to edition 2024. The commits progressively make our
test suite compatible with both edition 2021 and edition 2024, then
switches the testing mode to edition 2024, which is compatible with what
`cargo dev lint` also uses by default.

The changes are (without functionality changes in tests):
- Add `unsafe` when [calling unsafe constructs inside `unsafe
fn`](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-op-in-unsafe-fn.html),
to [unsafe
attributes](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-attributes.html),
and to [`extern`
blocks](https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-extern.html).
- Use stricter reference patterns to accomodate with the [new match
ergonomics
rules](https://doc.rust-lang.org/edition-guide/rust-2024/match-ergonomics.html).
- Add some `use<>` markers where required to keep the same semantics
under the [new RPIT lifetime
rules](https://doc.rust-lang.org/edition-guide/rust-2024/rpit-lifetime-capture.html).

Some other changes ensure that non-regression tests are still enforced:
- Add edition 2021 specific tests when switching rules would make some
explicitly tested constructs untested, or when the test require using
the older [temporary tail expression scoping
rules](https://doc.rust-lang.org/edition-guide/rust-2024/temporary-tail-expr-scope.html).
- In `misnamed_getters`, check that getters containing an `unsafe` block
(for example a getter on an `Union` field) trigger the lint.

The last commit switches the default edition for running UI tests to
edition 2024.

changelog: [`misnamed_getters`]: getters containing an `unsafe` block
are also linted
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants