Skip to content

Modify regex::Captures::{at,name} to return Option #19818

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 17, 2014

Conversation

emk
Copy link
Contributor

@emk emk commented Dec 13, 2014

Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust.

This patch is based on a discussion with @BurntSushi in #14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-)

As discussed in that issue, the existing at and name functions represent two different results with the empty string:

  1. Matched the empty string.
  2. Did not match anything.

Consider the following example. This regex has two named matched groups, key and value. value is optional:

// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");

We can access value using caps.name("value"), but there's no way for us to distinguish between the "foo" and "foo;v=" cases.

Early this year, @BurntSushi recommended modifying the existing at and name functions to return Option, instead of adding new functions to the API.

This is a [breaking-change], but the fix is easy:

  • refs.at(1) becomes refs.at(1).unwrap_or("").
  • refs.name(name) becomes refs.name(name).unwrap_or("").

@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see CONTRIBUTING.md for more information.

@BurntSushi
Copy link
Member

@emk Oh yay! LGTM. :-)

@emk
Copy link
Contributor Author

emk commented Dec 14, 2014

Thank you for the quick response! Bors claims that this patch is now "STALE", and GitHub says there are "merge conflicts that must be resolved". What should I do next?

@alexcrichton
Copy link
Member

@emk if you run git fetch --all && git rebase origin/master and then force push the result you should be good to go.

@emk
Copy link
Contributor Author

emk commented Dec 14, 2014

The fix is to replace this:

<<<<<<< HEAD
        let text = re.replace_all(text, |&mut: refs: &Captures| -> String {
            let (pre, name) = (refs.at(1), refs.at(2));
=======
        let text = re.replace_all(text, |refs: &Captures| -> String {
            let pre = refs.at(1).unwrap_or("");
            let name = refs.at(2).unwrap_or("");
>>>>>>> regex_at_name_opt

…with this:

        let text = re.replace_all(text, |&mut: refs: &Captures| -> String {
            let pre = refs.at(1).unwrap_or("");
            let name = refs.at(2).unwrap_or("");

I'm happy to submit additional patches to resolve this conflict if somebody can point me towards the policy for merge conflicts an PRs. Thank you!

@emk
Copy link
Contributor Author

emk commented Dec 14, 2014

Oh, sorry. You beat me to the post. I'm currently re-running make check to verify the updated patch, and I'll force-push it tomorrow morning.

Closes rust-lang#14602.  As discussed in that issue, the existing `at` and `name`
functions represent two different results with the empty string:

1. Matched the empty string.
2. Did not match anything.

Consider the following example.  This regex has two named matched
groups, `key` and `value`. `value` is optional:

```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```

We can access `value` using `caps.name("value")`, but there's no way for
us to distinguish between the `"foo"` and `"foo;v="` cases.

Early this year, @BurntSushi recommended modifying the existing `at` and
`name` functions to return `Option`, instead of adding new functions to
the API.

This is a [breaking-change], but the fix is easy:

- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
@emk emk force-pushed the regex_at_name_opt branch from 8be3d88 to c2b0d7d Compare December 14, 2014 14:01
@emk
Copy link
Contributor Author

emk commented Dec 14, 2014

OK, let's try this again. I've resolved the merge conflict. I've also fixed several doc tests that I overlooked the first time around.

The doc tests were broken in the original patch because I ended up running make check piecemeal to work around issues with Valgrind, and I somehow missed the doctests. I've now figured out how to disable Valgrind, and I've managed to run the doctests as follows:

make check
# Fix broken doctests
make check-docs NO_REBUILD=1

The check-docs succeeded, but if check does anything important after check-docs, I haven't run it yet. So I'm running another full make check right now, but it's going to take several hours.

Anyway, thank you for helping a newbie figure out how to produce a competent Rust patch!

@BurntSushi
Copy link
Member

@emk I've long since forgotten the incantation, but I'm pretty sure you can run tests for a specific crate. You shouldn't have to run everything.

@BurntSushi
Copy link
Member

@emk The docs here look promising: https://github.com/rust-lang/rust/blob/master/Makefile.in#L11

@emk
Copy link
Contributor Author

emk commented Dec 14, 2014

Thank you for your help! I've already run all the per-crate tests and the doc tests several times, plus the runpass, etc. So this patch should be basically good.

For the future, I'll keep trying to run a full make check on my system. Right now, it's failing in DWARF debugsym tests that call into gdb. I probably need a new gdb version of something.

bors added a commit that referenced this pull request Dec 14, 2014
Modify `regex::Captures::{at,name}` to return `Option`

Reviewed-by: alexcrichton
brson added a commit to brson/rust that referenced this pull request Dec 16, 2014
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust.

This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-)

As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string:

1. Matched the empty string.
2. Did not match anything.

Consider the following example.  This regex has two named matched groups, `key` and `value`. `value` is optional:

```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```

We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases.

Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API.

This is a [breaking-change], but the fix is easy:

- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Dec 17, 2014
Hello! This is my first Rust patch, and I fear that I've probably skipped at least 7 critical steps. I'd appreciate your feedback and advice about how to contribute to Rust.

This patch is based on a discussion with @BurntSushi in rust-lang#14602 a while back. I'm happy to revise it as needed to fit into the modern world. :-)

As discussed in that issue, the existing `at` and `name` functions represent two different results with the empty string:

1. Matched the empty string.
2. Did not match anything.

Consider the following example.  This regex has two named matched groups, `key` and `value`. `value` is optional:

```rust
// Matches "foo", "foo;v=bar" and "foo;v=".
regex!(r"(?P<key>[a-z]+)(;v=(?P<value>[a-z]*))?");
```

We can access `value` using `caps.name("value")`, but there's no way for us to distinguish between the `"foo"` and `"foo;v="` cases.

Early this year, @BurntSushi recommended modifying the existing `at` and `name` functions to return `Option`, instead of adding new functions to the API.

This is a [breaking-change], but the fix is easy:

- `refs.at(1)` becomes `refs.at(1).unwrap_or("")`.
- `refs.name(name)` becomes `refs.name(name).unwrap_or("")`.
@bors bors merged commit c2b0d7d into rust-lang:master Dec 17, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants