Move backtracking into `OneOf` #108

stephencelis · 2022-02-08T19:01:18Z

This PR significantly changes how backtracking is done in the library, hopefully for the better!

All backtracking work has been removed from individual operators (with the exception of Peek and Not)
Backtracking has been inserted in between each step of OneOf

Up till now it's been the responsibility of the parser to "be a good citizen" and clean up after itself on failure. This means if a parser failed to do so, you could get into a bad state or make a bad assumption.

By centralizing this work in OneOf, most parsers are freed of this responsibility and no longer have to worry about undoing changes to input. That is now simply be the responsibility of OneOf instead.

tgrapperon · 2022-02-08T19:33:37Z

If I understand correctly, this is a huge change of behavior for end-users of the library. Until now, the input was freely available to reuse somewhere else if parsing did fail, but it can now be partially consumed. Issues can be caught with appropriate tests, but many may trip on this otherwise. Aside a prominent documentation/communication about the change, maybe there's a safer way to handle the transition? For example by adding a .parse() overload or new variant that won't consume the input?

stephencelis · 2022-02-08T19:51:59Z

@tgrapperon Thanks for the feedback!

If I understand correctly, this is a huge change of behavior for end-users of the library.

We do realize the behavior change is significant, but believe the impact to be small and we will call it out in the release notes.

Of all the parsers we've written we only found one to have a problem with this change, and that's the XCTest log parser in our benchmarks. The problem was that a parser was written in a very strange way that assumed a compactMap would undo its work. The fix itself was an improvement in readability:

https://github.com/pointfreeco/swift-parsing/pull/108/files#diff-27f107d6fe057661c029f42de5d29456e46036fff3acee94ee831d337ec1ec31

Until now, the input was freely available to reuse somewhere else if parsing did fail, but it can now be partially consumed.

While this was true of many of the parsers in the library, there has never been any guarantee that a parser defined outside of the library would undo its work, nor have we documented that parsers undo their work (all documentation around backtracking is additive in this PR). In the end, expecting a parser to undo upon failure isn't something we could ever truly trust it to do. Moving this logic to only OneOf naturally falls out of that.

Aside a prominent documentation/communication about the change, maybe there's a safer way to handle the transition? For example by adding a .parse() overload or new variant that won't consume the input?

We could keep the other backtracking behavior in for now with the plan to remove it in the future, but that feels like we're just kicking the can down the road and will have the same problem whenever we do change the behavior. We'd like to rip the band-aid off early.

Would you like to run any parsers you've written on main to see if anything unexpected happens?

tgrapperon · 2022-02-08T20:16:48Z

Thanks for the response @stephencelis! I agree. I was just worried by the transition, not the behavior. Fortunately, the library often works on derived types like Substring's or their views, and it is very probable that the user created the input's value specifically for the parser.
Maybe adding a - Note: or - Warning: in the final parser() documentation would help the information to stand out.

I have a case where I'm trying to parse a String with a parser, and if fails, I send it in another function that is using ML and a more fuzzy approach. But I just checked and the "Substring protection" did kick-in here, as the ML function accepts a String argument (and the parser even works with utf8 views).

I wrote very few parsers with an explicit parse function myself, and I would always backtrack completely if some internal parser did failed in this case, so I guess there won't be behavioral issues. My concern was really about top-level uses, but it turns out I was probably over-worried!

This is a good change overall, and I can't wait for what's coming in the next weeks/months!

mackoj · 2022-02-08T21:03:04Z

I did test it on our project and it did perform as expected 👍🏽

randomeizer · 2022-02-08T23:24:50Z

In general I think this is fine. One additional "good citizen" might be Many, particularly if it fails to match the terminator, but I think it is already built that way.

randomeizer · 2022-02-08T23:31:26Z

Might be worth adding relevant documentation about Many (and Not/Peek?) also backtracking? Particularly if it affects performance.

stephencelis and others added 8 commits January 25, 2022 15:15

wip

fb99c56

Merge remote-tracking branch 'origin/main' into one-of-backtracking

c87c120

wip

c2f060f

wip

079a113

docs

db89f3a

readme

7a6031b

format

2bda77a

readme

042254e

stephencelis requested a review from mbrandonw February 8, 2022 19:01

stephencelis added 2 commits February 8, 2022 14:05

fix

747ede4

wip

cfcc40c

mbrandonw approved these changes Feb 8, 2022

View reviewed changes

stephencelis merged commit 819f9d2 into main Feb 8, 2022

stephencelis deleted the one-of-backtracking branch February 8, 2022 19:19

mbrandonw mentioned this pull request Feb 24, 2022

Optionally should backtrack when failing #148

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Move backtracking into `OneOf` #108

Move backtracking into `OneOf` #108

Uh oh!

stephencelis commented Feb 8, 2022

Uh oh!

tgrapperon commented Feb 8, 2022

Uh oh!

stephencelis commented Feb 8, 2022

Uh oh!

tgrapperon commented Feb 8, 2022 •

edited

Loading

Uh oh!

mackoj commented Feb 8, 2022

Uh oh!

randomeizer commented Feb 8, 2022 •

edited

Loading

Uh oh!

randomeizer commented Feb 8, 2022 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Move backtracking into OneOf #108

Move backtracking into OneOf #108

Uh oh!

Conversation

stephencelis commented Feb 8, 2022

Uh oh!

tgrapperon commented Feb 8, 2022

Uh oh!

stephencelis commented Feb 8, 2022

Uh oh!

tgrapperon commented Feb 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mackoj commented Feb 8, 2022

Uh oh!

randomeizer commented Feb 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

randomeizer commented Feb 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Move backtracking into `OneOf` #108

Move backtracking into `OneOf` #108

tgrapperon commented Feb 8, 2022 •

edited

Loading

randomeizer commented Feb 8, 2022 •

edited

Loading

randomeizer commented Feb 8, 2022 •

edited

Loading