Skip to content

Conversation

@randomeizer
Copy link
Contributor

@randomeizer randomeizer commented Feb 22, 2022

This is an implementation of the ideas posted in #144.

The basic concept with this PR is to reverse the building order of the final input when printing.

In the current implementation, the input is built up from the beginning to the end of the parser chain. Each subsequent parser appends its value to the end of the input.

The chief consequence of this is that you can build Parsers which throw an error when calling parse, but will not throw an error when calling print. For example:

let hello = Parse {
  "Hello, "
  Rest()
  "!"
}

try hello.parse("Hello, world!") // throws an error because Rest consumed the remaining input before "!"
try hello.print("world") // succeeds because `Rest` has no idea that more parsing will occur.

This is obviously a buggy parser, but the fact that they are not commutable is a concern. Both should fail the same way.

Another case is Not, which checks the current input to see if it matches, if so, throws an error. Otherwise parsing continues. Either way, no input is consumed. Here's an example:

Currently, Not would have to be implemented like so:

extension Not: Printer {
  @inlinable
  public func print(_ output: Void, to input: inout Upstream.Input) throws {
    // umm, nothing to do here...
  }
}
let uncommentedLine = Parse {
  Not { "//" }
  Prefix { $0 != "\n" }
}

try uncommentedLine.parse("let foo = true") // no problem
try uncommentedLine.parse("// let foo = true") // throws an error
try uncommentedLine.print("let foo = true") // no problem
try uncommentedLine.print("// let foo = true") // also no problem. WRONG!

The solution is to build up the input in reverse, so that each print can check that the same expectations for the parse still hold.

This has two main changes in how print(_:to:) works.

  1. It must "prepend", not "append" to the input.
  2. Due to this, the passed-in input will contain all subsequent print values, and can be checked for correctness.

To fix our uncommentedLine Parser, we need to tweak Parsers.ZipVO:

extension Parsers.ZipVO: Printer
where
  P0: Printer,
  P1: Printer,
  P0.Input == P1.Input,
  P0.Output == Void
{
  @inlinable public func print(
    _ output: P1.Output,
    to input: inout P0.Input
  ) rethrows {
    try p1.print(output, to: &input)
    try p0.print(to: &input)
  }
}

It now prints p1 first, then p0.

Now, the Prefix will get printed into the input before the Not receives it, and we can do this:

extension Not: Printer {
  @inlinable
  public func print(_ output: Void, to input: inout Upstream.Input) throws {
    do {
      var i = input
      _ = try upstream.parse(&i)
    } catch {
        return
    }
    //      throw PrintingError.expected("not to be \(result)", at: i)
    throw PrintingError()
  }
}

Now, this line will also fail, same as when it is parsed:

try uncommentedLine.print("// let foo = true") // now fails!

As a side-effect of this change, the following is now also possible:

var input = "42 George St"[...]
let parser = Int.parser()

let number = parser.parse(&input)
parser.print(number, to: &input)

input == "42 George St"[...] // true!

This was not possible with the append approach, indicating that prepending is necessary for true commutability.

Thank you for attending my Ted Talk.

Consequentially, all parsers that compose multiple parsers work backwards through provided parsers, and the `input` is built up from the back to the front.

This allows print to check that the output is correct for cases like `Rest`, `Prefix`, `Not`, and `Peek`.
@randomeizer
Copy link
Contributor Author

randomeizer commented Feb 22, 2022

A couple of notes:

Currently there is one failing test: PeekTests.testPrintSkippedPeekFails().

I've left it in because I haven't been able to find a way to have a Skip { Peek {...} } compile without shortcutting it completely by using .printing(""). This allows bad printing.

I'm not sure that it's possible to actually Skip > Peek and make it correctly printable, because Peek outputs a non-Void value, and requires that value to check the current input matches that value.

As such, I'm going to revive my pitch for Expect. It would be the inverse of the Not ParserPrinter, basically, and would call the upstream Parser against the input and fail the print if it doesn't match. That can be built, compiles, and parses/prints correctly.

I haven't checked it in currently, but am happy to do so.

@mbrandonw
Copy link
Member

Hey @randomeizer, Stephen is traveling right now so may not have time to reply, but we discussed this morning and what you have found is definitely interesting! 🙂

The reversal of zip mimics how OneOf works, which tries the printers from last to first:

@inlinable public func print(_ output: P0.Output, to input: inout P0.Input) rethrows {
let original = input
do { try self.p1.print(output, to: &input) } catch let e1 {
do { input = original; try self.p0.print(output, to: &input) } catch let e0 {
throw PrintingError.manyFailed(
[e1, e0], at: input
)
}
}
}

So it definitely seems like a viable thing to do. And further, as you have noted, by reversing the order printers get an opportunity to better see the full picture of what is being printed in order to implement its logic, just like parsers. There are even some existing parser/printers that could take advantage of this, like End:

extension End: Printer {
  @inlinable
  public func print(_ output: Void, to input: inout Input) throws {
    guard input.isEmpty
    else { throw PrintingError() }
  }
}

Which would making printing non-sensical things like this fail even though currently it passes on printer-throws-2:

ParsePrint { "Hello"; End(); "World" }.print(())

So, suffice to say, we've got some more thinking to do! We're going to play around with this for a bit more and see where it takes us. Thanks for brining it up!

@mbrandonw
Copy link
Member

Just wanted to relay a few discussion points @stephencelis and I had this morning about this PR.

  • First, and we were already thinking about this and now this PR makes it even more obvious, but the library should probably discourage using print(_:to:) directly due to its non-intuitive behavior of prepending. The method should only be used in Printer conformances, and for regular printing tasks we should have a print(_:) overload that uses EmptyInitializable to start off the input for printing (maybe we already do). The same goes for parse(_:) that uses inout.

  • We are going to take a slight performance hit by using prepending instead of appending. I think we'll just have to accept it, but good to be aware of. For example, this benchmark shows a 3x slowdown:

    image

@randomeizer
Copy link
Contributor Author

Yeah, the prepending nature of input is not exactly intuitive. I considered changing the function signature to print(_:before:) to make it clearer when reading, but didn't want to disrupt the code even further while investigating.

The performance penalty is unfortunate. It's a pity we can't actually append then just flip the order at the end of the print, once. I suspect it would be a faster operation. Might be possible if print(_) is only used at the top level I guess, but I suspect it would make writing printers even more mind-bending...

@randomeizer
Copy link
Contributor Author

If print(_:to:) is becoming primarily an implementation conformance, perhaps a clearer name is prefix(_:to:), and keep print(_:) as the "normal" function? Makes it inconsistent with parse though, although maybe that should change also if it's getting "internalised".

@randomeizer
Copy link
Contributor Author

Regarding prepending, it might be faster to create a new EmptyInitializable and append both the prefix and the existing input to it, then assign it to the inout input, where appropriate. Not sure it would work in all situations though, and there is no doubt a cost to essentially recreating the whole input every time it prepends.

@mbrandonw
Copy link
Member

Regarding prepending, it might be faster to create a new EmptyInitializable and append both the prefix and the existing input to it, then assign it to the inout input, where appropriate. Not sure it would work in all situations though, and there is no doubt a cost to essentially recreating the whole input every time it prepends.

Yeah I think that's the way to go. Creating an empty value and then appending in reverse order is only slightly slower than just appending.

Also, my benchmark above is not great because count is O(n) and the vast majority of the time spent in that benchmark is computing the count. So when you remove that, prepending is more like 150x slower 😬:

Screen Shot 2022-02-22 at 10 08 37 AM

They now append into a copy of themselves, then append the `input`.
@randomeizer
Copy link
Contributor Author

I've just pushed an update to Literal which flips the order and uses append instead of insert. Haven't done any benchmarks on it yet.

The implementations of PrependingCollection would also need optimisation, but I'm going to have to park it there today.

@mbrandonw
Copy link
Member

Yeah, I think most printer implementations won't have to think about this detail because they will usually write their conformances against PrependableCollection. So there may only be a few cases we have to do the "create empty and append" dance, such as the PrependableCollection conformances and the literal parsers.

Copy link
Member

@stephencelis stephencelis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great finds! Thanks! We're going to merge this into the main printer branch now.

@stephencelis stephencelis merged commit fcb318f into pointfreeco:printer-throws-2 Mar 2, 2022
@randomeizer randomeizer deleted the printer-throws-2-prepend branch March 3, 2022 05:09
mbrandonw added a commit that referenced this pull request Apr 11, 2022
* wip

* don't backtrack

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* rank one of errors

* rank errors by most processed

* nested

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* from

* wip

* wip

* wip

* Update benches

* wip

* idiomatic

* wip

* wip

* wip

* truncate

* wip

* wip

* wip

* wip

* clean up

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* Some docs and update tests to use public conversion

* removed testable imports

* lots of docs

* wip

* wip

* wip

* many terminator failure

* wip

* add loop error

* force MapConversion to work only on printers

* bye bye exactly

* wip

* wip

* Added reversion to original if any parsers in `Zip` variations throw an error (#107)

* Added reversion to original if any parsers in `Zip` variations throw an error.

* Updated ParserBuilderTests to check correct consumption of the input

* wip

* fix

* wip

* wip

* wip

* wip

* wip

* wip

* fix

* wip

* wip

* wip

* wip

* Bump package tools version

* wip

* fix

* backtrack printing

* wip

* wip

* wip

* wip

* wip

* New VoidMap parser and updated race parser.

* wip

* pullback

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* docs

* wip

* wip

* wip

* wip

* wip

* wip

* wiop

* print tests for PrefixUpTo/Through

* Printer prepending instead of appending (#145)

* Pointing `swift-custom-dump` to the "main" branch, since `0.4.0` hasn't been released publicly yet.

* Switched printing to `prepend` rather than `append`

Consequentially, all parsers that compose multiple parsers work backwards through provided parsers, and the `input` is built up from the back to the front.

This allows print to check that the output is correct for cases like `Rest`, `Prefix`, `Not`, and `Peek`.

* Added `Many` tests, fixed terminator bug

* Whitespace cleanup

* Fixed typo in test name.

* Updated `End` to check input when printing.

Added related tests.

* Switched `Literal` printing to append rather than insert

They now append into a copy of themselves, then append the `input`.

* Updated `PrependableCollection` to use `append` internally

* Improved error output for OneOfMany

* Adds backtracking to `Optionally`

* Added unit test to check backtracking

* Updated the `testBacktracking` case

Makes it clearer under what circumstances it will fail without backtracking.

* Added backtracking to Optionally

Also updated documentation about backtracking in general.

* Removed incorrectly merged test case.

* Added `Expect`

Which can be both parsed and printed.

* Made the unexpected `Peek` success a passing test

Added notes about why it doesn't throw an error.

* wip

* wip

* wip

* wip

* wip

* Update README.md

Co-authored-by: Brandon Williams <[email protected]>

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* fixes

Co-authored-by: Stephen Celis <[email protected]>
Co-authored-by: Stephen Celis <[email protected]>
Co-authored-by: Brandon Williams <[email protected]>

* wip

* wip

* Updates Optionally to throw any internal `wrapped.print(...)` error (#167)

* Pointing `swift-custom-dump` to the "main" branch, since `0.4.0` hasn't been released publicly yet.

* Made Optionally fail if the wrapped parser fails

Added test cases for printing, and example of failing an invalid print.

* Remove `.map` overload on Always

* wip

* wip

* Make Rest.print fail on empty output.

* roundtripping doc

* wip

* struct conversion docs

* rename .struct to .memberwise

* fixes

* lots of docs

* docs

* wip

* wip

* Revert "wip"

This reverts commit cde658d.

* Printer -> ParserPrinter

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* basic errors

* wip

* wip

* fix

* Simplify `Consumed`

* wip

* wip

* wip

* decumulator

* wip

* wip

* wip

* wip

* wip

* wip

* clean up

* wip

* wip

* wip

* wip

* wip

* wip

* fix

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* fix docc warnings

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* fix

Co-authored-by: Stephen Celis <[email protected]>
Co-authored-by: David Peterson <[email protected]>
Co-authored-by: Stephen Celis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants