From 562a746a7828016082d4d903c731ed3202153906 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 15 Nov 2023 23:06:48 +0100 Subject: [PATCH 01/20] cabal exact pritn --- proposals/0000-cabal-exact-printer.md | 51 +++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 proposals/0000-cabal-exact-printer.md diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md new file mode 100644 index 00000000..debc0f82 --- /dev/null +++ b/proposals/0000-cabal-exact-printer.md @@ -0,0 +1,51 @@ +# Community Project Template - place your title here + +_This template is for Haskell Foundation Technical Proposals that are community projects asking for support. +Community project proposals are requests that the Haskell Foundation allocate funding to a particular project or goal to be executed by community members._ + +_Please delete the italic text before submitting._ + + +## Abstract + +_This section should provide a summary of the proposal that identifies the key problems to be solved and summarizes the solution._ + +## Background + +_This section should explain any background (targeting a casual audience) needed to understand the proposal’s motivation (e.g. a high level overview of the technical details and some history)._ + +## Problem Statement + +_This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. +It should also enumerate the requirements against which a solution should be evaluated._ + +## Prior Art and Related Efforts + +_This section should describe prior attempts to solve the problem, other relevant prior work, and what others in the community are doing to address the problem. +It should describe the relationship between the proposed work and the existing efforts. 
+If past attempts did not succeed, this section should provide a theory of why not._ + +## Technical Content + +_This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). +It should also describe the benefits, drawbacks, and risks that are associated with these decisions. +It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach._ + +## Timeline + +_When will the project be completed? +What are the intermediate steps and intermediate concrete deliverables for the community?_ + +## Budget + +_How much money is needed to accomplish the goal? +How will it be used?_ + +## Stakeholders + +_Who stands to gain or lose from the implementation of this proposal? +Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups._ + +## Success + +_Under what conditions will the project be considered a success?_ From f5ef56c2b8cd79159852f3b5b56f4b7fab92bcb9 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 29 Nov 2023 22:58:14 +0100 Subject: [PATCH 02/20] exact printer --- proposals/0000-cabal-exact-printer.md | 164 +++++++++++++++++++++++--- 1 file changed, 148 insertions(+), 16 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index debc0f82..bfa36ce6 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -1,51 +1,183 @@ -# Community Project Template - place your title here - -_This template is for Haskell Foundation Technical Proposals that are community projects asking for support. 
-Community project proposals are requests that the Haskell Foundation allocate funding to a particular project or goal to be executed by community members._ - -_Please delete the italic text before submitting._ +# Community Project Cabal Exact Printer ## Abstract -_This section should provide a summary of the proposal that identifies the key problems to be solved and summarizes the solution._ +The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in Haskell, inspired by ghc-exactprint. This tool will enable byte-for-byte bidirectional parsing and printing, enhancing the functionality and usability of the Cabal package manager. The project addresses the need for accurate and efficient management of package descriptions, a crucial aspect for Haskell developers. ## Background -_This section should explain any background (targeting a casual audience) needed to understand the proposal’s motivation (e.g. a high level overview of the technical details and some history)._ +cabal is a build tool for haskell. +It directs GHC and deals with "packages". +Which are collections of haskell modules. +cabal allows you to depend on libraries, +publish libraries and manage build flags. + +In essence if you want to do something non-trivial +with haskell you want to use cabal. + +.cabal files are fundamental in Haskell programming, used to describe package structures. +Current methods of parsing and printing these files lack the precision needed +for certain tasks. +Such as generating bounds for dependencies, +formatting, +or adding missing modules from the manifest. + +The project is inspired by the capabilities of [ghc-exactprint](https://github.com/alanz/ghc-exactprint) and aims to bring similar +[functionalities](https://gitlab.haskell.org/ghc/ghc/-/wikis/api-annotations#in-tree-exact-printing-annotations) to Cabal. 
+It defines exact printing as follows: +> Taking an abstract syntax tree (AST) and converting it into a string that looks like what the user originally wrote +> is called exact-printing. An exact-printed program includes the original spacing and all comments. ## Problem Statement _This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. It should also enumerate the requirements against which a solution should be evaluated._ +This [issue](https://github.com/haskell/cabal/issues/7544) is stracked on the cabal bug tracker. + +Currently if you build a project with an extra module not listed in your cabal file, +ghc emits a warning: +``` +: error: [-Wmissing-home-modules, -Werror=missing-home-modules] + These modules are needed for compilation but not listed in your .cabal file's other-modules: + X +``` + +You'd say, why doesn't cabal just add this module to the cabal file? +Well, it can't. +cabal is currently only able to parse cabal files, +and print them back out in a mangled form via the pretty printer. + +you can see this mangling by running `cabal format` on a cabal file, +the issues I saw were: + +1. Delete all comments +2. merge any `common` stanza into wherever it was imported. +3. change line ordering +4. change spacing (although perhaps to be expected from a formatter) + +A similar problem occurs when HLS want's to do any modification to a cabal +file during development. +For example if a module was added or renamed, or if a (hidden) library is required. +Or perhaps some function used in a known library via hoogle for example. +HLS has no clue what to do, +because even if it links against the cabal library, +there is no function to create a cabal file +(lest it fucks it up) + ## Prior Art and Related Efforts -_This section should describe prior attempts to solve the problem, other relevant prior work, and what others in the community are doing to address the problem. 
-It should describe the relationship between the proposed work and the existing efforts. -If past attempts did not succeed, this section should provide a theory of why not._ +Previous attempts to address this problem have been fragmented, and no comprehensive solution has been developed. This project builds upon the ideas discussed in various issues over the past six years (e.g., Haskell/cabal issues #3614, #6621, #6187, #4965). The project will synthesize these discussions into a cohesive solution. + +I think prior art would be tools such as + +cabalfmt, hpack and autopack ## Technical Content -_This section should describe the work that is being proposed to the community for comment, including both technical aspects (choices of system architecture, integration with existing tools and workflows) and community governance (how the developed project will be administered, maintained, and otherwise cared for in the future). -It should also describe the benefits, drawbacks, and risks that are associated with these decisions. -It can be a good idea to describe alternative approaches here as well, and why the proposer prefers the current approach._ +This proposal want's to add a function to cabal: + +``` +printExact :: GenericPackageDescription -> Text +``` + +Which will do exact printing. +This function has the following properties: + +byte for byte roundtrip of all hacakgePackage: +``` + forall (hackagePackage :: ByteString) . (printExact <$> (parseGeneric hackagePackage)) == Right hackagePackage +``` + +where `hackagePackage` is a cabal package found on hackage. + + + +to support exact printing a new field is added to `GenericPackageDescritpion`: + +```haskell +data GenericPackageDescription { + ... 
+ , exactPrintMeta :: ExactPrintMeta + } +``` + +which in turn contains various meta data we need for exact printing: +```haskell +data ExactPrintMeta = ExactPrintMeta + { exactPositions :: Map [NameSpace] ExactPosition + , exactComments :: Map Position Text + } +``` +It's unclear what other fields are required right now. +For example build bounds require another map +like `Map ([NameSpace], PackageVersionConstraint) Original`, +this kind of representation allows us to retrieve the original only if it hasn't changed. +However initial inspection of the parser showed it's difficult to retrieve `PackageVersionConstraint` and `[NameSpace]` together, +because they're deeply nested within field grammars. +so perhaps an intrusive design is more easy for that, +it's unclear to me right now. +However it is clear these problems can be solved, it just takes time and effort. + +However peliminary testing shows that this approach works with multiple secrion cabal files. + +Pertubation of the `GenericPackageDescription` must be possible for ++ module addition/removal ++ library addition/removal + +The issue for addition is that you now have to invent exact positions. +for removal, if it involves a line, you've to fix up all following lines, +(and it has to know something was removed). + +### Partials + ++ modification to the `GenericPackageDescription`. + covering every conceivable modification would be tough. + +### Not included + ++ Any warnings during parsing won't be included (low value add) ++ Any integration in tools such as `cabal format` or `cabal gen-bounds`. + These tasks are relatively easy in comparison, however if we include this as an unused well tested library + function, it'll be easier to release. ## Timeline _When will the project be completed? What are the intermediate steps and intermediate concrete deliverables for the community?_ +I think the overall work is roughly 2 and a half week of fulltime work. +However this maybe executed over several months. 
+I'd expect the project to be completed by April 2024. + +I've a free week in decemeber for example, but then it'll be weekends and nights. + ## Budget -_How much money is needed to accomplish the goal? -How will it be used?_ +120 hours * 120 euro per hours is 14'400 euro. + +The money will be used to compensate for opportunity cost, +and allowing me, and hopefully others, +to justify taking on similar large projects in the future. ## Stakeholders _Who stands to gain or lose from the implementation of this proposal? Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups._ +The primary benificiaries would be HLS and cabal users. +we can: + ++ improve cabal usability, gen-bounds comes to mind. ++ improve HLS and cabal interaction. ++ + +An indirect loser maybe the stack build tool. +Which misses this direct investment, + +and cabal may drastically. + ## Success _Under what conditions will the project be considered a success?_ From bb5087d6048b3e27a23e5b3372429b311f6c50df Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Thu, 30 May 2024 00:20:50 +0200 Subject: [PATCH 03/20] add more text --- proposals/0000-cabal-exact-printer.md | 59 ++++++++++++++++++++------- 1 file changed, 45 insertions(+), 14 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index bfa36ce6..8d3ac309 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -3,11 +3,16 @@ ## Abstract -The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in Haskell, inspired by ghc-exactprint. This tool will enable byte-for-byte bidirectional parsing and printing, enhancing the functionality and usability of the Cabal package manager. 
The project addresses the need for accurate and efficient management of package descriptions, a crucial aspect for Haskell developers. +The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in Haskell, +inspired by ghc-exactprint. +This tool will enable byte-for-byte bidirectional parsing and printing, +enhancing the functionality and usability of the Cabal package manager. +The project addresses the need for accurate and efficient management of package descriptions, +a crucial aspect for Haskell developers. ## Background -cabal is a build tool for haskell. +Cabal is a build tool for haskell. It directs GHC and deals with "packages". Which are collections of haskell modules. cabal allows you to depend on libraries, @@ -34,6 +39,7 @@ It defines exact printing as follows: _This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. It should also enumerate the requirements against which a solution should be evaluated._ + This [issue](https://github.com/haskell/cabal/issues/7544) is stracked on the cabal bug tracker. Currently if you build a project with an extra module not listed in your cabal file, @@ -46,16 +52,38 @@ ghc emits a warning: You'd say, why doesn't cabal just add this module to the cabal file? Well, it can't. -cabal is currently only able to parse cabal files, -and print them back out in a mangled form via the pretty printer. - -you can see this mangling by running `cabal format` on a cabal file, -the issues I saw were: - -1. Delete all comments -2. merge any `common` stanza into wherever it was imported. -3. change line ordering -4. change spacing (although perhaps to be expected from a formatter) +Cabal is currently only able to parse Cabal files, +and print them back out in a mangled form. +There are other programs providing module detection, +but nothing is integrated in cabal itself. 
+ +This problem has been solved by the community, several times outside of cabal. +For example: +
+- [hpack](https://github.com/sol/hpack),
+- [autopack](https://github.com/kowainik/autopack)
+- [cabal-fmt](https://github.com/phadej/cabal-fmt)
+- [gild](https://taylor.fausak.me/2024/02/17/gild/)
+ +Of course many of these projects do more then just module expension. +hpack provides a completly different cabal file layout for exampe, +cabal-fmt and gild are formatters for cabal file. +Only auto-pack just does this one feature. +However, since all these programs implement this functionality, +there is clearly demand for it. + +There are more issues then just module expension however. +For example [cabal gen-bounds](https://github.com/haskell/cabal/issues/7304) could modify a cabal file in place, +with [cabal edit](https://github.com/haskell/cabal/issues/7337) we could add a dependency via cli, +[cabal init](https://github.com/haskell/cabal/issues/6187) could be simplified. + +The current implementation of writing, cabal format, has the following issues: + +1. Deletes all comments +2. merge any `common` stanza into wherever it was imported. https://github.com/haskell/cabal/issues/5734 +3. changes line ordering. +4. changes spacing (although perhaps to be expected from a formatter) A similar problem occurs when HLS want's to do any modification to a cabal file during development. @@ -63,8 +91,11 @@ For example if a module was added or renamed, or if a (hidden) library is requir Or perhaps some function used in a known library via hoogle for example. HLS has no clue what to do, because even if it links against the cabal library, -there is no function to create a cabal file -(lest it fucks it up) +there is no function to modify a generic cabal representation and print a cabal file that keeps +it similar to the users'. + +The goal is to make non invasive changes. +This tech proposal therefore aims to address all these issues. 
## Prior Art and Related Efforts From 2a6f04dc2f00f728334f5bf35b1e2c070721bd88 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Sun, 23 Jun 2024 23:56:38 +0200 Subject: [PATCH 04/20] clear out draft --- proposals/0000-cabal-exact-printer.md | 80 +++++++++++++++++---------- 1 file changed, 51 insertions(+), 29 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 8d3ac309..b9408282 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -5,10 +5,12 @@ The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in Haskell, inspired by ghc-exactprint. -This tool will enable byte-for-byte bidirectional parsing and printing, -enhancing the functionality and usability of the Cabal package manager. -The project addresses the need for accurate and efficient management of package descriptions, -a crucial aspect for Haskell developers. +This tool will enable byte-for-byte bidirectional parsing and printing. +Which will allow both cabal and other tools to modify cabal files without +mangling the format, structure or comments of users. + +The overal goal would be to roundtrip 99% of all hackage packages. + ## Background @@ -36,10 +38,6 @@ It defines exact printing as follows: ## Problem Statement -_This section should describe the problem that the proposal intends to solve and how solving the problem will benefit the Haskell community. -It should also enumerate the requirements against which a solution should be evaluated._ - - This [issue](https://github.com/haskell/cabal/issues/7544) is stracked on the cabal bug tracker. Currently if you build a project with an extra module not listed in your cabal file, @@ -70,7 +68,8 @@ Of course many of these projects do more then just module expension. hpack provides a completly different cabal file layout for exampe, cabal-fmt and gild are formatters for cabal file. 
Only auto-pack just does this one feature. -However, since all these programs implement this functionality, +However, since all these programs implement this functionality +in their own distinct way, there is clearly demand for it. There are more issues then just module expension however. @@ -101,9 +100,28 @@ This tech proposal therefore aims to address all these issues. Previous attempts to address this problem have been fragmented, and no comprehensive solution has been developed. This project builds upon the ideas discussed in various issues over the past six years (e.g., Haskell/cabal issues #3614, #6621, #6187, #4965). The project will synthesize these discussions into a cohesive solution. -I think prior art would be tools such as +I think prior art would be tools such as cabalfmt, hpack and autopack + +Previous attempts were [abandoned](https://github.com/haskell/cabal/pull/7626). +Or they revolved around creating a seperate AST[^ast], which was against maintainer recommendation, +and then [abandoned](https://github.com/haskell/cabal/pull/9385). + +A related effort is to build combinators that allow modifyng the `Field` type directly. +This would depracate the GenericPackage structure and make an alternative structure +available. +A proof of concept was developed during zurich hack +https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/9?u=jappie + +I suppose the idea is to completly replace `GenericPackageDescription` with +this `Field` type. +Which is a significant effort, +however this work can be amend the exact print effort, +because better modification of cabal files would be appreciated. +Exact printing is mostly a module that takes some input type and then +does the formatting. -cabalfmt, hpack and autopack +Furthermore the test suite created by the exact print effort this module +describes can also be used in the related `GenericPackageDescription` to `Field` effort. 
## Technical Content @@ -124,7 +142,6 @@ byte for byte roundtrip of all hacakgePackage: where `hackagePackage` is a cabal package found on hackage. - to support exact printing a new field is added to `GenericPackageDescritpion`: ```haskell @@ -161,6 +178,8 @@ The issue for addition is that you now have to invent exact positions. for removal, if it involves a line, you've to fix up all following lines, (and it has to know something was removed). +The overal goal would be to roundtrip 99% of all hackage packages. + ### Partials + modification to the `GenericPackageDescription`. @@ -175,18 +194,24 @@ for removal, if it involves a line, you've to fix up all following lines, ## Timeline -_When will the project be completed? -What are the intermediate steps and intermediate concrete deliverables for the community?_ +I expect that after this project is approved it'd take roughly 4 months in total to complete. +This would be about 2 'man' months, however there maybe random hickups. +Most work will be done by either me or one of my employees @Riuga. + +Intermediate steps are the listed tests in the linked PR, +then after all those pass we'll move onto a full hackage run, +and sift out more . + +We can easily track progress via the tests being finished. -I think the overall work is roughly 2 and a half week of fulltime work. -However this maybe executed over several months. I'd expect the project to be completed by April 2024. I've a free week in decemeber for example, but then it'll be weekends and nights. ## Budget -120 hours * 120 euro per hours is 14'400 euro. +I think this should be around 15'000 euro to complete, +considering the size of the overal work. The money will be used to compensate for opportunity cost, and allowing me, and hopefully others, @@ -194,21 +219,18 @@ to justify taking on similar large projects in the future. ## Stakeholders -_Who stands to gain or lose from the implementation of this proposal? 
-Proposals should identify stakeholders so that they can be contacted for input, and a final decision should not occur without having made a good-faith effort to solicit representative feedback from important stakeholder groups._ - -The primary benificiaries would be HLS and cabal users. +The primary benificiaries would be cabal users. we can: -+ improve cabal usability, gen-bounds comes to mind. -+ improve HLS and cabal interaction. -+ - -An indirect loser maybe the stack build tool. -Which misses this direct investment, ++ opens up the possibility to improve cabal user experience, via inserting modules or running gen-bounds. ++ makes it easier for cabal library to deal with cabal files, such as hpack, HLS and cabal-fmt ++ makes maintaining cabal itself easier, eg cabal init could be described via GenericPackageDesription -and cabal may drastically. +Furthermore I've heard that the HLS project will benefit greatly from this effort, +(TODO where?) ## Success -_Under what conditions will the project be considered a success?_ +This proposal successful once the cabal exact print branch is merged into cabal proper with the provided tests passing. +Most of hackage should be exact printable, say 99%, +which means, an existing hackage cabal file is being parsed, and then printed again resulting into the same output as input. 
From 9367c7b521d61c4abd4063a24824cc833fb87157 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Tue, 16 Jul 2024 21:04:53 -0400 Subject: [PATCH 05/20] Add intro --- proposals/0000-cabal-exact-printer.md | 151 +++++++++++++++----------- 1 file changed, 90 insertions(+), 61 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index b9408282..b799bed8 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -2,29 +2,36 @@ ## Abstract - -The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in Haskell, -inspired by ghc-exactprint. -This tool will enable byte-for-byte bidirectional parsing and printing. +The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in the cabal library. +This will enable byte-for-byte bidirectional parsing and printing. Which will allow both cabal and other tools to modify cabal files without mangling the format, structure or comments of users. -The overal goal would be to roundtrip 99% of all hackage packages. - - ## Background +This is the formal application of my [blogpost](https://jappie.me/cabal-exact-printing.html). +Ironically I started on this proposal first, +but then I got cold feet and decided to write a "lower stake" informal blogpost instead. +Writing this proposal has been difficult. +I've no idea why this is so hard for me. +I think it's partly because there is no going back after opening the proposal. +and I'm mostly just making these numbers and timelines up y'know. +I've no idea for it'll take 2 months, but it seems reasonable. +I've no idea if there is even budget for this, but asking 10k for a man month +seems reasonable as well. +Getting this accepted and solved won't make me rich, +but it could make me happy. +It'll solve a pain point bringing my work life closer to epicurean atraxia. Cabal is a build tool for haskell. 
It directs GHC and deals with "packages". Which are collections of haskell modules. cabal allows you to depend on libraries, publish libraries and manage build flags. - In essence if you want to do something non-trivial with haskell you want to use cabal. -.cabal files are fundamental in Haskell programming, used to describe package structures. -Current methods of parsing and printing these files lack the precision needed +`.cabal` files are used to describe package structures. +The current implementation for printing these files lacks the precision needed for certain tasks. Such as generating bounds for dependencies, formatting, @@ -37,9 +44,6 @@ It defines exact printing as follows: > is called exact-printing. An exact-printed program includes the original spacing and all comments. ## Problem Statement - -This [issue](https://github.com/haskell/cabal/issues/7544) is stracked on the cabal bug tracker. - Currently if you build a project with an extra module not listed in your cabal file, ghc emits a warning: ``` @@ -51,10 +55,10 @@ ghc emits a warning: You'd say, why doesn't cabal just add this module to the cabal file? Well, it can't. Cabal is currently only able to parse Cabal files, -and print them back out in a mangled form. +it can print them back out, but in a mangled form. + There are other programs providing module detection, but nothing is integrated in cabal itself. - This problem has been solved by the community, several times outside of cabal. For example:
    @@ -66,23 +70,25 @@ For example: Of course many of these projects do more then just module expension. hpack provides a completly different cabal file layout for exampe, -cabal-fmt and gild are formatters for cabal file. +`cabal-fmt` and `gild` are formatters for cabal files. Only auto-pack just does this one feature. However, since all these programs implement this functionality in their own distinct way, there is clearly demand for it. -There are more issues then just module expension however. +There are more issues then just module expansion however. For example [cabal gen-bounds](https://github.com/haskell/cabal/issues/7304) could modify a cabal file in place, with [cabal edit](https://github.com/haskell/cabal/issues/7337) we could add a dependency via cli, [cabal init](https://github.com/haskell/cabal/issues/6187) could be simplified. -The current implementation of writing, cabal format, has the following issues: +The current implementation of printing in cabal via the `cabal format` command, has the following issues: + +1. It Deletes all comments [^1] +2. It merges any `common` stanza into wherever it was imported. https://github.com/haskell/cabal/issues/5734 +3. It changes line ordering. +4. It changes spacing (although perhaps to be expected from a formatter) -1. Deletes all comments -2. merge any `common` stanza into wherever it was imported. https://github.com/haskell/cabal/issues/5734 -3. changes line ordering. -4. changes spacing (although perhaps to be expected from a formatter) +[^1]: Andreas helped me solve this on zurich hack. A similar problem occurs when HLS want's to do any modification to a cabal file during development. @@ -93,21 +99,29 @@ because even if it links against the cabal library, there is no function to modify a generic cabal representation and print a cabal file that keeps it similar to the users'. -The goal is to make non invasive changes. +The goal is to make non invasive changes to cabal files. 
This tech proposal therefore aims to address all these issues. +Furthermore by bringing it directly into cabal we can enforce the round tripping +property. +This will ensure clients of the cabal library can print cabal files more +easily and have some stability guarantees. ## Prior Art and Related Efforts -Previous attempts to address this problem have been fragmented, and no comprehensive solution has been developed. This project builds upon the ideas discussed in various issues over the past six years (e.g., Haskell/cabal issues #3614, #6621, #6187, #4965). The project will synthesize these discussions into a cohesive solution. +This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. +Essentially this proposal attempts to "solve" that issue. +As can be seen in the issue, there have been previous attempts, +previous attempts to address this problem have been fragmented, +and no comprehensive solution has been developed. -I think prior art would be tools such as cabalfmt, hpack and autopack - -Previous attempts were [abandoned](https://github.com/haskell/cabal/pull/7626). -Or they revolved around creating a seperate AST[^ast], which was against maintainer recommendation, +Previous attempts for making this directly into cabal were [abandoned](https://github.com/haskell/cabal/pull/7626). +I guess they got demotivated by the shear size of the effort, +or they revolved around creating a seperate AST[^ast], +which was against maintainer recommendation (because it'd make the issue even bigger), and then [abandoned](https://github.com/haskell/cabal/pull/9385). A related effort is to build combinators that allow modifyng the `Field` type directly. -This would depracate the GenericPackage structure and make an alternative structure +This would depracate the `GenericPackage` structure and make an alternative structure available. 
A proof of concept was developed during zurich hack https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/9?u=jappie @@ -115,17 +129,22 @@ https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/9?u=jappie I suppose the idea is to completly replace `GenericPackageDescription` with this `Field` type. Which is a significant effort, -however this work can be amend the exact print effort, +however this work can be added to the exact print effort, because better modification of cabal files would be appreciated. Exact printing is mostly a module that takes some input type and then does the formatting. +Unlike the `Field` effort, +this proposal only focusess only on getting exact printing +to work with a minimal footprint. +We don't want to do any additional refactoring. Furthermore the test suite created by the exact print effort this module describes can also be used in the related `GenericPackageDescription` to `Field` effort. -## Technical Content -This proposal want's to add a function to cabal: + +## Technical Content +This proposal want's to add a function to the cabal library: ``` printExact :: GenericPackageDescription -> Text @@ -140,8 +159,6 @@ byte for byte roundtrip of all hacakgePackage: ``` where `hackagePackage` is a cabal package found on hackage. - - to support exact printing a new field is added to `GenericPackageDescritpion`: ```haskell @@ -158,19 +175,23 @@ data ExactPrintMeta = ExactPrintMeta , exactComments :: Map Position Text } ``` + +For comments some parser modifications were needed which were developed during zurich hack 2024. +This was a big uncertainty which now has been addressed. + It's unclear what other fields are required right now. -For example build bounds require another map +For example build bounds might require another map like `Map ([NameSpace], PackageVersionConstraint) Original`, this kind of representation allows us to retrieve the original only if it hasn't changed. 
-However initial inspection of the parser showed it's difficult to retrieve `PackageVersionConstraint` and `[NameSpace]` together, -because they're deeply nested within field grammars. -so perhaps an intrusive design is more easy for that, -it's unclear to me right now. -However it is clear these problems can be solved, it just takes time and effort. +However I'm confident all of this is do-able. -However peliminary testing shows that this approach works with multiple secrion cabal files. +Peliminary testing shows that this approach works with multiple sections of cabal files, +and now also with some basic comments. +Comments likely have to be worked out further as we only made that one test pass, +but the 'tools', and more importantly, the understanding is in place. +There is a working prototype. -Pertubation of the `GenericPackageDescription` must be possible for +Change of the `GenericPackageDescription` must be possible for + module addition/removal + library addition/removal @@ -182,55 +203,63 @@ The overal goal would be to roundtrip 99% of all hackage packages. ### Partials -+ modification to the `GenericPackageDescription`. - covering every conceivable modification would be tough. +modification to the `GenericPackageDescription`. +covering every conceivable modification would be though. +However we should be able to do common operations such as adding libraries or modules. +What we want is a sort of default behavior, all subsequent mappings in the +exact printer position should be shifted if we add a line. ### Not included + Any warnings during parsing won't be included (low value add) + Any integration in tools such as `cabal format` or `cabal gen-bounds`. These tasks are relatively easy in comparison, however if we include this as an unused well tested library - function, it'll be easier to release. + function, it'll be easier to release so other people can go out and test. 
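The default shifting behavior described under Partials could look roughly like the sketch below. It assumes a simplified `Pos` type in place of cabal's `Position`, and the helper names (`shiftPos`, `shiftValues`, `shiftKeys`) are hypothetical, not part of the linked PR.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Hypothetical stand-in for cabal's row/column `Position`.
data Pos = Pos { posRow :: Int, posCol :: Int }
  deriving (Eq, Ord, Show)

-- Move a position down by `n` rows if it sits on or after row `fromRow`.
shiftPos :: Int -> Int -> Pos -> Pos
shiftPos fromRow n p
  | posRow p >= fromRow = p { posRow = posRow p + n }
  | otherwise = p

-- Shift a map whose values are positions (in the spirit of `exactPositions`).
shiftValues :: Int -> Int -> Map k Pos -> Map k Pos
shiftValues fromRow n = Map.map (shiftPos fromRow n)

-- Shift a map whose keys are positions (in the spirit of `exactComments`).
shiftKeys :: Int -> Int -> Map Pos v -> Map Pos v
shiftKeys fromRow n = Map.mapKeys (shiftPos fromRow n)
```

Adding one line at row 5 would then be `shiftValues 5 1` and `shiftKeys 5 1` over the stored metadata; removing lines needs more care, because shifted keys could collide.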
## Timeline - I expect that after this project is approved it'd take roughly 4 months in total to complete. This would be about 2 'man' months, however there maybe random hickups. -Most work will be done by either me or one of my employees @Riuga. +Most work will be done by either me or one of my colleagues under contract @Riuga. Intermediate steps are the listed tests in the linked PR, then after all those pass we'll move onto a full hackage run, and sift out more . +We can track progress via the tests being finished. +I'd expect the project to be completed by 4 months after accepting this proposal. -We can easily track progress via the tests being finished. - -I'd expect the project to be completed by April 2024. - -I've a free week in decemeber for example, but then it'll be weekends and nights. +If there are delays we'll communicate them promptly. ## Budget - -I think this should be around 15'000 euro to complete, +I think this should be around 20'000 euro to complete, considering the size of the overal work. +I think it's best to spread this out over subgoals, for example: + +5k: finish test suite in the linked PR +5k: 60% of hackage roundtripped +10k: 99% of hackage roundtripped The money will be used to compensate for opportunity cost, and allowing me, and hopefully others, to justify taking on similar large projects in the future. -## Stakeholders +This is not particularly prestigious work (you can't write a phd on this), +but as shown in the problem statement it bothers *a lot* of people. +So a monetary 'carrot' should help us drag this over the line. +## Stakeholders The primary benificiaries would be cabal users. we can: -+ opens up the possibility to improve cabal user experience, via inserting modules or running gen-bounds. ++ opens up the possibility to improve cabal user experience, + via inserting modules or running gen-bounds. 
+ makes it easier for cabal library to deal with cabal files, such as hpack, HLS and cabal-fmt -+ makes maintaining cabal itself easier, eg cabal init could be described via GenericPackageDesription ++ makes maintaining cabal itself easier, + eg cabal init could be described via `GenericPackageDesription` -Furthermore I've heard that the HLS project will benefit greatly from this effort, -(TODO where?) +Furthermore I've heard that the HLS project will benefit this effort, +it could add a dependencies plugin for example: https://github.com/haskell/haskell-language-server/issues/155 ## Success - This proposal successful once the cabal exact print branch is merged into cabal proper with the provided tests passing. -Most of hackage should be exact printable, say 99%, +Most of hackage should be exact printable, which means, an existing hackage cabal file is being parsed, and then printed again resulting into the same output as input. From 99f519a7bc317dcaae084b1af60677495abdff1e Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 17 Jul 2024 14:49:08 -0400 Subject: [PATCH 06/20] Remove riuga --- proposals/0000-cabal-exact-printer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index b799bed8..f08af4bb 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -219,7 +219,7 @@ exact printer position should be shifted if we add a line. ## Timeline I expect that after this project is approved it'd take roughly 4 months in total to complete. This would be about 2 'man' months, however there maybe random hickups. -Most work will be done by either me or one of my colleagues under contract @Riuga. +Most work will be done by me or a subcontractor if I can find them. 
Intermediate steps are the listed tests in the linked PR, then after all those pass we'll move onto a full hackage run, From 3c138b463e4c4f58f09581f1f61a7e90eb30c5fc Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 22 Jul 2024 19:32:33 -0400 Subject: [PATCH 07/20] Add direct link to the exact printer pull request --- proposals/0000-cabal-exact-printer.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index f08af4bb..e685b94f 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -107,12 +107,13 @@ This will ensure clients of the cabal library can print cabal files more easily and have some stability guarantees. ## Prior Art and Related Efforts - This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue. As can be seen in the issue, there have been previous attempts, previous attempts to address this problem have been fragmented, -and no comprehensive solution has been developed. +and no comprehensive solution has been finished. +There is however a work in progress implementation of the [exact printer](https://github.com/haskell/cabal/pull/9436/). +The goal of this proposal is to buy time to finish that implementation. Previous attempts for making this directly into cabal were [abandoned](https://github.com/haskell/cabal/pull/7626). 
I guess they got demotivated by the shear size of the effort, From 49a06797d8f3bd074720e8b68ba68c57241081da Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 22 Jul 2024 19:51:53 -0400 Subject: [PATCH 08/20] Draft out current work left todo --- proposals/0000-cabal-exact-printer.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index e685b94f..c2eb743e 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -115,6 +115,20 @@ and no comprehensive solution has been finished. There is however a work in progress implementation of the [exact printer](https://github.com/haskell/cabal/pull/9436/). The goal of this proposal is to buy time to finish that implementation. +The current work left to be done on this pull request is: ++ Add more comment preservation on more locations. ++ Deal with changes in generic package description. + + We need to add tests about which changes we care about. + For example add a build field, add a field which causes comment overlap on x,y. + Delete a section, add a language flag, etc. + + it the algorithm just relatively shifts everything if you add + a build field for example. + You know something got added because you can't find it in ExactPrintMeta. ++ Redo how common stanza's are handled (they're currently "merged" into sections directly, which is unrecoverable). ++ add support for comma printing, ++ add support for braces. ++ add support for conditional branches. + Previous attempts for making this directly into cabal were [abandoned](https://github.com/haskell/cabal/pull/7626). 
I guess they got demotivated by the shear size of the effort, or they revolved around creating a seperate AST[^ast], From fe8bea5ced76918441c95ecf85e364248cb0b53e Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Tue, 30 Jul 2024 15:48:08 -0400 Subject: [PATCH 09/20] Draft out common stanzas design --- proposals/0000-cabal-exact-printer.md | 70 +++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index c2eb743e..4f332615 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -216,6 +216,75 @@ for removal, if it involves a line, you've to fix up all following lines, The overal goal would be to roundtrip 99% of all hackage packages. +### Common stanzas +Currently if you run the pretty printer on a parsed generic package you will +lose all common stanzas. +The reason for this is that they're merged in the imported sections[^see-pr], +and then forgotten about. + +[^see-pr]: I marked where this happened https://github.com/haskell/cabal/pull/9436/files#diff-39a353df50e7eed47b5958c6025b67b06fac735a8b5b994c1464d6fd84df745eR718 + +I think there are two somewhat viable approaches to dealing with common stanzas. + +1. Keep the current "merging" approach. However track their original shape in + `ExactPrintMeta`. Then let the printer figure out based on that information + what came from the commen stanza and what came in the original section. + +2. Refactor `GenericPackageDescription` so that the parser no longer merges these + sections and instead stores them as proper records. + Then make the callsites smart enough to deal with common stanzas. + +In this case I'd attempt approach 2 first. +It seems less fragile to me to just refactor this. +It also removes the concern of carrying duplicate information, and should +make it more obvious to the users of the cabal syntax library +on how to manipulate common stanzas. 
+ +how this is done concretly, you go to the associated types and +add imports field: +```haskell +newtype CommonStanzaName = CommonStanzaName Text + +data Library = Library + { libName :: LibraryName + , imports :: [CommonStanzaName] + ... + } +``` + +and to generic package description you add another map: +```haskell +data GenericPackageDescription = GenericPackageDescription + { packageDescription :: PackageDescription + , gpdCommonStanzas :: Map CommonStanzaName CommonStanza + } +``` + +I'm not sure if this is fast enough, but +one easy and dirty way to make it all work with +backwards compatibility is to +create custom getters, (original setters are fine) + +we expose a getter (which is the current record field name `condSubLibraries`): + +```haskell +condSubLibraries :: GenericPackageDescription -> [( UnqualComponentName , CondTree ConfVar [Dependency] Library)] +``` + +which merges the common stanzas on every "get" call. +Then the actual record field will be renamed, +and doesn't have this merging in behavior. + +This way, new code will be able to access a library without common stanzas merged in, +as would be expected wthin the cabal file. +Old code, will have the same behavior as the current implementation. +And it will allow the exact printer to make a distinction between common stanzas and +whatever is in the other stanzas. + +### Conditionals +TODO: see how these work - so I can think of a design, I've a suspicion but i just need to +confirm my mental model maps to reality + ### Partials modification to the `GenericPackageDescription`. @@ -226,6 +295,7 @@ exact printer position should be shifted if we add a line. ### Not included ++ Support for braces + Any warnings during parsing won't be included (low value add) + Any integration in tools such as `cabal format` or `cabal gen-bounds`. 
These tasks are relatively easy in comparison, however if we include this as an unused well tested library From 75c8ab6b66cbfbbdf781451fcd675394b6e86acc Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 5 Aug 2024 12:45:43 -0400 Subject: [PATCH 10/20] add note on previous efforts --- proposals/0000-cabal-exact-printer.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 4f332615..6e679f97 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -107,6 +107,8 @@ This will ensure clients of the cabal library can print cabal files more easily and have some stability guarantees. ## Prior Art and Related Efforts +TODO: describe why previous efforts didn't succeed. How is this proposal different? + This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue. As can be seen in the issue, there have been previous attempts, From 61a591603e61c137b94185ce3b634f7bbdc2dde0 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 7 Aug 2024 15:59:57 -0400 Subject: [PATCH 11/20] Add notes on conditionals --- proposals/0000-cabal-exact-printer.md | 120 +++++++++++++++++++++++++- 1 file changed, 118 insertions(+), 2 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 6e679f97..1e158724 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -158,8 +158,6 @@ We don't want to do any additional refactoring. Furthermore the test suite created by the exact print effort this module describes can also be used in the related `GenericPackageDescription` to `Field` effort. 
- - ## Technical Content This proposal want's to add a function to the cabal library: @@ -193,6 +191,36 @@ data ExactPrintMeta = ExactPrintMeta } ``` +Where Exact position is: +``` +data ExactPosition = ExactPosition {namePosition :: Position + , argumentPosition :: [Position] } +``` +And `Postion` is just a row and column coordinate of a textfile. +This type already exists in cabal. + +A namespace is used to find exact positions: +``` +data NameSpace = NameSpace + { nameSpaceName :: FieldName + , nameSpaceSectionArgs :: [ByteString] + } + deriving (Show, Eq, Typeable, Data, Ord, Generic) +``` +It just encodes a path down the rose tree. +for example: +``` +library + if flag(foo) + build-depends: base <5 +``` +would be encoded as: +``` +,[NameSpace {nameSpaceName = "library", nameSpaceSectionArgs = []},NameSpace {nameSpaceName = "if", nameSpaceSectionArgs = ["flag(foo)"]},NameSpace {nameSpaceName = "build-depends", nameSpaceSectionArgs = []}] +``` +This gives a unique way of figuring out the exact position of the build-depends field. +Although conditionals need a little bit more refinement, see the conditional section for that. + For comments some parser modifications were needed which were developed during zurich hack 2024. This was a big uncertainty which now has been addressed. @@ -287,6 +315,94 @@ whatever is in the other stanzas. TODO: see how these work - so I can think of a design, I've a suspicion but i just need to confirm my mental model maps to reality +-- I suspect the pretty printer already supports this because condtree is part of syntax + -- need to test. + -- I think all we need to do is more exact location mapping on partials and it should be fine. 
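To illustrate how a `NameSpace` path identifies one node in the rose tree, here is a small sketch. The types are simplified, hypothetical stand-ins (`String` instead of `FieldName` and `ByteString`, a toy `Node` tree, a local `Position`), and `pathPositions` is an invented helper, not a function from cabal or the PR.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Simplified stand-ins for the types named in the text.
data NameSpace = NameSpace
  { nameSpaceName :: String
  , nameSpaceSectionArgs :: [String]
  } deriving (Show, Eq, Ord)

data Position = Position Int Int
  deriving (Show, Eq, Ord)

-- A toy rose tree of sections and fields; each node records where its name starts.
data Node = Node NameSpace Position [Node]

-- Collect the position of every node, keyed by its path from the root.
pathPositions :: [Node] -> Map [NameSpace] Position
pathPositions = Map.fromList . go []
  where
    go parents nodes =
      concat
        [ (path, pos) : go path children
        | Node ns pos children <- nodes
        , let path = parents ++ [ns]
        ]

-- The `library` / `if flag(foo)` / `build-depends` example from the text.
exampleTree :: [Node]
exampleTree =
  [ Node (NameSpace "library" []) (Position 1 1)
      [ Node (NameSpace "if" ["flag(foo)"]) (Position 2 3)
          [ Node (NameSpace "build-depends" []) (Position 3 5) [] ]
      ]
  ]
```

Looking up the three-element path in `pathPositions exampleTree` finds the nested `build-depends`. Two sibling `if` blocks with identical arguments would produce the same key, which is the refinement the conditional section asks for.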
+ + +For conditionals I wrote a test to see how far it got in its current state: + +```cabal +cabal-version: 3.0 +name: bounded +version: 0 +synopsis: The -any none demo +build-type: Simple + +flag foo + manual: True + default: True + +library + default-language: Haskell2010 + exposed-modules: AnyNone + if flag(foo) + build-depends: base <5 + else + build-depends: base <5.5 +``` +which got printed as (anomalies marked with `{}`): + +```cabal +cabal-version: 3.0 +name: bounded +version: 0 +synopsis: The -any none demo +build-type: Simple + +flag foo + manual: True + {1} + +library +ifflag(foo) {2} +build-depends:base <5 + default-language: Haskell2010 + exposed-modules: AnyNone + + else + build-depends: base <5.5 +``` + +1. Default gone, + it may not be stored within the flag ast, we need to add support for that. + Once support is added I suspect it'll work with the other strategy +2. Indentation of `if` wrong, and it's also in the wrong position (should be moved 2 lines down). + +Currently it looks like the exact positions of if fields aren't stored, +once we do this it should be printed at a much better location. +We also need to add support for multiple ifs in a single section. +I think we can do that by changing the lookup for if statments, and +adding the index occurred for a section. +So the namespace type would be changed to: + +```haskell +data NameSpace = NameSpace + { nameSpaceName :: FieldName + , nameSpaceSectionArgs :: [ByteString] + , nameSpaceDuplicateBust :: Word + } + deriving (Show, Eq, Typeable, Data, Ord, Generic) +``` + +The newly added field `nameSpaceDuplicateBust` tracks how many `if`s it has encountered in this case. +Allowing multiple exact positions to be stored per section, +even if they share the same arguments. +For all lookups where this can't occur, the number would remain 0. + +### Performance and Interaction with hackage + +Changes to cabal shouldn't affect the normal operation of hackage.
+So performance should remain at similar level as they are now. + +This can be tested by spinning up a virtual machine with similar +specs to hackage production and just do the heavy tasks hackage +normally does. + +If it crashes or performance degrades due to the exact printer changes +we should improve performance before declaring this finished. + + ### Partials modification to the `GenericPackageDescription`. From c79a1355632a339cabae3556a9754e1b2ba18cf1 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 7 Aug 2024 18:31:05 -0400 Subject: [PATCH 12/20] Add more lines --- proposals/0000-cabal-exact-printer.md | 42 ++++++++++++++++++--------- 1 file changed, 28 insertions(+), 14 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 1e158724..73e5400b 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -131,11 +131,22 @@ The current work left to be done on this pull request is: + add support for braces. + add support for conditional branches. -Previous attempts for making this directly into cabal were [abandoned](https://github.com/haskell/cabal/pull/7626). -I guess they got demotivated by the shear size of the effort, -or they revolved around creating a seperate AST[^ast], +A Previous attempts by [patrick](https://github.com/ptkato) for making this directly into cabal was [abandoned](https://github.com/haskell/cabal/pull/7626). +In private they mentioned that they worked for the Haskell foundation for a while on this, +but the contract expired and he moved on to another employer. +However another issue they had with this was that there appeared to +be no agreement on how to move forward with this specific problem. +It was a somewhat chaotic debate. +So at least what we can do with this proposal is come to a consensus what +a good solution looks like. +And let the perfect not be the enemy of good. 
+ +Another effort revolved around creating a seperate AST[^ast], which was against maintainer recommendation (because it'd make the issue even bigger), and then [abandoned](https://github.com/haskell/cabal/pull/9385). +They got discouraged because they received no maintainer feedback +after [one and a half year](https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/2?u=jappie). + A related effort is to build combinators that allow modifyng the `Field` type directly. This would depracate the `GenericPackage` structure and make an alternative structure @@ -143,16 +154,13 @@ available. A proof of concept was developed during zurich hack https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/9?u=jappie -I suppose the idea is to completly replace `GenericPackageDescription` with -this `Field` type. -Which is a significant effort, -however this work can be added to the exact print effort, -because better modification of cabal files would be appreciated. -Exact printing is mostly a module that takes some input type and then -does the formatting. +The idea was to start with the `Field` type because it describes +the syntax of a cabal file. +If it could store whitespaces it could potentially be used to be exact printed. +This `Field` type later gets parsed into other types such as +`InstalledPackageInfo`, `ProjectConfig` and `GenericPackageDescription`. -Unlike the `Field` effort, -this proposal only focusess only on getting exact printing +This proposal only focusess only on getting exact printing to work with a minimal footprint. We don't want to do any additional refactoring. Furthermore the test suite created by the exact print effort this module @@ -369,8 +377,14 @@ build-depends:base <5 Once support is added I suspect it'll work with the other strategy 2. Indentation of `if` wrong, and it's also in the wrong position (should be moved 2 lines down). 
-Currently it looks like the exact positions of if fields aren't stored, -once we do this it should be printed at a much better location. +Currently it looks like the exact positions of `if` fields aren't used, +correctly. +They do appear to be stored, for example in a debug dump of that file: +``` + , ([NameSpace{nameSpaceName = "library", nameSpaceSectionArgs = []}, NameSpace{nameSpaceName = "if", nameSpaceSectionArgs = ["flag", "(", "foo", ")"]}], ExactPosition{namePosition = Position 14 3, argumentPosition = [Position 14 6, Position 14 10, Position 14 11, Position 14 14]}) +``` +I think this is just a matter of debugging to get it to print on the right position. + We also need to add support for multiple ifs in a single section. I think we can do that by changing the lookup for if statments, and adding the index occurred for a section. From a50d449823ec33c4c08bd0e82cbfd7c9c111cefd Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Wed, 7 Aug 2024 18:57:19 -0400 Subject: [PATCH 13/20] Fix grammar --- proposals/0000-cabal-exact-printer.md | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 73e5400b..e5df2a15 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -123,20 +123,21 @@ The current work left to be done on this pull request is: + We need to add tests about which changes we care about. For example add a build field, add a field which causes comment overlap on x,y. Delete a section, add a language flag, etc. - + it the algorithm just relatively shifts everything if you add + + The algorithm just relatively shifts everything if you add a build field for example. You know something got added because you can't find it in ExactPrintMeta. + Redo how common stanza's are handled (they're currently "merged" into sections directly, which is unrecoverable). 
-+ add support for comma printing, -+ add support for braces. + See the technical content section for more details on this. ++ Add support for comma printing. + add support for conditional branches. + See the technical content section for more details on this. A Previous attempts by [patrick](https://github.com/ptkato) for making this directly into cabal was [abandoned](https://github.com/haskell/cabal/pull/7626). In private they mentioned that they worked for the Haskell foundation for a while on this, but the contract expired and he moved on to another employer. However another issue they had with this was that there appeared to be no agreement on how to move forward with this specific problem. -It was a somewhat chaotic debate. +They found the debate somewhat chaotic, and didn't know to proceed. So at least what we can do with this proposal is come to a consensus what a good solution looks like. And let the perfect not be the enemy of good. @@ -147,7 +148,6 @@ and then [abandoned](https://github.com/haskell/cabal/pull/9385). They got discouraged because they received no maintainer feedback after [one and a half year](https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/2?u=jappie). - A related effort is to build combinators that allow modifyng the `Field` type directly. This would depracate the `GenericPackage` structure and make an alternative structure available. @@ -159,6 +159,10 @@ the syntax of a cabal file. If it could store whitespaces it could potentially be used to be exact printed. This `Field` type later gets parsed into other types such as `InstalledPackageInfo`, `ProjectConfig` and `GenericPackageDescription`. +Currently it's unclear to me how this would work with modifications +on `GenericPackageDescription`. +Although the proposal presented here converts `GenericPackageDescription` into +`PrettyField` which is similar to `Field`, before printing. 
This proposal only focusess only on getting exact printing to work with a minimal footprint. From a127fcb07d6512c152ce7a74adb466187c373f45 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 11:17:47 -0400 Subject: [PATCH 14/20] Section on why we need exact printing and how users programs benefit --- proposals/0000-cabal-exact-printer.md | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index e5df2a15..eb74b305 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -106,8 +106,26 @@ property. This will ensure clients of the cabal library can print cabal files more easily and have some stability guarantees. -## Prior Art and Related Efforts -TODO: describe why previous efforts didn't succeed. How is this proposal different? +Why even bother with adding this directly to cabal? +What advantage do existing tools get by making cabal smart enough +to modify it's own files? +It's hard to guarantee stability across projects if many functionalities +are distributed across many projects. +For example a newly introduced cabal-add, would need to take into account any +syntax change to cabal, *forever*. +This is true for other tools as well that want to modify cabal files (such as HLS). +If cabal would support an exact printer all syntax odds and ends +will remain within cabal allowing us to change the cabal file format easily +without breaking downstream libraries and tools. +Because cabal itself supports parsing and printing and exposes +a it as library functions. +This will allow downstream programmers to parse and print +cabal files without having to care about the syntax details. +Which is different from the current situation where some diligent programmers +assumed the entire cabal file format is stable, +and wrote their own invent their own parsers end printers. 
+So every tool that want's to modify cabal files has a larger maintenance +burden because cabal isn't doing this upstream. This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue. From 163731e0f343d8e74f3380273cf0f5177efdd48f Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 11:19:06 -0400 Subject: [PATCH 15/20] spell errors --- proposals/0000-cabal-exact-printer.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index eb74b305..6ef2d935 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -198,7 +198,7 @@ printExact :: GenericPackageDescription -> Text Which will do exact printing. This function has the following properties: -byte for byte roundtrip of all hacakgePackage: +byte for byte roundtrip of all `hackagePackage`: ``` forall (hackagePackage :: ByteString) .
(printExact <$> (parseGeneric hackagePackage)) == Right hackagePackage ``` @@ -213,7 +213,7 @@ data GenericPackageDescription { } ``` -which in turn contains various meta data we need for exact printing: +Which in turn contains various meta data we need for exact printing: ```haskell data ExactPrintMeta = ExactPrintMeta { exactPositions :: Map [NameSpace] ExactPosition @@ -244,7 +244,7 @@ library if flag(foo) build-depends: base <5 ``` -would be encoded as: +Would be encoded as: ``` ,[NameSpace {nameSpaceName = "library", nameSpaceSectionArgs = []},NameSpace {nameSpaceName = "if", nameSpaceSectionArgs = ["flag(foo)"]},NameSpace {nameSpaceName = "build-depends", nameSpaceSectionArgs = []}] ``` From 13c280c968670f381ed37f4d0ed921e9d496ca7e Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 11:19:13 -0400 Subject: [PATCH 16/20] Add section on field because it comes up a lot it's kindoff annoying because the reference implementation already uses this, but somehow people aren't connecting the dots. --- proposals/0000-cabal-exact-printer.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 6ef2d935..f2871db6 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -260,6 +260,29 @@ like `Map ([NameSpace], PackageVersionConstraint) Original`, this kind of representation allows us to retrieve the original only if it hasn't changed. However I'm confident all of this is do-able. +Many are concerned about using `Field` I suppose for the printing part, +but we don't use that, we use `PrettyField` because the pretty printer +already had a decent `FieldGrammar`, +and the type is almost the same. +The first thing `exactPrint` does is use the existing pretty field grammar to +create pretty fields: +```haskell +exactPrint :: GenericPackageDescription -> Text +exactPrint package = ... 
+ where + fields :: [PrettyField ()] + fields = ppGenericPackageDescription (specVersion (packageDescription (package))) package + +``` + +Then we attach all the meta data to these `Fields` with `attachPositions`: +```haskell +attachPositions :: [NameSpace] -> Map [NameSpace] ExactPosition -> [PrettyField ()] -> [PrettyField (Maybe ExactPosition)] +``` +Here we put the various pieces of meta data directly into the field for printing. +Maybe you have an exact position at a certain point during printing, +which you can use to "repair" the default pretty printing behavior. + Peliminary testing shows that this approach works with multiple sections of cabal files, and now also with some basic comments. Comments likely have to be worked out further as we only made that one test pass, From b014e2dfa1416ed0dcb473735d242c1140ff3a00 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 11:20:55 -0400 Subject: [PATCH 17/20] recover heading --- proposals/0000-cabal-exact-printer.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index f2871db6..b53c789e 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -127,6 +127,7 @@ and wrote their own invent their own parsers end printers. So every tool that want's to modify cabal files has a larger maintenance burden because cabal isn't doing this upstream. +## Prior Art and Related Efforts This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue.
As can be seen in the issue, there have been previous attempts, From 38b0e9bfaf2cdd37b423f5f6f1bf2f0f785afba2 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 11:43:18 -0400 Subject: [PATCH 18/20] Re-read doc, delete irrelevant stuff, fix various grammar issues --- proposals/0000-cabal-exact-printer.md | 93 +++++++++++++-------------- 1 file changed, 46 insertions(+), 47 deletions(-) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index b53c789e..548902aa 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -3,29 +3,17 @@ ## Abstract The Exact Printer project aims to develop a precise parsing and printing tool for .cabal files in the cabal library. -This will enable byte-for-byte bidirectional parsing and printing. -Which will allow both cabal and other tools to modify cabal files without -mangling the format, structure or comments of users. +This will allow both cabal and other tools to modify cabal files without +mangling the format, structure or comments of users' files. +Furthermore it makes cabal authoritative on the cabal file format, +allowing downstream users to use the provided printing functions +and get a stability guarantee. ## Background -This is the formal application of my [blogpost](https://jappie.me/cabal-exact-printing.html). -Ironically I started on this proposal first, -but then I got cold feet and decided to write a "lower stake" informal blogpost instead. -Writing this proposal has been difficult. -I've no idea why this is so hard for me. -I think it's partly because there is no going back after opening the proposal. -and I'm mostly just making these numbers and timelines up y'know. -I've no idea for it'll take 2 months, but it seems reasonable. -I've no idea if there is even budget for this, but asking 10k for a man month -seems reasonable as well. -Getting this accepted and solved won't make me rich, -but it could make me happy.
-It'll solve a pain point bringing my work life closer to epicurean atraxia. - -Cabal is a build tool for haskell. +Cabal is a build tool for Haskell. It directs GHC and deals with "packages". -Which are collections of haskell modules. -cabal allows you to depend on libraries, +Which are collections of Haskell modules. +Cabal allows you to depend on libraries, publish libraries and manage build flags. In essence if you want to do something non-trivial with haskell you want to use cabal. @@ -43,9 +31,23 @@ It defines exact printing as follows: > Taking an abstract syntax tree (AST) and converting it into a string that looks like what the user originally wrote > is called exact-printing. An exact-printed program includes the original spacing and all comments. +PS: This is the formal application of my [blogpost](https://jappie.me/cabal-exact-printing.html). +Ironically I started on this proposal first, +but then I got cold feet and decided to write a "lower stakes" informal blogpost instead. +Writing this proposal has been difficult. +I've no idea why this is so hard for me. +I think it's partly because there is no going back after opening the proposal, +and I'm mostly just making these numbers and timelines up y'know. +I've no idea if it'll take 2 months, but it seems reasonable. +I've no idea if there is even budget for this, but asking 10k for a man month +seems reasonable as well. +Getting this accepted and solved won't make me rich, +but it could make me happy. +It'll solve a pain point bringing my work life closer to Epicurean ataraxia. + ## Problem Statement Currently if you build a project with an extra module not listed in your cabal file, -ghc emits a warning: +GHC emits a warning: ``` : error: [-Wmissing-home-modules, -Werror=missing-home-modules] These modules are needed for compilation but not listed in your .cabal file's other-modules: @@ -68,8 +70,8 @@ For example:
  • [gild](https://taylor.fausak.me/2024/02/17/gild/)
-Of course many of these projects do more then just module expension. -hpack provides a completly different cabal file layout for exampe, +Of course many of these projects do more than just module expansion. +hpack provides a completely different cabal file layout for example, `cabal-fmt` and `gild` are formatters for cabal files. Only auto-pack just does this one feature. However, since all these programs implement this functionality @@ -93,7 +95,7 @@ The current implementation of printing in cabal via the `cabal format` command, A similar problem occurs when HLS want's to do any modification to a cabal file during development. For example if a module was added or renamed, or if a (hidden) library is required. -Or perhaps some function used in a known library via hoogle for example. +Or perhaps some function used in a known library via Hoogle for example. HLS has no clue what to do, because even if it links against the cabal library, there is no function to modify a generic cabal representation and print a cabal file that keeps @@ -103,7 +105,7 @@ The goal is to make non invasive changes to cabal files. This tech proposal therefore aims to address all these issues. Furthermore by bringing it directly into cabal we can enforce the round tripping property. -This will ensure clients of the cabal library can print cabal files more +This will ensure clients of the Cabal library can print cabal files more easily and have some stability guarantees. Why even bother with adding this directly to cabal? @@ -111,27 +113,28 @@ What advantage do existing tools get by making cabal smart enough to modify it's own files? It's hard to guarantee stability across projects if many functionalities are distributed across many projects. -For example a newly introduced cabal-add, would need to take into account any +For example a newly introduced tool called [cabal-add](https://github.com/Bodigrim/cabal-add) would need to take into account any syntax change to cabal, *forever*.
This is true for other tools as well that want to modify cabal files (such as HLS). If cabal would support an exact printer all syntax odds and ends will remain within cabal allowing us to change the cabal file format easily without breaking downstream libraries and tools. -Because cabal itself supports parsing and printing and exposes -a it as library functions. +This is because cabal would support parsing and printing, +and expose this capability as library functions. This will allow downstream programmers to parse and print cabal files without having to care about the syntax details. +Furthermore it'll make it easier for programmers to add new cabal related tools. Which is different from the current situation where some diligent programmers assumed the entire cabal file format is stable, -and wrote their own invent their own parsers end printers. +and write their own own parsers and printers. So every tool that want's to modify cabal files has a larger maintenance burden because cabal isn't doing this upstream. ## Prior Art and Related Efforts This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue. -As can be seen in the issue, there have been previous attempts, -previous attempts to address this problem have been fragmented, +As can be seen in the issue, there have been previous attempts. +These attempts have been fragmented, and no comprehensive solution has been finished. There is however a work in progress implementation of the [exact printer](https://github.com/haskell/cabal/pull/9436/). The goal of this proposal is to buy time to finish that implementation. @@ -144,7 +147,7 @@ The current work left to be done on this pull request is: Delete a section, add a language flag, etc. + The algorithm just relatively shifts everything if you add a build field for example. - You know something got added because you can't find it in ExactPrintMeta.
+ You know something got added because you can't find it in `ExactPrintMeta`. + Redo how common stanza's are handled (they're currently "merged" into sections directly, which is unrecoverable). See the technical content section for more details on this. + Add support for comma printing. @@ -162,7 +165,7 @@ a good solution looks like. And let the perfect not be the enemy of good. Another effort revolved around creating a seperate AST[^ast], -which was against maintainer recommendation (because it'd make the issue even bigger), +which was against maintainer recommendation because it'd make the issue even bigger, and then [abandoned](https://github.com/haskell/cabal/pull/9385). They got discouraged because they received no maintainer feedback after [one and a half year](https://discourse.haskell.org/t/pre-proposal-cabal-exact-print/9582/2?u=jappie). @@ -262,7 +265,7 @@ this kind of representation allows us to retrieve the original only if it hasn't However I'm confident all of this is do-able. Many are concerned about using `Field` I suppose for the printing part, -but we don't use that, we use `PrettyField` because the pretty printer +but we don't use that exact type, we use `PrettyField` because the pretty printer already had a decent `FieldGrammar`, and the type is almost the same. The first thing `exactPrint` does is use the existing pretty field grammar to @@ -366,14 +369,6 @@ And it will allow the exact printer to make a distinction between common stanzas whatever is in the other stanzas. ### Conditionals -TODO: see how these work - so I can think of a design, I've a suspicion but i just need to -confirm my mental model maps to reality - --- I suspect the pretty printer already supports this because condtree is part of syntax - -- need to test. - -- I think all we need to do is more exact location mapping on partials and it should be fine. 
- - For conditionals I wrote a test to see how far it got in it's current state: ```cabal @@ -451,7 +446,6 @@ even if they share the same arguments. For all lookups where this can't occur, the number would remain 0. ### Performance and Interaction with hackage - Changes to cabal shouldn't affect the normal operation of hackage. So performance should remain at similar level as they are now. @@ -462,10 +456,14 @@ normally does. If it crashes or performance degrades due to the exact printer changes we should improve performance before declaring this finished. +Furthermore we don't want to slow down common operations of cabal files +to build its database to solve. +If parsing is slow then the slowdown is amplified at scale and +makes end users frustrated when common commands take longer. +So if there is a slowdown, it shouldn't be noticeable at scale. ### Partials - -modification to the `GenericPackageDescription`. +Modification to the `GenericPackageDescription`. covering every conceivable modification would be though. However we should be able to do common operations such as adding libraries or modules. What we want is a sort of default behavior, all subsequent mappings in the @@ -473,7 +471,7 @@ exact printer position should be shifted if we add a line. ### Not included -+ Support for braces ++ Support for braces. They don't do anything. + Any warnings during parsing won't be included (low value add) + Any integration in tools such as `cabal format` or `cabal gen-bounds`. These tasks are relatively easy in comparison, however if we include this as an unused well tested library @@ -515,7 +513,8 @@ we can: + opens up the possibility to improve cabal user experience, via inserting modules or running gen-bounds.
-+ makes it easier for cabal library to deal with cabal files, such as hpack, HLS and cabal-fmt ++ makes it easier for users of the cabal library to deal with cabal files, + such as hpack, HLS and cabal-fmt + makes maintaining cabal itself easier, eg cabal init could be described via `GenericPackageDesription` From b5139efe5c86ff77d36f8863ff2926ed0ee5e3f7 Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 12:56:17 -0400 Subject: [PATCH 19/20] Add example on why use GenericPackageDescription instead of Field --- proposals/0000-cabal-exact-printer.md | 29 +++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 548902aa..08466456 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -130,6 +130,35 @@ and write their own own parsers and printers. So every tool that want's to modify cabal files has a larger maintenance burden because cabal isn't doing this upstream. +An example of how a program that adds a dependency to the main library could look: +```haskell +main :: IO () +main = do + theFile <- ByteString.readFile "my.cabal" + dependName <- mkPackageName <$> getLine + let res = parseGenericPackageDescription theFile + (warns, eDescription) = runParseResult res + case eDescription of + Left someFailure -> do + error $ "failed parsing " <> show someFailure + Right generic -> + let + depends = mkDependency dependName anyVersion (NES.singleton LMainLibName) + modified = generic { condLibrary = case (condLibrary generic) of + Just lib -> (lib { condTreeConstraints = depends : condTreeConstraints lib }) + Nothing -> Nothing + } + in + Text.writeFile "my.cabal" $ exactPrint modified +``` + +All the difficulty in this program lies in figuring out where to place a dependency; +we just made a decision here to put it in the main library, assuming it exists.
+We also assumed there would be no conditionals. +All these questions are what a program to add dependencies should ask the user, +and the `GenericPackageDescription` type guides the programmer in asking the right questions. +Therefore, we can say that `GenericPackageDescription` is more strongly typed than `Field`. ## Prior Art and Related Efforts This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker. Essentially this proposal attempts to "solve" that issue. From 43389438ef14ea641b794e17e39ff993860d69ae Mon Sep 17 00:00:00 2001 From: Jappie Klooster Date: Mon, 26 Aug 2024 12:57:42 -0400 Subject: [PATCH 20/20] Add note on lack of syntax mangling --- proposals/0000-cabal-exact-printer.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/0000-cabal-exact-printer.md b/proposals/0000-cabal-exact-printer.md index 08466456..e6e40978 100644 --- a/proposals/0000-cabal-exact-printer.md +++ b/proposals/0000-cabal-exact-printer.md @@ -158,6 +158,8 @@ We also assumed there would be no conditionals. All these questions are what a program to add dependencies should ask the user, and the `GenericPackageDescription` type guides the programmer in asking the right questions. Therefore, we can say that `GenericPackageDescription` is more strongly typed than `Field`. +Note that there is no low level syntax mangling going on at all, +because the functions exposed in the cabal library take care of that for us. ## Prior Art and Related Efforts This [issue](https://github.com/haskell/cabal/issues/7544) is tracked on the cabal bug tracker.