Skip to content

Replacer: Allow LocalPath to pass through #1713

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jayvdb
Copy link

@jayvdb jayvdb commented Oct 23, 2020

Paths were already stored inside SectionReader._subs as
path objects, however they are turned into strings when they
go through Replacer, and various transforms occur on these
paths, for example breaking paths containing '#' and ' '.

This change defaults to path objects being retained through the
Replacer, and path segments being joined together using path object
joining semantics.

This mode can be disabled for settings of type path by wrapping
the value in quotes, and disabled for other values by disabling
new global setting literal_paths.

Fixes #763 and #924

Contribution checklist:

(also see CONTRIBUTING.rst for details)

  • wrote descriptive pull request text
  • added/updated test(s)
  • updated/extended the documentation
  • added relevant issue keyword
    in message body
  • added news fragment in changelog folder
    • fragment name: <issue number>.<type>.rst for example (588.bugfix.rst)
    • <type> is must be one of bugfix, feature, deprecation,breaking, doc, misc
    • if PR has no issue: consider creating one first or change it to the PR number after creating the PR
    • "sign" fragment with "by :user:<your username>"
    • please use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files - by :user:superuser."
    • also see examples
  • added yourself to CONTRIBUTORS (preserving alphabetical order)

Paths were already stored inside SectionReader._subs as
path objects, however they are turned into strings when they
go through Replacer, and various transforms occur on these
paths, for example breaking paths containing '#' and ' '.

This change defaults to path objects being retained through the
Replacer, and path segments being joined together using path object
joining semantics.

This mode can be disabled for settings of type `path` by wrapping
the value in quotes, and disabled for other values by disabling
new global setting literal_paths.

Fixes tox-dev#763
and tox-dev#924
@jayvdb
Copy link
Author

jayvdb commented Oct 24, 2020

There are still more tests to write, to verify that path objects pass through each arg type sanely.
There will be some power users whose tox.ini will be broken, if they are doing complicated syntax in a single value that starts with a path, usually to workaround the problem that this solves by adding quotation marks around strings. I'll construct an example of what can break. And maybe detect any common cases and disable the new mode.

The most obvious way to reduce the surface area of the possible breakage is to not use literal path mode if the input value contains ' anywhere (and maybe ").

Copy link
Member

@gaborbernat gaborbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we really need to handle this use case given no one complained about this yet. Makes the code much more complicated, and paths having # feels more an academic exercise than something likely to come up. Also now introduces a new flag, which I'm not that keen on migrating to version 4. tox already has too many options 🤔

@jayvdb
Copy link
Author

jayvdb commented Oct 24, 2020

Just using this approach for type path solves the issues raised, and doesnt need the new flag, so I could split that bit off, and get that bit polished first. The benefit isnt just # and - the code may look a bit more complicated, but having a shorter/clearer route through the engine for paths makes the parser conceptually easier, and makes it much easier for users to do medium complexity string operations using paths (which is a fairly common case), without fighting to get strings through CommandLine and shlex twice if they want the value in the subprocess environment.

The literal_paths flag is only part of the handling of types other than path, which is a segment I am much less attached to, and may also be unnecessary if I can find a nice way to avoid any reason people would want the old parser.

Another approach to allow to bypass the shlex'ing of paths for other types of values is to do allow something like {path:{toxinidir}/foo#bar} to inject literals, or maybe even {glob:{toxinidir}/foo#bar} (#1571) if that provides most of the benefits.

@jayvdb
Copy link
Author

jayvdb commented Oct 24, 2020

Also now introduces a new flag, which I'm not that keen on migrating to version 4.

If this literal_paths flag is merged, I would expect that v4 does not have this flag, and only uses this new literal path mode.

There is nothing in the prior syntax which can not be achieved more cleanly with the new mode.
The messy case is where one key starts with a path and also contains additional strings which uses quotes that would be removed by shlex. I'll provide some more concrete examples, but it will take a bit of effort to verify what exactly will break.

To convert old mode to new mode, those problematic value would need to be split into two keys, with one being a path only (and be literal, no shlex/CommandLine), and the other joining the first key and the other strings (and uses shlex and CommandLine as before).

The ability to disabling it in v3 would only be so that people dont need to re-write their ini, if they have complex syntax and it behaves slightly differently, and they dont want to learn the changes.

@gaborbernat
Copy link
Member

gaborbernat commented Oct 24, 2020

If this literal_paths flag is merged, I would expect that v4 does not have this flag, and only uses this new literal path mode.

tox 4 aims (as much as possible) to be configuration file compatible, so any flag we introduce today will likely be grandfathered into tox 4... otherwise, people would have to rewrite their tox.ini just to start using tox 4, which IMHO is a too big ask.

@jayvdb
Copy link
Author

jayvdb commented Oct 24, 2020

tox 4 aims (as much as possible) to be configuration file compatible

As you can see from this PR, the new mode passes all existing tests, and new tests. It is compatible with an extremely high percentage of existing tox.ini uses. I am being overly cautious and explaining that there are small breakages, so as to not mislead code reviewers. The next step is to isolate them. Also as above I have outlined quite a few ways to eliminate even the small breakages, and I am going to explore that also.

Also it would be nice if I could get your feedback on the proposal above at #1713 (comment) to split off the type=path component which fixes the issues raised without any syntax breakage, and thus has no introduced flag.

@gaborbernat
Copy link
Member

medium complexity string operations using paths (which is a fairly common case)

Can you detail on these a bit?

@gaborbernat
Copy link
Member

@jayvdb this is merge conflicting.

@gaborbernat
Copy link
Member

I'll close this for now until you can attend to the PR again. Once you've done we can reopen.

@gaborbernat gaborbernat closed this Nov 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tox fails when running in a path containing a hash
2 participants