subprocess PATH semantics and portability #52803

Open
dabrahams mannequin opened this issue Apr 28, 2010 · 21 comments
Labels
docs Documentation in the Doc dir topic-subprocess Subprocess issues. type-feature A feature request or enhancement

Comments

@dabrahams
Mannequin

dabrahams mannequin commented Apr 28, 2010

BPO 8557
Nosy @mark-summerfield, @bitdancer, @briancurtin, @dabrahams, @eryksun, @henryiii, @asottile
Files
  • probe.py: demonstrates portable Popen wrapper
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2010-04-28.10:44:53.142>
    labels = ['type-feature', '3.8', '3.9', '3.10', 'docs']
    title = 'subprocess PATH semantics and portability'
    updated_at = <Date 2021-05-13.20:49:33.930>
    user = 'https://github.com/dabrahams'

    bugs.python.org fields:

    activity = <Date 2021-05-13.20:49:33.930>
    actor = 'Henry Schreiner'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation']
    creation = <Date 2010-04-28.10:44:53.142>
    creator = 'dabrahams'
    dependencies = []
    files = ['17180']
    hgrepos = []
    issue_num = 8557
    keywords = []
    message_count = 21.0
    messages = ['104422', '104429', '104437', '104527', '104611', '104646', '104647', '104738', '104752', '104766', '104902', '104904', '104908', '104909', '104912', '104925', '105867', '262382', '262399', '320098', '388155']
    nosy_count = 10.0
    nosy_names = ['mark', 'r.david.murray', 'brian.curtin', 'docs@python', 'dabrahams', 'RubyTuesdayDONO', 'eryksun', 'pepalogik', 'Henry Schreiner', 'Anthony Sottile']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue8557'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented Apr 28, 2010

    On POSIX systems, the PATH environment variable is always used to
    look up directory-less executable names passed as the first argument to Popen(...), but on Windows, PATH is only considered when shell=True is also passed.

    Actually I think it may be slightly weirder than that when
    shell=False, because the following holds for me:

    C:\>rem ##### Prepare minimal PATH #####
    C:\>set "PATH=C:\Python26\Scripts;C:\Python26;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem"

    C:\>rem ##### Prepare a minimal, clean environment #####
    C:\>virtualenv --no-site-packages e:\zzz
    New python executable in e:\zzz\Scripts\python.exe
    Installing setuptools................done.

    C:\>rem ##### Show that shell=True makes the difference in determining whether PATH is respected #####
    C:\>python
    Python 2.6.5 (r265:79096, Mar 19 2010, 18:02:59) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import subprocess
    >>> subprocess.Popen(['python', '-c', 'import sys; print sys.executable'])
    <subprocess.Popen object at 0x0000000001DBE080>
    >>> C:\Python26\python.exe
    
    >>> subprocess.Popen(['python', '-c', 'import sys; print sys.executable'], env={'PATH':r'e:\zzz\Scripts'})
    <subprocess.Popen object at 0x0000000001F05A90>
    >>> C:\Python26\python.exe
    
    >>> subprocess.Popen(['python', '-c', 'import sys; print sys.executable'], env={'PATH':r'e:\zzz\Scripts'}, shell=True)
    <subprocess.Popen object at 0x0000000001F05B00>
    >>> e:\zzz\Scripts\python.exe

    That is, it looks like the environment at the time Python is invoked is what counts unless I pass shell=True. I don't even seem to be able to override this behavior by changing os.environ: you can clear() it and pass env={} and subprocess.Popen(['python']) still succeeds.

    This is a very important problem for portable code and one that took me hours to suss out. I think:

    a) the current behavior needs to be documented
    b) it needs to be fixed if possible
    c) otherwise, shell=True should be the default

    @dabrahams dabrahams mannequin added the type-bug An unexpected behavior, bug, or error label Apr 28, 2010
    @dabrahams dabrahams mannequin assigned docspython Apr 28, 2010
    @dabrahams dabrahams mannequin added the docs Documentation in the Doc dir label Apr 28, 2010
    @dabrahams
    Mannequin Author

    dabrahams mannequin commented Apr 28, 2010

    It's worse than I thought; there isn't even one setting for shell that works everywhere. This is what happens on POSIX (tested on Mac and Ubuntu):

    $ mkdir /tmp/xxx
    $ cd /tmp/xxx
    xxx $ virtualenv /tmp/zzz
    xxx $ python
    Python 2.6.5 (r265:79063, Mar 23 2010, 08:10:08) 
    [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from subprocess import *
    >>> p = Popen(['python', '-c', 'import sys;print sys.executable'], 
    ...           stdin=PIPE,stdout=PIPE,stderr=PIPE,
    ...           env={'PATH':'/tmp/zzz/bin'})
    >>> stdout,stderr = p.communicate(None)
    >>> print stdout
    /tmp/zzz/bin/python

    >>> print stderr

    >>> p = Popen(['python', '-c', 'import sys;print sys.executable'], shell=True,
    ...           stdin=PIPE,stdout=PIPE,stderr=PIPE,
    ...           env={'PATH':'/tmp/zzz/bin'})
    >>> stdout,stderr = p.communicate(None)
    >>> print stdout

    >>> print stderr

    @mark-summerfield
    Mannequin

    mark-summerfield mannequin commented Apr 28, 2010

    IMO there's another problem with subprocess portability: the lack of control over encodings. See bpo-6135.

    @bitdancer
    Member

    Changing the default value of shell is not an option anyway.

    The behavior you describe is exactly what one should expect: the environment in which the executable is located is the environment of the process calling Popen, not the environment passed to Popen. The environment passed to Popen is the environment in which the subprocess executes. When using shell=True, this is the environment in which the shell executes, and the *shell* then looks up the executable in that new environment. As far as I know this behavior is the same on both Windows and Unix, and thus is not a portability issue. (How the shell works can be a portability issue, though.)

    I'm not sure that this needs to be documented explicitly, as it is a logical consequence of how subprocesses work, but if you want to suggest a doc update I'll take a look at it.

    I suspect your Unix example is about the fragility of the rules for computing sys.executable (there's an issue in the tracker about that...you may get a different result on trunk), but I haven't checked it.

    @bitdancer bitdancer changed the title subprocess portability issue subprocess PATH semantics Apr 29, 2010
    @dabrahams
    Mannequin Author

    dabrahams mannequin commented Apr 30, 2010

    I wrote a Python script (enclosed) to methodically test how these things work, that doesn't rely on peculiarities of sys.executable. The tests did reveal some notable differences on *nix and 'doze:

    • When shell=False on Windows you must launch the process using a full filename (e.g. "foo.exe", not just "foo"; pass --invoke-filename to the script to enable that). This may seem obvious to you, but for me it was surprising that one executable lookup function (looking in PATH) is in effect but not the other (extending unqualified executable names). This should be spelled out in the docs.

    • On *nix, when shell=False and the executable is in neither the PATH of the environment at the time of Python's launch nor that of os.environ at the time of Popen, passing Popen an explicit env whose PATH includes the executable is enough to cause it to be found. Not so on 'doze.

    • On 'doze, when the executable is in the PATH of os.environ but not in that of Popen's explicit env argument, even with shell=False, no exception is raised (but returncode is nonzero).

    @bitdancer
    Member

    Well, it seems I was mistaken when I thought I knew how this worked :)
    Checking the os.exec documentation linked from the subprocess page, I see that when an environment is supplied PATH is indeed checked in it. The documentation for CreateProcess, however, indicates that PATH is ignored, and any extension must be supplied explicitly.
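To make the POSIX side of this concrete, here is a minimal sketch using os.get_exec_path(), which reports the directories searched for a named executable given an environment mapping. The helper name search_dirs_for and the example directories are assumptions made for illustration; they do not come from this issue or from probe.py.

```python
import os


def search_dirs_for(env=None):
    """Return the directories searched for a bare command name.

    Hypothetical helper: a thin wrapper over os.get_exec_path(), which
    reads PATH from the mapping it is given (or from os.environ when
    env is None).
    """
    return os.get_exec_path(env)


if __name__ == "__main__":
    # Illustrative child environment; the directories are made up.
    child_env = {"PATH": os.pathsep.join(["/tmp/zzz/bin", "/usr/bin"])}
    print(search_dirs_for(child_env))
```

This only describes the search list on the Python side; it says nothing about what CreateProcess does with PATH on Windows.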

    At the very least the docs should be updated to clarify that execvpe is used when an environment is passed on POSIX, and to link to the CreateProcess docs. A discussion of PATH could perhaps be put in a note or footnote (probably footnote, there are enough notes already in those docs!)

    I'm not sure how one creates a good portability story out of these pieces. It doesn't seem as though there is any way to harmonize the two, since we are dealing with the semantics of system calls over which we have no control.

    For reference, here is (a?) link to CreateProcess docs that I found via Google:

    http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx

    It doesn't look like the kind of link that one could trust to be stable, though, so I'm not sure if we should include it in the docs.

    I'm adding Brian Curtin as nosy to see if he knows whether or not there are permalink-like links to the relevant Microsoft documentation that we could use.

    @bitdancer bitdancer changed the title subprocess PATH semantics subprocess PATH semantics and portability Apr 30, 2010
    @briancurtin
    Member

    You could take the "(VS.85)" part out of the link, which gives the latest version; that may not always be the relevant version (although I doubt this specific API would change).

    That's about the best permalink-like feature you'll find, but overall, leaving the link as-is is pretty safe in my experience.

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 1, 2010

    @r.david.murray: did you try running my test? I think it shows that we are pretty darned close to fully portable. I believe we could fix Popen to make it fully portable pretty easily. In fact, there may be a pure-python fix. Documenting the differences would also not be hard. I would discourage you from relying *solely* on a description such as "uses execvpe on POSIX" to describe the semantics. Aside from being a nonportable description, it doesn't help anyone who isn't familiar with the POSIX system calls.

    @bitdancer
    Member

    I didn't run the script. I have now, but I'm not clear from its output what each test is actually doing, and don't really have the time to figure it out from the code right now.

    I think it is probably more efficient to just ask you what your suggestion is for making things more portable?

    As for the docs, the docs link to the os.exec python docs, which explain the PATH semantics. Linking to the Microsoft documentation would equivalently explain the Windows semantics. An explicit footnote discussing the differences in PATH behavior in the subprocess context would probably be helpful.

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 2, 2010

    I've uploaded a new probe.py that contains a win32 Popen wrapper that I think acts just like *nix's Popen w.r.t. PATH and environment (pass --fix to demonstrate). I suggest using this or an equivalent wrapper for Win32, and documenting the fact that with shell=False, filename extensions need to be supplied explicitly on Windows.

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 3, 2010

    Not to appear impatient, but...<bump>.
    It's a fairly tidy answer, I think :-)

    @bitdancer
    Member

    Sorry for my Windows ignorance, but if CreateProcess ignores the PATH, how does updating the PATH fix the problem?

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 4, 2010

    I'm probably as ignorant as you are of Windows issues. I just know what my experiments tell me: if you force the contents of any explicit 'env' argument into os.environ before calling Popen, you get the same behavior as on *nix.
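The idea described above can be sketched roughly as follows. This is not the wrapper from probe.py; the names patched_environ and popen_with_env_in_parent are hypothetical, and the approach assumes no other thread reads or writes os.environ while the child is being spawned.

```python
import contextlib
import os
import subprocess


@contextlib.contextmanager
def patched_environ(env):
    """Temporarily merge `env` into os.environ, restoring it afterwards."""
    saved = dict(os.environ)
    os.environ.update(env)
    try:
        yield
    finally:
        # Restore the exact previous environment, dropping added keys.
        os.environ.clear()
        os.environ.update(saved)


def popen_with_env_in_parent(args, env=None, **kwargs):
    """Spawn a child while `env` is also visible in the parent's os.environ.

    Hypothetical helper: the executable lookup performed on behalf of the
    calling process then sees the same PATH as the child will.
    """
    with patched_environ(env or {}):
        return subprocess.Popen(args, env=env, **kwargs)
```

Mutating os.environ is process-wide, so a wrapper like this trades thread safety for uniform lookup behavior.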

    @bitdancer
    Member

    Well, it wouldn't be the first time the Microsoft docs were wrong.

    There are two questions here: (1) is this behavior consistent across all Microsoft platforms we support? (2) is this *change* in behavior of Popen acceptable?

    For (1) we need a unit test added to the subprocess unit tests that can check this.

    For (2)...well, I think it would be good for the behavior to be as consistent as practical, so I'd be in favor. We need some second opinions, though, to make sure we aren't overlooking some negative consequence. I'm also not sure that this qualifies as a bug fix, so it may only be possible to get it into 3.2, assuming it is acceptable.

    Note that I have not tested your program on Windows myself, I'm taking your word for it that it works ;) I'll be more inclined to test things if the tests are in the form of unit tests, which should be much easier to understand than your test program.

    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 4, 2010

    R. David Murray wrote:

    > There are two questions here: (1) is this behavior consistent across all Microsoft platforms we support?

    I'll be honest: I don't know.

    > (2) is this *change* in behavior of Popen acceptable?

    I don't know that either.

    > I'll be more inclined to
    > test things if the tests are in the form of unit tests, which should
    > be much easier to understand than your test program.

    I guess no good deed goes unpunished ;-)

    I also guess that whether you think unit tests will be easier to
    understand depends on what kind of information you expect to glean
    from the code. My script was designed to probe for all
    inconsistencies between ‘doze and POSIX behaviors, and it is more
    revealing in that respect than a unit test would be. The unit test
    that would prompt the particular code change I'm suggesting would look
    more like:

    put directory X in the env argument's PATH (but not in os.environ)
    attempt to launch X/some_executable as simply “some_executable”
    assert that X/some_executable actually ran
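A minimal sketch of that test, assuming a POSIX system where /bin/sh exists. The class and method names are made up for illustration and are not taken from the existing subprocess test suite; a Windows counterpart would need a different way to create the executable and is not covered here.

```python
import os
import subprocess
import tempfile
import unittest


@unittest.skipUnless(os.name == "posix", "sketch covers the POSIX side only")
class EnvPathLookupTest(unittest.TestCase):
    def test_executable_found_via_env_path(self):
        with tempfile.TemporaryDirectory() as bin_dir:
            # Create X/some_executable as a tiny shell script.
            tool = os.path.join(bin_dir, "some_executable")
            with open(tool, "w") as f:
                f.write("#!/bin/sh\necho ran-from-env-path\n")
            os.chmod(tool, 0o755)

            # Directory X is only in the env argument's PATH,
            # not in os.environ's PATH.
            parent_path = os.environ.get("PATH", "").split(os.pathsep)
            self.assertNotIn(bin_dir, parent_path)

            # Launch it as simply "some_executable".
            out = subprocess.check_output(
                ["some_executable"], env={"PATH": bin_dir})

            # Assert that X/some_executable actually ran.
            self.assertEqual(out.strip(), b"ran-from-env-path")
```

Saved to a file, this can be run with python -m unittest; on non-POSIX platforms the test is reported as skipped.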
    

    I don't know what Popen's unit tests look like, and to be honest,
    right now I just don't have any more time to pour into this particular
    issue. Even if it doesn't get fixed in Python I'm going to be using
    my wrapper for uniformity. I hope everything I've done so far is
    useful to the community but if not, I still have what I need.

    @bitdancer
    Member

    Fair enough. Thank you for your detective work, and hopefully someone will be interested enough to pick this up again later.

    @bitdancer bitdancer added the stale Stale PR or inactive for long period of time. label May 4, 2010
    @dabrahams
    Mannequin Author

    dabrahams mannequin commented May 16, 2010

    New data point: in some contexts on Windows (not sure of the exact cause but I was dealing with multiple drives), even this workaround isn't enough. I ended up having to do something like this (i.e. manually search the path) on win32:

        import os
        import subprocess

        def full_executable_path(invoked, environ):
            # A name that already carries an extension is used as-is.
            if os.path.splitext(invoked)[1]:
                return invoked

            explicit_dir = os.path.dirname(invoked)
            # Join directories with the base name only, so an explicit
            # relative directory is not prepended twice.
            name = os.path.basename(invoked)

            if explicit_dir:
                path = [explicit_dir]
            else:
                # Tolerate an environment that has no PATH entry at all.
                path = environ.get('PATH', '').split(os.path.pathsep)

            extensions = environ.get(
                'PATHEXT',
                # Use *something* in case the environment variable is
                # not set.  These come from my machine's defaults
                '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.PSC1'
                ).split(os.path.pathsep)

            for dir in path:
                for ext in extensions:
                    full_path = os.path.join(dir, name + ext)
                    if os.path.exists(full_path):
                        return full_path
            return invoked  # Not found; invoking it will likely fail

        class Popen(subprocess.Popen):
            def __init__(
                self, args, bufsize=0, executable=None,
                stdin=None, stdout=None, stderr=None,
                preexec_fn=None, close_fds=False, shell=False,
                cwd=None, env=None,
                *args_, **kw):

                # Resolve the executable ourselves unless the caller gave
                # one explicitly or asked for a shell.
                if executable is None and not shell:
                    executable = full_executable_path(args[0], env or os.environ)
                super(Popen, self).__init__(
                    args, bufsize, executable, stdin, stdout, stderr,
                    preexec_fn, close_fds, shell, cwd, env, *args_, **kw)
    

    @asottile
    Mannequin

    asottile mannequin commented Mar 24, 2016

    Here's the workaround I'm opting for:

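    import distutils.spawn
    import sys

    if sys.platform == 'win32':
        # Replace the bare command name with the full path found on PATH.
        cmd = [distutils.spawn.find_executable(cmd[0])] + list(cmd[1:])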

    @eryksun
    Contributor

    eryksun commented Mar 25, 2016

    As is documented for CreateProcess [1], the search path always includes the following directories:

    * The directory from which the application loaded.
    * The current directory for the parent process.
    * The Windows system directory. Use the
      GetSystemDirectory function to get the path of
      this directory.
    * The 16-bit Windows system directory. There is no
      function that obtains the path of this directory,
      but it is searched. The name of this directory is
      System.
    * The Windows directory. Use the GetWindowsDirectory
      function to get the path of this directory.
    * The directories that are listed in the PATH
      environment variable.
    

    The value of PATH comes from the calling process environment, not from the environment passed in the lpEnvironment parameter. If you need to search some other list of paths, you can use shutil.which to find the fully qualified path of the target executable.
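As a rough illustration of the shutil.which suggestion, the sketch below searches the PATH that the child would receive rather than the PATH of the calling process. The helper name resolve_executable is hypothetical. Note that on Windows shutil.which applies its own rules (PATHEXT and, depending on the Python version, the current directory), so its result may not match the search that CreateProcess performs.

```python
import os
import shutil
import sys


def resolve_executable(name, env=None):
    """Return the full path of `name` found on env's PATH, or None.

    Hypothetical helper: when env is None the calling process's
    environment is searched instead.
    """
    search_env = os.environ if env is None else env
    # An empty or missing PATH makes shutil.which() return None.
    return shutil.which(name, path=search_env.get("PATH", ""))


if __name__ == "__main__":
    # Look up the running interpreter by bare name in a one-entry PATH.
    child_env = {"PATH": os.path.dirname(sys.executable)}
    print(resolve_executable(os.path.basename(sys.executable), child_env))
```

The resolved path can then be passed as the first element of args, with the same env handed to Popen, so the lookup and the child's environment agree.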

    Note that in Vista+ you can remove the current directory from the search list by defining the environment variable "NoDefaultCurrentDirectoryInExePath" [2].

    The following examples show the minimum search path that CreateProcess uses when PATH isn't defined.

        >>> 'PATH' in os.environ
        False
    
        >>> subprocess.call('python -Sc "import sys; print(sys.prefix)"')
        Breakpoint 0 hit
        KERNELBASE!SearchPathW:
        00007ff9`cf4b5860 488bc4          mov     rax,rsp
        0:000> du @rcx
        0000006c`a7074410  "C:\Program Files\Python35;.;C:\W"
        0000006c`a7074450  "indows\SYSTEM32\;C:\Windows\syst"
        0000006c`a7074490  "em;C:\Windows"
        0:000> g
        C:\Program Files\Python35
        0
        >>> os.environ['NoDefaultCurrentDirectoryInExePath'] = '1'
    
        >>> subprocess.call('python -Sc "import sys; print(sys.prefix)"')
        Breakpoint 0 hit
        KERNELBASE!SearchPathW:
        00007ff9`cf4b5860 488bc4          mov     rax,rsp
        0:000> du @rcx
        0000006c`a6560710  "C:\Program Files\Python35;C:\Win"
        0000006c`a6560750  "dows\SYSTEM32\;C:\Windows\system"
        0000006c`a6560790  ";C:\Windows"
        0:000> g
        C:\Program Files\Python35
        0

    Note that in the 2nd case the current directory ('.') is no longer present between the application directory ("C:\Program Files\Python35") and the system directory ("C:\Windows\SYSTEM32\").

    CreateProcess executes PE executables and batch files (run via the %ComSpec% interpreter). It automatically appends the .exe extension when searching for an executable. It does this via the lpExtension parameter of SearchPath [3].

    Some .com files are PE executables (e.g. chcp.com). Otherwise it's not really useful to loop over the PATHEXT extensions unless you're using shell=True, since most are filetypes that CreateProcess doesn't support [4].

    [4]: If Microsoft's Windows team cared at all about cross-platform
    idioms they'd add shebang support to CreateProcess, which
    would make all scripts, not just batch files, directly
    executable without requiring ShellExecuteEx and registered
    filetypes. ShellExecuteEx doesn't support a lot of useful
    creation flags that are only available by calling
    CreateProcess directly, and it also doesn't check ACLs to
    prevent executing a file. So scripts are second class
    citizens in Windows, which is why Python has to embed
    scripts in .exe wrappers.

    @eryksun eryksun added the 3.7 (EOL) end of life label Feb 4, 2017
    @pepalogik
    Mannequin

    pepalogik mannequin commented Jun 20, 2018

    A related issue exists with cwd: bpo-15533.

    @eryksun
    Contributor

    eryksun commented Mar 5, 2021

    The Popen() docs begin by explaining that it has "os.execvp()-like" behavior in POSIX and uses CreateProcess() in Windows. Personally, I do not think it's proper for Python's documentation to discuss details of how CreateProcess() handles lpCommandLine (args), lpApplicationName (executable), lpCurrentDirectory (cwd), and lpEnvironment (env). So maybe all this needs is to clearly map Popen() parameters to the corresponding CreateProcess() parameters.

    If Popen() implements a parameter on its own, then it makes sense to me to document the behavior. For example, in POSIX the behavior of cwd is implemented by Popen(), and documented as follows:

    In particular, the function looks for executable (or for the first 
    item in args) relative to cwd if the executable path is a relative
    path.
    

    This claim is not always true in POSIX since a base filename without a slash in it, which is a relative path, is not searched for in the current directory unless "." is in PATH. But the main problem with the above sentence is the lack of a disclaimer that it only applies to POSIX. In Windows, cwd is passed through directly as the lpCurrentDirectory of CreateProcess(). This parameter sets the working directory of the child process and has nothing to do with searching for an executable parsed out of lpCommandLine or resolving a relative path in lpApplicationName. It may affect the result with shell=True, but even in that case there are caveats. Regardless, Python doesn't do anything with cwd in Windows except pass it to CreateProcess(), so the cwd -> lpCurrentDirectory parameter mapping is all there is to document.
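One way for calling code to avoid depending on this difference is to make the executable path absolute before calling Popen. The sketch below is an assumption about how a caller might do that, not something Popen does itself; the helper name resolve_against_cwd is hypothetical. It leaves bare names untouched, since those are subject to the PATH search rather than to cwd.

```python
import os


def resolve_against_cwd(executable, cwd=None):
    """Anchor a relative executable path at `cwd`, returning an absolute path.

    Hypothetical helper. Bare names (no directory component) and
    absolute paths are returned unchanged.
    """
    has_dir = bool(os.path.dirname(executable))
    if cwd is not None and has_dir and not os.path.isabs(executable):
        return os.path.abspath(os.path.join(cwd, executable))
    return executable


if __name__ == "__main__":
    # Illustrative paths only; nothing needs to exist on disk.
    print(resolve_against_cwd(os.path.join("bin", "tool"), "/opt/app"))
    print(resolve_against_cwd("tool", "/opt/app"))
```

With the path resolved up front, the child still starts in cwd, but which file gets executed no longer depends on how the platform treats cwd during lookup.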

    @eryksun eryksun added 3.8 (EOL) end of life 3.9 only security fixes labels Mar 5, 2021
    @eryksun eryksun added 3.10 only security fixes type-feature A feature request or enhancement and removed 3.7 (EOL) end of life stale Stale PR or inactive for long period of time. type-bug An unexpected behavior, bug, or error labels Mar 5, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vstinner vstinner added the topic-subprocess Subprocess issues. label Jul 8, 2024
    @picnixz picnixz removed 3.10 only security fixes 3.9 only security fixes 3.8 (EOL) end of life labels Mar 1, 2025