Skip to content

gh-108948: tarfile should handle sticky bit in FreeBSD #108950

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 21 commits into from

Conversation

sorcio
Copy link
Contributor

@sorcio sorcio commented Sep 5, 2023

Instead of failing on FreeBSD, we emulate the behavior of other platforms. That means that if a file wants to set the sticky bit and we are not root, we just ignore it.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 5, 2023

I think this should be backported to 3.12 and 3.11 but it would be useful if #107902 is backported first.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that test_tarfile.test_modes() fails on FreeBSD without this change, and it does pass with this change.

But I don't think that the tarfile change is correct.

ethanfurman
ethanfurman previously approved these changes Sep 5, 2023
Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep the first exception and ignore the second chmod() exception:

    def chmod(self, tarinfo, targetpath):
        """Set file permissions of targetpath according to tarinfo.
        """
        if tarinfo.mode is None:
            return

        try:
            try:
                os.chmod(targetpath, tarinfo.mode)
            except OSError as exc:
                if not(hasattr(errno, "EFTYPE") and e.errno == errno.EFTYPE):
                    raise

                # gh-108948: On FreeBSD, chmod() fails when trying to set the
                # sticky bit on a file as a normal user. It's a noop in most
                # other platforms.
                try:
                    os.chmod(targetpath, tarinfo.mode & ~stat.S_ISVTX)
                except OSError:
                    # re-raise the first chmod() exception,
                    # ignore the second chmod() exception.
                    raise exc from None
                else:
                    # second chmod() succeeded
                    pass
        except OSError:
            raise ExtractError("could not change mode")

Enjoy the 3 levels of nested try/exception :-)

@ethanfurman
Copy link
Member

ethanfurman commented Sep 6, 2023

Will the first chmod failure have details about the sticky-bit? Because that would be inaccurate and could be a time-waster. It seems to me that the second one is more appropriate.

Also, the patch as-is doesn't check for root -- if the chmod is failing for the root user, there should be no need to try (and possibly succeed) without the sticky-bit.

@ethanfurman ethanfurman dismissed their stale review September 6, 2023 00:11

Concerns about failures as root user.

@vstinner
Copy link
Member

vstinner commented Sep 6, 2023

Will the first chmod failure have details about the sticky-bit?

Well, if you care about keeping all information, the 3 exceptions can easily be chained ;-) Maybe that the most accurate information ;-)

@sorcio
Copy link
Contributor Author

sorcio commented Sep 6, 2023

Will the first chmod failure have details about the sticky-bit? Because that would be inaccurate and could be a time-waster. It seems to me that the second one is more appropriate.

The first exception within the check is inevitably OSError: [Errno 79] Inappropriate file type or format. I don't think it provides useful information to the user (by itself) in case of a failure. Especially if we agree that tarfile is supposed to silently drop the sticky bit, as it does on Linux. Chaining sounds like a good option not to discard any information, as @vstinner suggested, although I'm worried that the first error is only confusing if one doesn't read tarfile source.

Note: I checked what BSD tar does. When running as non-root, it defaults to ignore "SGID, SUID, sticky bit, file attributes or file flags, extended file attributes and ACLs", with an option to reverse the behavior. When enabling this option (default on root) it indeed fails with an error similar to Can't set permissions to 01777: Inappropriate file type or format. GNU tar is basically the same. This seems different than what we are doing without any "extraction filter", and for this patch I was more keen on replicating what happens on Linux than matching what the tar utility does.

Also, the patch as-is doesn't check for root -- if the chmod is failing for the root user, there should be no need to try (and possibly succeed) without the sticky-bit.

The retry only happens on EFTYPE. Among the supported platforms, I think only FreeBSD sets this error. It's documented to only happen on one very specific case:

[EFTYPE] The effective user ID is not the super-user, the mode includes the sticky bit (S_ISVTX), and path does not refer to a directory.

If we are to trust this, EFTYPE already implies non-root. The only issue I see is that this is platform-specific and some other system might use EFTYPE to mean something different. I don't mind adding a safeguard, be it against euid or sys.platform, if you think it's useful.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 6, 2023

I was trying to add a test case to cover the branch where the two exceptions are chained. But I cannot think of a way to make chmod fail that wouldn't trigger an error before getting to chmod. Opening the file for write is enough to fall into any of the error cases, except for EFTYPE itself, and the EPERM one related to groups which we can't test in general because it needs a specific setup. Good news is that this is such a corner case that it wouldn't matter if the traceback is not perfectly clear, and I feel we are already doing more than enough.

@ethanfurman
Copy link
Member

@sorcio, thank you for doing the extra research! LGTM.

@vstinner
Copy link
Member

vstinner commented Sep 6, 2023

I was trying to add a test case to cover the branch where the two exceptions are chained. But I cannot think of a way to make chmod fail that wouldn't trigger an error before getting to chmod.

You can mock the os.chmod() function for the duration of your test, so you can control exactly how it works.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 15, 2023

Thanks for the review feedback @vstinner! I applied all your suggestions and replied to some of your comments.

I could merge / rebase main to add the FreeBSD CI runner, since it's one of those PRs where it's directly useful. But I haven't yet because Cirrus is blocking new runs for the month.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@encukou: Oh good, it seems like filter='data' already clears the sticky bit in _get_filtered_attrs():

        # Strip high bits & group/other write bits
        mode = mode & 0o755

Is it a new functional test needed to ensure that the sticky bit is ignored with filter='data'?

Lib/tarfile.py Outdated
Comment on lines 2591 to 2592
except OSError as e:
raise ExtractError("could not change mode") from e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except OSError as e:
raise ExtractError("could not change mode") from e
except OSError as exc3:
raise ExtractError("could not change mode") from exc3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -690,6 +691,36 @@ def format_mtime(mtime):
tar.close()
os_helper.rmtree(DIR)

@unittest.skipUnless(hasattr(errno, "EFTYPE"), "errno.EFTYPE required")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to mock this constant to test the code on other platforms?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to go with it. Maybe it's overkill for such a niche case, but more tests rarely hurt. All new tests never skip.

self.addCleanup(os_helper.rmtree, DIR)
with tar:
# this should not raise:
tar.extract("sticky1", DIR, filter="fully_trusted")
Copy link
Member

@vstinner vstinner Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you should split this test in two tests:

  • a method which doesn't mock anything and jsut check that tarfile works as expected: this line. -- functional test
  • another method which mocks chmod() to test the complicated EFTYPE code path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Again, maybe overkill, but it's three tests now.

other_error = OSError(errno.EPERM, "different error")
mock.side_effect = [eftype_error, other_error]
with self.assertRaises(tarfile.ExtractError) as excinfo:
tar.extract("sticky2", DIR, filter="fully_trusted")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since tarfile implementation now uses a complicated triple-nested try block, would it be possible to test also the othe code path: EFTYPE error but then the second chmod() succeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is done now.

@@ -0,0 +1 @@
On FreeBSD, :mod:`tarfile` will not attempt to set the sticky bit on extracted files when it's not possible, matching the behavior of other platforms.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum. I suggest to explain it differently.

:mod:`tarfile`: On FreeBSD, if the sticky bit cannot be set,
if the user is not root for example, clear the sticky bit,
matching the behavior of other platforms.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be worth it to add a note about the sticky bit in https://docs.python.org/dev/library/tarfile.html#tarfile.TarFile.extract documentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with Ethan's suggestion for the blurb.

I also added a note, but under the filters documentation, so if a user searches for "sticky" or "bits" or similar keywords they find relevant information right there. I can change it or add more, of course.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 15, 2023

The macOS runner failure is a funny bit of history because in the documentation for chmod they still state that "The sticky bit may only be set by the super user on shareable executable files" but in fact that's not true anymore. You can set the bit on any file as unprivileged user (assuming the file system supports it) and it has no special meaning. One free bit per file, in case you needed extra storage.

As far as I can gather, this has been the case since OS X Tiger (10.4, 2005) and the last mention I see of a similar check was in the previous release where, apparently, it used to return EFTYPE :) Nobody ever bothered to fix the ancient BSD documentation, and we get the EFTYPE constant in the headers as a remnant, because I don't think it's used in any way in user space code.

Anyway, I will need to add more branching and possibly double check what other BSDs are doing.

@vstinner
Copy link
Member

Why is the test executed so many times? It cannot be defined somewhere else to only be executed once?

For macOS, modify the test to check mode expected on macOS, no?

@sorcio
Copy link
Contributor Author

sorcio commented Sep 16, 2023

Why is the test executed so many times? It cannot be defined somewhere else to only be executed once?

Many/most of the tarfile tests are run once per compression format. But it's good that you asked the question because I've realized that I'm not using the compression format at all in the new test case. I had put it in a different place at first, then moved it closer to other similar tests, but forgot to adapt it. I will move it again.

For macOS, modify the test to check mode expected on macOS, no?

Sure! I haven't decided yet how to keep the test readable, but I'll come up with something. The note was mostly for historical/fun purposes, and to write it down somewhere before I forget :)

@sorcio
Copy link
Contributor Author

sorcio commented Sep 16, 2023

Following the insightful feedback from @vstinner I decided to be a lot less minimalist with my approach to testing. The fix itself is basically the same, but it's overly tested now.

The mock tests make me a bit nervous because of course they depend on implementation details (when and how os.chmod gets called) but there are other tests. test_extract_chmod_eftype_success is a bit redundant because it's already covered by TestExtractionFilters.test_modes. I don't mind, because at least it's explicit about what we are testing (and it sets errorlevel=2).

About the helper function I added: is it a bad idea? Does it belong in test.support.os_helper? Should the helper itself be tested?

As a reminder:

I think this should be backported to 3.12 and 3.11 but it would be useful if #107902 is backported first.


Although the filter preserves all permission bits unchanged, the sticky
bit (:const:`~stat.S_ISVTX`) will not be set on extracted files when the
system does not allow it. For compatibility reasons, no error is raised
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
system does not allow it. For compatibility reasons, no error is raised
system does not allow it for the current user. For compatibility reasons, no error is raised

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, yeah, it's right that running as root will set the sticky bit in the cases we are testing. But I don't want to suggest it's a permissions issue (which would likely raise EPERM), or to imply more than is necessary. E.g. I think file system implementations on Linux are allowed to silently drop the bit if they don't support it regardless of the user. What do you think is worth clarifying in this context?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ignore my suggestion if you think that it's irrelevant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rephrased ever so slightly for readability, but I avoided the "for the current user" part because it feels out of place.

mode = os.stat(f.fileno()).st_mode
return bool(mode & stat.S_ISVTX)
finally:
os.unlink(filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's safer to use os_helper.unlink() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.



def can_chmod_set_sticky():
"""Can chmod set the sticky bit on a file?"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain in the docstring that it returns False if chmod() doesn't fail but ignores the sticky bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

def can_chmod_set_sticky_inner():
if not os_helper.can_chmod():
return False
filename = os_helper.TESTFN + "-chmod-sticky"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly, just use "chmod-sticky", to avoid further troubles with non-ASCII filenames.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided not to do it, and instead just use TESTFN like can_chmod() does. The concern is fair but I don't want to touch os_helper more than necessary.

_can_chmod_set_sticky = None


def can_chmod_set_sticky():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should move it to support.os_helper.

I suggest to rename to can_chmod_set_sticky_bit().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@encukou
Copy link
Member

encukou commented Sep 18, 2023

I am not a tarfile expert, nor do I want to become one. My contribution -- filters -- was limited in scope. It did introduce tests that revealed some surprises, and I'm happy to modify the tests to work on OSs we don't formally support.

With that out of the way: I'm wary of this change. Adding OS-specific chunks of code to tarfile makes it less maintainable and predictable.
If chmod itself had some kind of drop_invalid_bits kwarg, then tarfile should at least have an options for it, but silently dropping it in tarfile seems like second-gusessing the OS.
Also:

  • it's technically a backwards-incompatible change (it'll probably not break anyone's CI, but good luck finding it if it does!)
  • when unpacking cross-platform tarballs you should now use tar_filter and data_filter -- if you aren't, chances are that you actually want to set the exotic bits. (Note that tar_filter is the one that's designed to roughly match what the tar utility does.)
  • I don't remember seeing any complaints about this in a decade, which would suggests that no one needs the retrying behaviour.

But again: I'm not the expert. If another core dev wants to maintain the retry, go ahead.

@sorcio, thank you for your work, and sorry that I didn't catch this before you made the PR.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 18, 2023

@encukou no worries, and thanks a lot for the detailed feedback! I had a hunch there could be a problem with this change, and I regret not bringing it up more explicitly (I don't care that much about the behavior, and I just wanted tests to pass on FreeBSD). The difficulty documenting the change in an unambiguous way should have been a red flag.

I don't think chmod will ever have such an option (I doubt it would be a good idea). The issue is that this is an obscure difference that depends on OS and file system implementation. It's pretty opaque from the point of view of a high level library. So I thought that smoothing it out at the library level seemed a way to make it nicer to users.

Your comment on backwards compatibility is what settles it for me. It seems unlikely that a program ever depended on this behavior; but if it did depend on it, the proposed change would be a nightmare to debug. And if a bug is caused by this, extraction filters are the solution available to users.

I can change the PR to only modify the original failing test (TestExtractionFilters.test_modes) so that it succeeds on FreeBSD, if the new helper is still considered useful. Otherwise I'm also happy to drop it :)

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this PR took longer what I expected, since it's a core feature of tarfile, and a security sensitive topic. But I prefer to take my time :-) You enhanced the doc and the tests, that's great!

Here is I hope my last review :-) We should be good to merge soon.


global _can_chmod_set_sticky_bit
if _can_chmod_set_sticky_bit is None:
_can_chmod_set_sticky_bit = can_chmod_set_sticky_bit_inner()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I suggest you to mimick code around in os_helper.py: other "can" functions also have a cache (global variable), but they don't use an "inner" function.

tar.extract(file_name, self.test_dir, filter="fully_trusted")
self.extracted_path = os.path.join(self.test_dir, file_name)

def test_extract_chmod_eftype_success(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On most platforms, this test willl be run without EFTYPE. I suggest to remove it:

Suggested change
def test_extract_chmod_eftype_success(self):
def test_extract_chmod_success(self):

self.assertEqual(got_mode, expected_mode)

@unittest.mock.patch.object(errno, "EFTYPE", fake_EFTYPE, create=True)
def test_extract_chmod_eftype_mock(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_extract_chmod_eftype_mock(self):
def test_extract_mock_chmod_eftype_success(self):

@unittest.mock.patch.object(errno, "EFTYPE", fake_EFTYPE, create=True)
def test_extract_chmod_eftype_mock(self):
# Like test_extract_chmod_eftype_success but mock chmod so we can
# explicitly test the case we want, regardless of the OS.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to describe which code path is tested:

Test chmod() failing with OSError(EFTYPE) and then pass when re-run without the sticky bit.

self.extracted_path = os.path.join(self.test_dir, file_name)

def test_extract_chmod_eftype_success(self):
# Extracting a file as non-root should skip the sticky bit (gh-108948)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Extracting a file as non-root should skip the sticky bit (gh-108948)
# gh-108948: Extracting a file as non-root should skip the sticky bit

self.assertEqual(chmod_mode1 | chmod_mode2, chmod_mode1)

@unittest.mock.patch.object(errno, "EFTYPE", fake_EFTYPE, create=True)
def test_extract_chmod_eftype_error(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_extract_chmod_eftype_error(self):
def test_extract_mock_chmod_eftype_error(self):


@unittest.mock.patch.object(errno, "EFTYPE", fake_EFTYPE, create=True)
def test_extract_chmod_eftype_error(self):
# Take care that any other error is preserved in the EFTYPE case.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to add first (but preserve your longer comment):

Test chmod() failing with OSError(EFTYPE) and then raise PermissionError() when re-run without the sticky bit.

Comment on lines +3119 to +3120
self.assertEqual(excinfo.exception.__cause__, other_error)
self.assertEqual(excinfo.exception.__cause__.__cause__, eftype_error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.assertEqual(excinfo.exception.__cause__, other_error)
self.assertEqual(excinfo.exception.__cause__.__cause__, eftype_error)
self.assertIs(excinfo.exception.__cause__, other_error)
self.assertIs(excinfo.exception.__cause__.__cause__, eftype_error)

@sorcio
Copy link
Contributor Author

sorcio commented Sep 19, 2023

Hey @vstinner, thanks again for the review feedback! Can we have a decision on #108950 (comment) before we go on?

@vstinner
Copy link
Member

If chmod itself had some kind of drop_invalid_bits kwarg, then tarfile should at least have an options for it, but silently dropping it in tarfile seems like second-gusessing the OS.

A sticky bit is a valid bit. The problem is that on FreeBSD, a regular user is not allowed to set it. Example:

filename="myfile"
open(filename, "x").close()
import os, stat
old_mode=os.stat(filename).st_mode
sticky_mode=old_mode | stat.S_ISVTX
os.chmod(filename, sticky_mode)
new_mode=os.stat(filename).st_mode
print("old_mode:", stat.filemode(old_mode))
print("new_mode:", stat.filemode(new_mode))

On Linux, I get:

$ ./python script.py 
old_mode: -rw-r--r--
new_mode: -rw-r--r-T

On FreeBSD, I get:

$ ./python script.py 
Traceback (most recent call last):
  File "/usr/home/vstinner/python/main/script.py", line 6, in <module>
    os.chmod(filename, sticky_mode)
OSError: [Errno 79] Inappropriate file type or format: 'myfile'

where 79 is errno.EFTYPE.

On FreeBSD, if I run the script as root, it works as expected:

$ sudo ./python script.py 
old_mode: -rw-r--r--
new_mode: -rw-r--r-T

Now, if I create a tar archive, remove the file, and decompress the archive as a regular user... oh! the sticky bit is gone!

vstinner@freebsd$ tar -cvf archive.tar myfile
a myfile

vstinner@freebsd$ tar -tvf archive.tar 
-rw-r--r-T  0 root   vstinner    0 21 sept. 21:43 myfile

vstinner@freebsd$ rm myfile 
remove myfile? y

vstinner@freebsd$ tar -xvf archive.tar 
x myfile

vstinner@freebsd$ ls -lhs myfile 
0 -rw-r--r--  1 vstinner  vstinner     0B 21 sept. 21:43 myfile

This behavior is specific to the tar command. Also, if I decompress the archive as root, I get the sticky bit:

vstinner@freebsd$ rm myfile 
vstinner@freebsd$ sudo tar -xvf archive.tar 
x myfile

vstinner@freebsd$ ls -l myfile 
-rw-r--r-T  1 root  vstinner  0 21 sept. 21:43 myfile

If I look at syscalls calls by tar as a regular user, I get:

vstinner@freebsd$ truss -o trace tar -xvf archive.tar 
vstinner@freebsd$ tail -n12 trace 
openat(AT_FDCWD,".",O_RDONLY|O_DIRECTORY|O_CLOEXEC|O_PATH,00) = 4 (0x4)
fcntl(4,F_GETFD,)				 = 1 (0x1)
fstatat(4,"myfile",0x385d45b79b90,AT_SYMLINK_NOFOLLOW) ERR#2 'No such file or directory'
close(4)					 = 0 (0x0)
openat(AT_FDCWD,"myfile",O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC,0644) = 4 (0x4)
fcntl(4,F_GETFD,)				 = 1 (0x1)
futimens(4,{ 1695332821.000000000, 1695332632.000000000 }) = 0 (0x0)
close(4)					 = 0 (0x0)
write(2,"\n",1)					 = 1 (0x1)
close(3)					 = 0 (0x0)
exit(0x0)					
process exit, rval = 0

The file myfile is created by openat() with mode 0644: without the sticky bit.

The tar program on FreeBSD comes from the libarchive project.

libarchive clears bits unless -p command line option is used:

		/*
		 * User didn't request full permissions, so don't
		 * restore SUID, SGID bits and obey umask.
		 */
		a->mode &= ~S_ISUID;
		a->mode &= ~S_ISGID;
		a->mode &= ~S_ISVTX;
		a->mode &= ~a->user_umask;

See: https://github.com/libarchive/libarchive/blob/67a20ce8f0ba22f06fb74119b4459ac69ade643b/libarchive/archive_write_disk_posix.c#L669-L678

So libarchive implementation is different: it doesn't catch EFTYPE. It doesn't call chmod() with S_ISVTX. It clears S_ISVTX before calling chmod() unless -p option is used.

@vstinner
Copy link
Member

Oh, Linux tar also clears the sticky bit:

vstinner@mona$ ./python script.py 
old_mode: -rw-r--r--
new_mode: -rw-r--r-T

vstinner@mona$ tar -cvf archive.tar myfile
myfile

vstinner@mona$ tar -tvf archive.tar 
-rw-r--r-T vstinner/vstinner 0 2023-09-22 00:10 myfile

vstinner@mona$ rm myfile 
vstinner@mona$ tar -xvf archive.tar 
myfile

vstinner@mona$ ls -l myfile 
-rw-r--r--. 1 vstinner vstinner 0 22 sept. 00:10 myfile

The sticky bit is only preserved if asked explicitly:

vstinner@mona$ rm myfile 
vstinner@mona$ tar -xpvf archive.tar 
myfile

vstinner@mona$ ls -l myfile 
-rw-r--r-T. 1 vstinner vstinner 0 22 sept. 00:10 myfile

@vstinner
Copy link
Member

While this PR mimicks Linux tar and FreeBSD tar, I'm no longer sure that trying chmod() with the sticky bit, catch EFTYPE error, and retry chmod() without the sticky mode, is the right approach.

Maybe Python tarfile should move closer to FreeBSD tar (libarchive) and Linux tar (GNU tar) behavior: clear S_ISUID, S_ISGID and S_ISVTX bits, and respect the user umask, unless -p / --preserve-permissions / --same-permissions option is used.

Problem: that would be a backward incompatible change, and currently Python doesn't have a --preserve-permissions option.

To unblock the FreeBSD buildbot, I suggest a trivial change: PR #109697 skips test_modes() if chmod() fails with EFTYPE.

I'm not sure that fixing test_tarfile on FreeBSD should drive tarfile design in general.

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, I misunderstood the issue and the issue should be analyzed deeper to come with a better fix.

@sorcio
Copy link
Contributor Author

sorcio commented Sep 22, 2023

@vstinner

Maybe Python tarfile should move closer to FreeBSD tar (libarchive) and Linux tar (GNU tar) behavior: clear S_ISUID, S_ISGID and S_ISVTX bits, and respect the user umask, unless -p / --preserve-permissions / --same-permissions option is used.

Problem: that would be a backward incompatible change, and currently Python doesn't have a --preserve-permissions option.

Maybe I misunderstand you, but this is what extraction filters do. --preserve-permissions is equivalent to the fully_trusted filter. So users already have a library-provided way to deal with an EFTYPE.

As I said, I don't think this change is a good idea anymore, and I'm sorry I didn't give it more thought before. Skipping the test on EFTYPE is fine. Maybe even better is to only skip setting the sticky bit, and not the whole test case.

@vstinner
Copy link
Member

@sorcio:

Maybe I misunderstand you, but this is what extraction filters do. --preserve-permissions is equivalent to the fully_trusted filter. So users already have a library-provided way to deal with an EFTYPE.

After testing FreeBSD tar and Linux tar, reading your PR and took time to think about the issue, I think that yes, EFTYPE is the expected behavior for fully_trusted filter.

FreeBSD tar and Linux tar both uses something like Python tar_filter filter: for example, remove the sticky bit.

Changing the default filter should be discussed separately.

I now think that skipping the test if chmod() cannot set the sticky bit is the right test_tarfile fix.

I close your issue. Thanks for spending time to investigate this complicate issue!

Thanks @encukou for your feedback!

@vstinner vstinner closed this Sep 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants