Skip to content

Reliable version of os.replace for Windows (wait/retry loop) #3239

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 14 commits into from
4 changes: 2 additions & 2 deletions mypy/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -901,7 +901,7 @@ def write_cache(id: str, path: str, tree: MypyFile,
f.write(data_str)
f.write('\n')
data_mtime = os.path.getmtime(data_json_tmp)
os.replace(data_json_tmp, data_json)
util.replace(data_json_tmp, data_json)
manager.trace("Interface for {} has changed".format(id))

mtime = st.st_mtime
Expand All @@ -927,7 +927,7 @@ def write_cache(id: str, path: str, tree: MypyFile,
json.dump(meta, f, indent=2, sort_keys=True)
else:
json.dump(meta, f)
os.replace(meta_json_tmp, meta_json)
util.replace(meta_json_tmp, meta_json)

return interface_hash

Expand Down
105 changes: 105 additions & 0 deletions mypy/test/testutil.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,105 @@
from typing import Iterator, List, Tuple, IO
import time
import os
import sys
import tempfile
from contextlib import contextmanager
from threading import Thread
from unittest import TestCase, main, skipUnless
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, turns out mypy.test, unittest, and pytest are currently being used...ouch...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but Max didn't start that. IIUC we want to kill mypy.test, but it's low priority (not sure if there's even an issue) and pytest works well enough with unittest (also, personally, I'm more used to the unittest way of writing tests -- let pytest just be the test runner).

from mypy import util
from mypy.build import random_string


WIN32 = sys.platform.startswith("win")


@skipUnless(WIN32, "only relevant for Windows")
class WindowsReplace(TestCase):
tmpdir = tempfile.TemporaryDirectory(prefix='mypy-test-',
dir=os.path.abspath('tmp-test-dirs'))
# Choose timeout value that would ensure actual wait inside util.replace is close to timeout
timeout = 0.0009 * 2 ** 10
short_lock = timeout / 4
long_lock = timeout * 2

threads = [] # type: List[Thread]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should become an instance variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was that the entire file locking arrangement is shared across all the test* methods of WIndowsReplace test case. Specifically, I want to wait for all the file locks to expire at the end, rather than wait for the file lock in each test to expire. The former is much more efficient: while the other tests are running, old file locks will naturally use up part (or even all) of their duration, so the wait at the end will be much shorter. If we adopt this approach, I need threads to be class-level since WindowsReplace is instantiated as many times as there are test* methods in it. It also means I must not delete the tmpdir until the end of the entire test case, i.e., until tearDownClass rather than instance-level tearDown. Is there a disadvantage to this approach?

Note: I temporarily reduced the number of test* metods to 1 due to issues with xdist sending different tests to different workers (and thus defeating the entire mechanism I described in the previous paragraph). But this is a horrible arrangement; once I figure out how to resolve the xdist issue, I definitely want to follow the standard practice of keeping each test in its own method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back to having 4 tests, since otherwise it's too messy. There's no semantic problem with that, and I'll deal with the extra few seconds in test runtime later.

But with this, I'd like to keep the class-level attributes if you don't see an issue with them; both semantically and performance-wise, I don't want to force each individual test to wait for the threads to expire.


@classmethod
def close_file_after(cls, file: IO, delay: float) -> Thread:
"""Start a background thread to close file after delay sec."""
def _close_file_after() -> None:
time.sleep(delay)
file.close()

t = Thread(target=_close_file_after, daemon=True)
cls.threads.append(t)
t.start()
return t

@classmethod
def tearDownClass(cls) -> None:
# Need to wait for threads to complete, otherwise we'll get PermissionError
# at the end (whether tmpdir goes out of scope or we explicitly call cleanup).
for t in cls.threads:
t.join()
cls.tmpdir.cleanup()

def prepare_src_dest(self, src_lock_duration: float, dest_lock_duration: float
) -> Tuple[str, str]:
"""Create two files in self.tmpdir random names (src, dest) and unique contents;
then spawn two threads that lock each of them for a specified duration.

Return a tuple (src, dest).
"""
src = os.path.join(self.tmpdir.name, random_string())
dest = os.path.join(self.tmpdir.name, random_string())

for fname, delay in zip((src, dest), (src_lock_duration, dest_lock_duration)):
f = open(fname, 'w')
f.write(fname)
if delay:
self.close_file_after(f, delay)
else:
f.close()

return src, dest

def replace_ok(self, src_lock_duration: float, dest_lock_duration: float,
timeout: float) -> None:
"""Check whether util._replace, called with a specified timeout,
worked successfully on two newly created files locked for specified
durations.

Return True if the replacement succeeded.
"""
src, dest = self.prepare_src_dest(src_lock_duration, dest_lock_duration)
util._replace(src, dest, timeout=timeout)
# Note that dest handle may still be open but reading from it is ok.
with open(dest) as f:
self.assertEqual(f.read(), src, 'replace failed')

def test_no_locks(self) -> None:
# No files locked.
self.replace_ok(0, 0, self.timeout)

def test_original_problem(self) -> None:
# Make sure we can reproduce https://github.com/python/mypy/issues/3215 with our setup.
src, dest = self.prepare_src_dest(self.short_lock, 0)
with self.assertRaises(PermissionError):
os.replace(src, dest)

def test_short_locks(self) -> None:
# Lock files for a time short enough that util.replace won't timeout.
self.replace_ok(self.short_lock, 0, self.timeout)
self.replace_ok(0, self.short_lock, self.timeout)

def test_long_locks(self) -> None:
# Lock files for a time long enough that util.replace times out.
with self.assertRaises(PermissionError):
self.replace_ok(self.long_lock, 0, self.timeout)
with self.assertRaises(PermissionError):
self.replace_ok(0, self.long_lock, self.timeout)


if __name__ == '__main__':
main()
43 changes: 42 additions & 1 deletion mypy/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,14 @@

import re
import subprocess
import os
import sys
import time
from xml.sax.saxutils import escape
from typing import TypeVar, List, Tuple, Optional, Sequence, Dict
from typing import (
TypeVar, List, Tuple, Optional, Sequence, Dict, Callable, Iterable, Type, Union,
AnyStr, Generic
)


T = TypeVar('T')
Expand Down Expand Up @@ -134,3 +140,38 @@ def id(self, o: object) -> int:
self.id_map[o] = self.next_id
self.next_id += 1
return self.id_map[o]


if sys.version_info >= (3, 6):
PathType = Union[AnyStr, 'os.PathLike[AnyStr]']
else:
# Workaround to allow the use of generic in _replace
class _PathType(Generic[AnyStr]): ...
PathType = _PathType[AnyStr]


def _replace(src: PathType[AnyStr], dest: PathType[AnyStr], timeout: float = 1) -> None:
"""Replace src with dest using os.replace(src, dest).

Wait and retry if OSError exception is raised.

Increase wait time exponentially until total wait of timeout sec;
on timeout, give up and reraise the last exception seen.
"""
wait = 0.001
while True:
try:
os.replace(src, dest)
return
except PermissionError:
# The actual total wait might be up to 2x larger than timeout parameter
if wait > timeout:
raise
time.sleep(wait)
wait *= 2


if sys.platform.startswith("win"):
replace = _replace
else:
replace = os.replace
1 change: 1 addition & 0 deletions runtests.py
Original file line number Diff line number Diff line change
Expand Up @@ -215,6 +215,7 @@ def add_imports(driver: Driver) -> None:
'testdiff',
'testfinegrained',
'testmerge',
'testutil',
]]


Expand Down