Reliable version of os.replace for Windows (wait/retry loop) #3239


Closed
wants to merge 14 commits
4 changes: 2 additions & 2 deletions mypy/build.py
@@ -901,7 +901,7 @@ def write_cache(id: str, path: str, tree: MypyFile,
f.write(data_str)
f.write('\n')
data_mtime = os.path.getmtime(data_json_tmp)
- os.replace(data_json_tmp, data_json)
+ util.replace(data_json_tmp, data_json)
manager.trace("Interface for {} has changed".format(id))

mtime = st.st_mtime
@@ -927,7 +927,7 @@ def write_cache(id: str, path: str, tree: MypyFile,
json.dump(meta, f, indent=2, sort_keys=True)
else:
json.dump(meta, f)
- os.replace(meta_json_tmp, meta_json)
+ util.replace(meta_json_tmp, meta_json)

return interface_hash

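Both hunks above follow the same write-to-temp-then-replace pattern. As a standalone sketch of that pattern (the name `atomic_write` is illustrative, not mypy's):

```python
import os
import tempfile


def atomic_write(path: str, data: str) -> None:
    """Write data to path by writing a temp file in the same directory,
    then renaming it over the destination in a single step."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix='.tmp-')
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
        # Atomic on POSIX; on Windows this can raise PermissionError if the
        # destination is open elsewhere -- the problem this PR retries around.
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

Writing the temp file in the same directory as the destination matters: `os.replace` is only atomic within a single filesystem.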
103 changes: 103 additions & 0 deletions mypy/test/testutil.py
@@ -0,0 +1,103 @@
from typing import Iterator, List, Tuple, IO
import time
import os
import sys
import tempfile
from contextlib import contextmanager
from threading import Thread
from unittest import TestCase, main, skipUnless
Contributor:
Wow, turns out mypy.test, unittest, and pytest are currently being used...ouch...

Member:
Yeah, but Max didn't start that. IIUC we want to kill mypy.test, but it's low priority (not sure if there's even an issue) and pytest works well enough with unittest (also, personally, I'm more used to the unittest way of writing tests -- let pytest just be the test runner).

from mypy import util
from mypy.build import random_string


WIN32 = sys.platform.startswith("win")


@skipUnless(WIN32, "only relevant for Windows")
class WindowsReplace(TestCase):
tmpdir = tempfile.TemporaryDirectory(prefix='mypy-test-',
dir=os.path.abspath('tmp-test-dirs'))
# Choose a timeout value that ensures the actual wait inside util.replace is close to the timeout.
timeout = 0.0009 * 2 ** 10
short_lock = timeout / 4
long_lock = timeout * 2

threads = [] # type: List[Thread]
Member:
What would happen if you made tmpdir and threads just instance variables, resetting and cleaning them for each test case rather than once for all test cases in the class? You have only one test_* method anyways, so...???

Member:
I'd still like you to ponder this, but it's optional at this point.

Member:
(FWIW that was actually an old comment that GitHub had somehow hung on to. As were several other comments on code you had already ripped out... Sorry about the noise!)

Member:
This should become an instance variable.

Contributor (author):
My idea was that the entire file locking arrangement is shared across all the test* methods of the WindowsReplace test case. Specifically, I want to wait for all the file locks to expire at the end, rather than wait for the file lock in each test to expire. The former is much more efficient: while the other tests are running, old file locks will naturally use up part (or even all) of their duration, so the wait at the end will be much shorter. If we adopt this approach, I need threads to be class-level, since WindowsReplace is instantiated as many times as there are test* methods in it. It also means I must not delete the tmpdir until the end of the entire test case, i.e., in tearDownClass rather than instance-level tearDown. Is there a disadvantage to this approach?

Note: I temporarily reduced the number of test* methods to 1 due to issues with xdist sending different tests to different workers (and thus defeating the entire mechanism I described in the previous paragraph). But this is a horrible arrangement; once I figure out how to resolve the xdist issue, I definitely want to follow the standard practice of keeping each test in its own method.

Contributor (author):
I went back to having 4 tests, since otherwise it's too messy. There's no semantic problem with that, and I'll deal with the extra few seconds in test runtime later.

But with this, I'd like to keep the class-level attributes if you don't see an issue with them; both semantically and performance-wise, I don't want to force each individual test to wait for the threads to expire.


@classmethod
def close_file_after(cls, file: IO, delay: float) -> Thread:
"""Start a background thread to close file after delay sec."""
def _close_file_after() -> None:
time.sleep(delay)
file.close()

t = Thread(target=_close_file_after, daemon=True)
cls.threads.append(t)
t.start()
return t

@classmethod
def tearDownClass(cls) -> None:
Member:
Please switch to (instance-level) tearDown.

# Need to wait for threads to complete, otherwise we'll get PermissionError
# at the end (whether tmpdir goes out of scope or we explicitly call cleanup).
for t in cls.threads:
t.join()
cls.tmpdir.cleanup()

def prepare_src_dest(self, src_lock_duration: float, dest_lock_duration: float
) -> Tuple[str, str]:
"""Create two files in self.tmpdir with random names (src, dest) and unique contents;
then spawn two threads that lock each of them for a specified duration.

Return a tuple (src, dest).
"""
src = os.path.join(self.tmpdir.name, random_string())
dest = os.path.join(self.tmpdir.name, random_string())

for fname, delay in zip((src, dest), (src_lock_duration, dest_lock_duration)):
f = open(fname, 'w')
f.write(fname)
if delay:
self.close_file_after(f, delay)
else:
f.close()

return src, dest

def replace_ok(self, src_lock_duration: float, dest_lock_duration: float,
Member:
Add a docstring for this.

timeout: float) -> None:
"""Check that util._replace, called with the specified timeout,
succeeds on two newly created files locked for the specified
durations; assert that dest ends up with src's contents.
"""
src, dest = self.prepare_src_dest(src_lock_duration, dest_lock_duration)
util._replace(src, dest, timeout=timeout)
self.assertEqual(open(dest).read(), src, 'replace failed')
Member:
Should use with open...; this code will throw a ResourceWarning on recent versions of Python.
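A read along the lines the reviewer suggests might look like this (a sketch, not code from the PR; the helper name is hypothetical):

```python
def read_text(path: str) -> str:
    # A with-block closes the handle deterministically, so no
    # ResourceWarning is emitted for an unclosed file object.
    with open(path) as f:
        return f.read()
```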


def test_no_locks(self) -> None:
# No files locked.
self.replace_ok(0, 0, self.timeout)

def test_original_problem(self) -> None:
# Make sure we can reproduce the issue with our setup.
Member:
There doesn't appear to be any comment to tell us what "the issue" is. Maybe add a reference to the original GitHub issue?

src, dest = self.prepare_src_dest(self.short_lock, 0)
with self.assertRaises(PermissionError):
os.replace(src, dest)

def test_short_locks(self) -> None:
# Lock files for a time short enough that util.replace won't time out.
self.replace_ok(self.short_lock, 0, self.timeout)
self.replace_ok(0, self.short_lock, self.timeout)

def test_long_locks(self) -> None:
# Lock files for a time long enough that util.replace times out.
with self.assertRaises(PermissionError):
self.replace_ok(self.long_lock, 0, self.timeout)
with self.assertRaises(PermissionError):
self.replace_ok(0, self.long_lock, self.timeout)


if __name__ == '__main__':
main()
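The timed-lock helper these tests rely on can be exercised on its own (a sketch of the same mechanism; note that on POSIX an open handle does not block a rename, so outside Windows this only demonstrates the timing):

```python
import time
from threading import Thread
from typing import IO


def close_file_after(file: IO, delay: float) -> Thread:
    """Close file from a daemon thread after delay seconds."""
    def worker() -> None:
        time.sleep(delay)
        file.close()

    t = Thread(target=worker, daemon=True)
    t.start()
    return t
```

While the thread sleeps, the file stays open (and, on Windows, locked against replacement); joining the thread guarantees the handle is released.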
43 changes: 42 additions & 1 deletion mypy/util.py
@@ -2,8 +2,14 @@

import re
import subprocess
import os
import sys
import time
from xml.sax.saxutils import escape
- from typing import TypeVar, List, Tuple, Optional, Sequence, Dict
+ from typing import (
+     TypeVar, List, Tuple, Optional, Sequence, Dict, Callable, Iterable, Type, Union,
+     AnyStr, Generic
+ )


T = TypeVar('T')
@@ -134,3 +140,38 @@ def id(self, o: object) -> int:
self.id_map[o] = self.next_id
self.next_id += 1
return self.id_map[o]


if sys.version_info >= (3, 6):
PathType = Union[AnyStr, 'os.PathLike[AnyStr]']
else:
# Workaround to allow the use of generic in _replace
class _PathType(Generic[AnyStr]): ...
PathType = _PathType[AnyStr]


def _replace(src: PathType[AnyStr], dest: PathType[AnyStr], timeout: float = 1) -> None:
"""Replace dest with src using os.replace(src, dest).

Wait and retry if a PermissionError is raised.

Increase the wait time exponentially until the total wait reaches
timeout seconds; on timeout, give up and reraise the last exception seen.
"""
wait = 0.001
while True:
Contributor (author) @pkch, Apr 26, 2017:
@gvanrossum Moving the discussion here.

I could just make the time allowance in test much larger. That way it will pass even though this function guarantees timeout only up to a factor of 2x. But the test is already slow (adds 3-4 sec to the entire test suite) due to the file locks.

I was trying to make this test not affect total test runtime by adding another worker, since it's just sleeping for a while. But it didn't work yet (also, I fear having an extra worker might slow down tests a bit due to more process switching).

Member:
Let's just keep it simple rather than trying to work against xdist. If it's hard to figure out now I'm sure it will be hard to maintain in the future.

Just in case, my go-to xdist wizard is @gnprice.

try:
os.replace(src, dest)
return
except PermissionError:
# The actual total wait may be up to 2x the timeout parameter
if wait > timeout:
raise
time.sleep(wait)
wait *= 2


if sys.platform.startswith("win"):
replace = _replace
else:
replace = os.replace
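The "up to 2x" remark in the comment above follows from the geometric series of sleeps; a quick way to check the bound (a sketch mirroring _replace's constants, not part of the PR):

```python
def total_wait(timeout: float, first_wait: float = 0.001) -> float:
    """Sum the sleeps _replace would perform before giving up.

    The loop sleeps first_wait, 2*first_wait, 4*first_wait, ... and only
    stops once the next wait exceeds timeout, so the total is a geometric
    series bounded by roughly 2 * timeout.
    """
    total = 0.0
    wait = first_wait
    while wait <= timeout:
        total += wait
        wait *= 2
    return total
```

For the tests' timeout of 0.0009 * 2 ** 10 (about 0.92 s), the worst-case total sleep is about 1.02 s: above the nominal timeout, but under the 2x bound.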
1 change: 1 addition & 0 deletions runtests.py
@@ -215,6 +215,7 @@ def add_imports(driver: Driver) -> None:
'testdiff',
'testfinegrained',
'testmerge',
+ 'testutil',
]]

