Add several undocumented flags (--bazel, --package-root and --cache-map) in support of Bazel #4759
Merged
Changes from all commits (55 commits)
13a52e1
New --bazel flag
4618115
Implement behavior changes for --bazel flag.
d188e9f
Don't raise OSError for makedirs('')
9d32192
Use zero mtimes in validate_meta() if bazel
f1c7834
More zero mtime, and don't compare size -> always believe cache file …
aff2f72
Different tack for cache under Bazel: use --cache-dir but omit pyversion
8048ee4
Add --cache-map option
7bc3a0b
Pass cache mapping on command line
3632c55
Tweak --cache-map arg parsing
24de3d5
Drop the -i requirement for --bazel; it's not strictly needed
ec719df
Fix lint
fcc9604
Somehow mypy/api.py had DOS line endings. Fix this. (#4768)
c62a22e
Add docstring for relpath()
72d2657
Merge remote-tracking branch 'upstream/master' into bazel-flag
4dd90b8
Monumental effort to support packages without __init__.py for Bazel (…
c667dfc
Fix mypy errors
a43ecdc
Fix tests
e17a702
Fix flake8
0f4c8af
Merge master (except typeshed; resolved one conflict in main.py)
cd10259
Merge branch 'master' into bazel-flag
eade116
Make --package-root a separate flag
e7ffae4
Merge master
f19892a
Sync typeshed (same as master)
a0072ab
Add tests for --package-root and fix behavior
0096ee0
Add rudimentary tests for --cache-map and --bazel.
cf9f6e0
Merge master
268b74c
Fix two type errors found by
dc1f262
Replace relpath() with os.path.relpath() (D'oh)
e103d63
Use os.path.relpath() in --package-root normalization
80daeae
Break long line
b7fcc9e
Another attempt at fixing the Windows test failures
363f4b6
Merge master
2314b54
Another attempt at fixing --package-root on Windows
605b963
Merge master
a89e7a4
Make Windows tests pass by judicious use of os.path.normpath()
29f3f59
Merge master (some tests broken)
052fa08
Change the way FileSystemCache gets the package_root
fb24a14
Merge remote-tracking branch 'upstream/master' into bazel-flag
05a448d
Avoid mypy error (see https://github.com/python/typeshed/issues/2081)
8b88cd2
Merge master
aa5facd
Fix fscache.py bug: only generate fake __init__.py, not .pyi
755b540
Merge master 211a65470079f9080881db1e069a7577a44c1bb1
7884ed4
Avoid updating hash in cache meta
01f3893
Merge master (22ee7137a5f21b398b18908cca405d104bc060d4)
1f26bdc
Merge upstream/master (464b553a4cf1e8b8a2b78d6cf6ebb6597f13af60)
3d7f633
Respond to code review
7559fc8
Merge master (0447473de2c00b5429ed0072249ecb4f134e2f38)
210de9c
Don't force tree.path absolute in non-bazel -- it breaks Windows
15b7d87
Add annotations to new functions, to make self-test pass
4135e63
Hrmpf
6cbcb08
Merge master
52649f5
merge master (again)
3bd329e
Fix bad merge
389c908
Hopefully fix flaky tests (don't use the fscache in getmtime())
4665933
Merge master one last time
mypy/build.py
@@ -74,10 +74,6 @@
Graph = Dict[str, 'State']


def getmtime(name: str) -> int:
    return int(os.path.getmtime(name))


# TODO: Get rid of BuildResult. We might as well return a BuildManager.
class BuildResult:
    """The result of a successful build.

@@ -230,7 +226,12 @@ def compute_lib_path(sources: List[BuildSource],
    # to the lib_path
    # TODO: Don't do this in some cases; for motivation see see
    # https://github.com/python/mypy/issues/4195#issuecomment-341915031
    lib_path.appendleft(os.getcwd())
    if options.bazel:
        dir = '.'
    else:
        dir = os.getcwd()
    if dir not in lib_path:
        lib_path.appendleft(dir)

    # Prepend a config-defined mypy path.
    lib_path.extendleft(options.mypy_path)

@@ -687,6 +688,31 @@ def maybe_swap_for_shadow_path(self, path: str) -> str:
    def get_stat(self, path: str) -> os.stat_result:
        return self.fscache.stat(self.maybe_swap_for_shadow_path(path))

    def getmtime(self, path: str) -> int:
        """Return a file's mtime; but 0 in bazel mode.

        (Bazel's distributed cache doesn't like filesystem metadata to
        end up in output files.)
        """
        if self.options.bazel:
            return 0
        else:
            return int(os.path.getmtime(path))

    def normpath(self, path: str) -> str:
        """Convert path to absolute; but to relative in bazel mode.

        (Bazel's distributed cache doesn't like filesystem metadata to
        end up in output files.)
        """
        # TODO: Could we always use relpath? (A worry in non-bazel
        # mode would be that a moved file may change its full module
        # name without changing its size, mtime or hash.)
        if self.options.bazel:
            return os.path.relpath(path)
        else:
            return os.path.abspath(path)

    def all_imported_modules_in_file(self,
                                     file: MypyFile) -> List[Tuple[int, str, int]]:
        """Find all reachable import statements in a file.
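
(Aside, not part of the diff: a minimal sketch of what these two helpers do, using a plain is_bazel flag in place of self.options.bazel.)

    import os

    def cache_mtime(path: str, is_bazel: bool) -> int:
        # Bazel's distributed cache must not depend on local filesystem
        # metadata, so in bazel mode every mtime is reported as 0.
        return 0 if is_bazel else int(os.path.getmtime(path))

    def cache_path(path: str, is_bazel: bool) -> str:
        # Relative paths keep machine-specific prefixes out of cache files;
        # outside bazel mode absolute paths are used, as before.
        return os.path.relpath(path) if is_bazel else os.path.abspath(path)
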
@@ -1094,14 +1120,17 @@ def get_cache_names(id: str, path: str, manager: BuildManager) -> Tuple[str, str

    Args:
      id: module ID
      path: module path (used to recognize packages)
      path: module path
      cache_dir: cache directory
      pyversion: Python version (major, minor)

    Returns:
      A tuple with the file names to be used for the meta JSON, the
      data JSON, and the fine-grained deps JSON, respectively.
    """
    pair = manager.options.cache_map.get(path)
    if pair is not None:
        return (pair[0], pair[1], None)
    prefix = _cache_dir_prefix(manager, id)
    is_package = os.path.basename(path).startswith('__init__.py')
    if is_package:
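
(Aside, not part of the diff: the --cache-map lookup above amounts to a table from source path to pre-assigned meta/data JSON paths, consulted before the usual cache-directory naming. The names and layout below are hypothetical, for illustration only.)

    import os
    from typing import Dict, Optional, Tuple

    # Hypothetical mapping in the spirit of --cache-map:
    # source path -> (meta JSON path, data JSON path).
    cache_map = {
        'pkg/mod.py': ('out/pkg/mod.meta.json', 'out/pkg/mod.data.json'),
    }  # type: Dict[str, Tuple[str, str]]

    def cache_names(path: str, cache_dir: str) -> Tuple[str, str, Optional[str]]:
        pair = cache_map.get(path)
        if pair is not None:
            # An explicit mapping wins; no fine-grained deps file in that case.
            return (pair[0], pair[1], None)
        base = os.path.join(cache_dir, path)
        return (base + '.meta.json', base + '.data.json', base + '.deps.json')
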
@@ -1232,22 +1261,23 @@ def validate_meta(meta: Optional[CacheMeta], id: str, path: Optional[str],
        manager.log('Metadata abandoned for {}: errors were previously ignored'.format(id))
        return None

    bazel = manager.options.bazel
    assert path is not None, "Internal error: meta was provided without a path"
    # Check data_json; assume if its mtime matches it's good.
    # TODO: stat() errors
    data_mtime = getmtime(meta.data_json)
    data_mtime = manager.getmtime(meta.data_json)
    if data_mtime != meta.data_mtime:
        manager.log('Metadata abandoned for {}: data cache is modified'.format(id))
        return None
    deps_mtime = None
    if manager.options.cache_fine_grained:
        assert meta.deps_json
        deps_mtime = getmtime(meta.deps_json)
        deps_mtime = manager.getmtime(meta.deps_json)
        if deps_mtime != meta.deps_mtime:
            manager.log('Metadata abandoned for {}: deps cache is modified'.format(id))
            return None

    path = os.path.abspath(path)
    path = manager.normpath(path)
    try:
        st = manager.get_stat(path)
    except OSError:

@@ -1272,12 +1302,14 @@ def validate_meta(meta: Optional[CacheMeta], id: str, path: Optional[str],
    fine_grained_cache = manager.use_fine_grained_cache()

    size = st.st_size
    if size != meta.size and not fine_grained_cache:
    # Bazel ensures the cache is valid.
    if size != meta.size and not bazel and not fine_grained_cache:
        manager.log('Metadata abandoned for {}: file {} has different size'.format(id, path))
        return None

    mtime = int(st.st_mtime)
    if mtime != meta.mtime or path != meta.path:
    # Bazel ensures the cache is valid.
    mtime = 0 if bazel else int(st.st_mtime)
    if not bazel and (mtime != meta.mtime or path != meta.path):
        try:
            source_hash = manager.fscache.md5(path)
        except (OSError, UnicodeDecodeError, DecodeError):

@@ -1317,7 +1349,7 @@ def validate_meta(meta: Optional[CacheMeta], id: str, path: Optional[str],
                meta_str = json.dumps(meta_dict, indent=2, sort_keys=True)
            else:
                meta_str = json.dumps(meta_dict)
            meta_json, _, _2 = get_cache_names(id, path, manager)
            meta_json, _, _ = get_cache_names(id, path, manager)
            manager.log('Updating mtime for {}: file {}, meta {}, mtime {}'
                        .format(id, path, meta_json, meta.mtime))
            atomic_write(meta_json, meta_str, '\n')  # Ignore errors, it's just an optimization.

@@ -1373,12 +1405,20 @@ def write_cache(id: str, path: str, tree: MypyFile,
    corresponding to the metadata that was written (the latter may
    be None if the cache could not be written).
    """
    # Obtain file paths
    path = os.path.abspath(path)
    # For Bazel we use relative paths and zero mtimes.
    bazel = manager.options.bazel

    # Obtain file paths.
    path = manager.normpath(path)
    meta_json, data_json, deps_json = get_cache_names(id, path, manager)
    manager.log('Writing {} {} {} {} {}'.format(
        id, path, meta_json, data_json, deps_json))

    # Update tree.path so that in bazel mode it's made relative (since
    # sometimes paths leak out).
    if bazel:
        tree.path = path

    # Make sure directory for cache files exists
    parent = os.path.dirname(data_json)
    assert os.path.dirname(meta_json) == parent

@@ -1390,7 +1430,8 @@ def write_cache(id: str, path: str, tree: MypyFile,

    # Obtain and set up metadata
    try:
        os.makedirs(parent, exist_ok=True)
        if parent:
            os.makedirs(parent, exist_ok=True)
        st = manager.get_stat(path)
    except OSError as err:
        manager.log("Cannot get stat for {}: {}".format(path, err))

@@ -1405,10 +1446,11 @@ def write_cache(id: str, path: str, tree: MypyFile,
        return interface_hash, None

    # Write data cache file, if applicable
    # Note that for Bazel we don't record the data file's mtime.
    if old_interface_hash == interface_hash:
        # If the interface is unchanged, the cached data is guaranteed
        # to be equivalent, and we only need to update the metadata.
        data_mtime = getmtime(data_json)
        data_mtime = manager.getmtime(data_json)
        manager.trace("Interface for {} is unchanged".format(id))
    else:
        manager.trace("Interface for {} has changed".format(id))

@@ -1425,17 +1467,17 @@ def write_cache(id: str, path: str, tree: MypyFile,
            # Both have the effect of slowing down the next run a
            # little bit due to an out-of-date cache file.
            return interface_hash, None
        data_mtime = getmtime(data_json)
        data_mtime = manager.getmtime(data_json)

    deps_mtime = None
    if deps_json:
        deps_str = json_dumps(serialized_fine_grained_deps, manager.options.debug_cache)
        if not atomic_write(deps_json, deps_str, '\n'):
            manager.log("Error writing deps JSON file {}".format(deps_json))
            return interface_hash, None
        deps_mtime = getmtime(deps_json)
        deps_mtime = manager.getmtime(deps_json)

    mtime = int(st.st_mtime)
    mtime = 0 if bazel else int(st.st_mtime)
    size = st.st_size
    options = manager.options.clone_for_module(id)
    assert source_hash is not None

@@ -1475,7 +1517,7 @@ def delete_cache(id: str, path: str, manager: BuildManager) -> None:
    This avoids inconsistent states with cache files from different mypy runs,
    see #4043 for an example.
    """
    path = os.path.abspath(path)
    path = manager.normpath(path)
    cache_paths = get_cache_names(id, path, manager)
    manager.log('Deleting {} {} {}'.format(id, path, " ".join(x for x in cache_paths if x)))
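
(Aside, not part of the diff: the net effect of the build.py changes is that in bazel mode the recorded cache metadata is machine-independent (relative path, zero mtimes), so identical checkouts on different machines produce identical cache files that Bazel's remote cache can share. A rough sketch; the field names are illustrative, not the exact CacheMeta layout.)

    import os

    def meta_snapshot(path: str, size: int, is_bazel: bool) -> dict:
        # Relative path and zero mtime in bazel mode; absolute path and
        # real mtime otherwise (size and source hash are kept either way).
        return {
            'path': os.path.relpath(path) if is_bazel else os.path.abspath(path),
            'mtime': 0 if is_bazel else int(os.path.getmtime(path)),
            'size': size,
        }
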
mypy/fscache.py
@@ -32,13 +32,19 @@
import hashlib
import os
import stat
from typing import Dict, List, Tuple
from typing import Dict, List, Optional, Set, Tuple


class FileSystemCache:
    def __init__(self) -> None:
        # The package root is not flushed with the caches.
        # It is set by set_package_root() below.
        self.package_root = []  # type: List[str]
        self.flush()

    def set_package_root(self, package_root: List[str]) -> None:
        self.package_root = package_root

    def flush(self) -> None:
        """Start another transaction and empty all caches."""
        self.stat_cache = {}  # type: Dict[str, os.stat_result]

@@ -49,6 +55,7 @@ def flush(self) -> None:
        self.read_cache = {}  # type: Dict[str, bytes]
        self.read_error_cache = {}  # type: Dict[str, Exception]
        self.hash_cache = {}  # type: Dict[str, str]
        self.fake_package_cache = set()  # type: Set[str]

    def stat(self, path: str) -> os.stat_result:
        if path in self.stat_cache:
@@ -58,16 +65,100 @@ def stat(self, path: str) -> os.stat_result:
        try:
            st = os.stat(path)
        except OSError as err:
            if self.init_under_package_root(path):
                try:
                    return self._fake_init(path)
                except OSError:
                    pass
            # Take a copy to get rid of associated traceback and frame objects.
            # Just assigning to __traceback__ doesn't free them.
            self.stat_error_cache[path] = copy_os_error(err)
            raise err
        self.stat_cache[path] = st
        return st

    def init_under_package_root(self, path: str) -> bool:

[Review comment] Could use some documentation. I'm not sure what's the goal/what's going on here.

        """Is this path an __init__.py under a package root?

        This is used to detect packages that don't contain __init__.py
        files, which is needed to support Bazel. The function should
        only be called for non-existing files.

        It will return True if it refers to a __init__.py file that
        Bazel would create, so that at runtime Python would think the
        directory containing it is a package. For this to work you
        must pass one or more package roots using the --package-root
        flag.

        As an exceptional case, any directory that is a package root
        itself will not be considered to contain a __init__.py file.
        This is different from the rules Bazel itself applies, but is
        necessary for mypy to properly distinguish packages from other
        directories.

        See https://docs.bazel.build/versions/master/be/python.html,
        where this behavior is described under legacy_create_init.
        """
        if not self.package_root:
            return False
        dirname, basename = os.path.split(path)
        if basename != '__init__.py':
            return False
        try:
            st = self.stat(dirname)
        except OSError:
            return False
        else:
            if not stat.S_ISDIR(st.st_mode):
                return False
        ok = False
        drive, path = os.path.splitdrive(path)  # Ignore Windows drive name
        path = os.path.normpath(path)
        for root in self.package_root:
            if path.startswith(root):
                if path == root + basename:
                    # A package root itself is never a package.
                    ok = False
                    break
                else:
                    ok = True
        return ok

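(Aside, not part of the diff: a small illustration of the matching rule described in the docstring above. The helper and paths are made up; the real method also checks, via stat(), that the parent directory exists. Package roots are assumed to be stored with a trailing separator here.)

    import os
    from typing import List

    def is_fake_init(path: str, package_roots: List[str]) -> bool:
        # Path-based part of init_under_package_root(): a missing __init__.py
        # is faked only under a package root, never for the root itself.
        dirname, basename = os.path.split(path)
        if basename != '__init__.py':
            return False
        drive, path = os.path.splitdrive(path)
        path = os.path.normpath(path)
        for root in package_roots:
            if path.startswith(root) and path != root + basename:
                return True
        return False

    roots = ['srcs' + os.sep]  # e.g. from --package-root srcs/
    print(is_fake_init(os.path.join('srcs', 'pkg', '__init__.py'), roots))  # True
    print(is_fake_init(os.path.join('srcs', '__init__.py'), roots))         # False
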
    def _fake_init(self, path: str) -> os.stat_result:
        """Prime the cache with a fake __init__.py file.

        This makes code that looks for path believe an empty file by
        that name exists. Should only be called after
        init_under_package_root() returns True.
        """
        dirname, basename = os.path.split(path)
        assert basename == '__init__.py', path
        assert not os.path.exists(path), path  # Not cached!
        dirname = os.path.normpath(dirname)
        st = self.stat(dirname)  # May raise OSError
        # Get stat result as a sequence so we can modify it.
        # (Alas, typeshed's os.stat_result is not a sequence yet.)
        tpl = tuple(st)  # type: ignore
        seq = list(tpl)  # type: List[float]
        seq[stat.ST_MODE] = stat.S_IFREG | 0o444
        seq[stat.ST_INO] = 1
        seq[stat.ST_NLINK] = 1
        seq[stat.ST_SIZE] = 0
        tpl = tuple(seq)
        st = os.stat_result(tpl)
        self.stat_cache[path] = st
        # Make listdir() and read() also pretend this file exists.
        self.fake_package_cache.add(dirname)
        return st

    def listdir(self, path: str) -> List[str]:
        path = os.path.normpath(path)
        if path in self.listdir_cache:
            return self.listdir_cache[path]
            res = self.listdir_cache[path]
            # Check the fake cache.
            if path in self.fake_package_cache and '__init__.py' not in res:
                res.append('__init__.py')  # Updates the result as well as the cache
            return res
        if path in self.listdir_error_cache:
            raise copy_os_error(self.listdir_error_cache[path])
        try:

@@ -77,6 +168,9 @@ def listdir(self, path: str) -> List[str]:
            self.listdir_error_cache[path] = copy_os_error(err)
            raise err
        self.listdir_cache[path] = results
        # Check the fake cache.
        if path in self.fake_package_cache and '__init__.py' not in results:
            results.append('__init__.py')
        return results

    def isfile(self, path: str) -> bool:
@@ -133,12 +227,19 @@ def read(self, path: str) -> bytes:
        # earlier instant than the mtime reported by self.stat().
        self.stat(path)

        try:
            with open(path, 'rb') as f:
                data = f.read()
        except Exception as err:
            self.read_error_cache[path] = err
            raise
        dirname, basename = os.path.split(path)
        dirname = os.path.normpath(dirname)
        # Check the fake cache.
        if basename == '__init__.py' and dirname in self.fake_package_cache:
            data = b''
        else:
            try:
                with open(path, 'rb') as f:
                    data = f.read()
            except OSError as err:
                self.read_error_cache[path] = err
                raise

        md5hash = hashlib.md5(data).hexdigest()
        self.read_cache[path] = data
        self.hash_cache[path] = md5hash
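
(Aside, not part of the diff: a hedged usage sketch of the behavior these hunks add, assuming a hypothetical srcs/pkg/ directory that exists on disk but contains no __init__.py.)

    from mypy.fscache import FileSystemCache

    fscache = FileSystemCache()
    fscache.set_package_root(['srcs/'])

    # stat() primes a fake, read-only, zero-size __init__.py entry;
    # listdir() then reports it and read() returns empty bytes, so the
    # rest of mypy treats srcs/pkg/ as an ordinary package.
    st = fscache.stat('srcs/pkg/__init__.py')
    names = fscache.listdir('srcs/pkg')          # includes '__init__.py'
    data = fscache.read('srcs/pkg/__init__.py')  # b''
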
Agreed, such a comment would be useful