Skip to content

Accept a list of files, or whole packages, to typecheck #935

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
gnprice opened this issue Oct 15, 2015 · 9 comments
Closed

Accept a list of files, or whole packages, to typecheck #935

gnprice opened this issue Oct 15, 2015 · 9 comments
Labels

Comments

@gnprice
Copy link
Collaborator

gnprice commented Oct 15, 2015

I'd love to be able to hand mypy a whole library, or a program where it's not totally obvious what the entry point is (or if there is a single entry point), and have it typecheck the whole thing.

For example in mypy itself, to typecheck mypy/ with all the Python files that make it up. We used to have tests.py as a standin for this, but runtests.py is more complicated and mypy runtests.py doesn't actually check everything. I can use mypy -m mypy.main, but that's getting into details about how the code is laid out.

One interface for this would be to just list a bunch of files as positional arguments to mypy. More convenient in many cases would be to say "here's a directory and typecheck everything underneath it".

Given the mypy example, actually, where mypy/test/data/ contains Python files that aren't part of the program, maybe the right option to have there is "here's a Python package and typecheck everything that's part of it". So e.g.

$ mypy -p mypy

would look up mypy for import, see that it's a package module, go find all the other files next to its __init__.py and the directories that themselves contain __init__.py, and recurse. Because there's no mypy/test/data/__init__.py, or equivalently because you can't import mypy.test.data.fixtures.list, this wouldn't pull in mypy/test/data/fixtures/list.py, which is just as desired.

I think in my ideal version of this interface, you could pass multiple -m arguments, multiple -p arguments, and multiple positional filename arguments, and they'd all just go on the queue together.

(In principle I could simulate this by invoking mypy once for each file, but that's much slower because it repeats the work of typechecking dependencies.)

@jhance
Copy link
Collaborator

jhance commented Oct 15, 2015

We currently do our argument parsing by hand, it would probably be good to refactor the whole process after deciding on the UI we want.

I think that doing more than one -m at once is pretty much just adding more than one initial state... -p would be similar but generate a whole bunch of -m states by traversing trees.

@jhance
Copy link
Collaborator

jhance commented Oct 15, 2015

I have a prototype for allowing BuildManager to support multiple sources.

No idea if it actually works for multiple sources, but it accepts them, and doesn't fail on the one-source case.

@JukkaL JukkaL added the feature label Oct 16, 2015
@JukkaL
Copy link
Collaborator

JukkaL commented Oct 16, 2015

This would be really nice! I'm not totally sure about -p yet -- we could define mypy <dir> to only look for subdirectories with __init__.py files (but that might be confusing). Multiple -m options totally make sense and multiple files.

@gnprice
Copy link
Collaborator Author

gnprice commented Oct 16, 2015

Yeah, I thought about having a rule like that and figured it would be surprising to people -- generally if I pass a directory to a command, I expect the command to operate on the whole tree, like git grep, rsync, etc. So if I say mypy foo/ and it comes out clean, I'd be pretty surprised if I later find a type error in some file under foo/ and it wasn't checked because of some rule involving the presence of __init__.py files.

You could imagine also supporting passing a directory as a positional argument and meaning to check all the files in that tree. But this is likely to bite you in a different way: if you really have Python files under the tree that don't have a chain of __init__.py leading to them, then they can't be imported without an appropriate sys.path entry, and they probably depend on neighboring files in the same situation. So the check is likely to fail complaining it can't find some modules. Maybe that's OK, though, and certainly the small-project common case is that -p foo and a positional foo/ would be synonymous.

Suppose we accepted positional foo/, and defined it to mean "all *.py files in the subtree foo/". Would that work for mypy itself? Currently several files in mypy/test/data/fixtures/ fail to typecheck, but I don't really know the context of those files and whether it might be easy to fix them, or to move them out of mypy/.

@JukkaL
Copy link
Collaborator

JukkaL commented Oct 17, 2015

Some additional ideas:

  • Support .mypyignore files. If a directory has such as file, mypy would ignore it unless you explicitly point mypy to a file within it. That way a positional mypy/ could be made to do the right thing.
  • Rename -p to --recursive-package or something to make it more explicit what it does. We might want to reserve -p for things like -p /usr/bin/python3.4 (for example, to look up Python module search path and Python version dynamically).

@gnprice
Copy link
Collaborator Author

gnprice commented Oct 19, 2015

Sure, renaming -p to a long option would be fine -- I think the main use of it would be in scripts for CI anyway (so long as we also have positional foo/.) I'd call it just --package, as "recursive package" is redundant: Python "packages" can have packages inside them, and when module foo.bar.baz exists it is part of the package called foo. See the use of the term in https://docs.python.org/2/reference/simple_stmts.html#the-import-statement and https://docs.python.org/2/tutorial/modules.html#packages .

I'd rather have a solution that didn't involve people adding one more kind of special-purpose marker file to their trees. I think we can do

  • positional foo/ means "all *.py files in the subtree foo/", which works for casual cases so people don't need to learn anything;
  • --package foo means the files that are actually in the importable package foo, which takes care of subtler cases when necessary;
  • both of these and -m can be combined any number of times;
    and rely on __init__.py files to already describe what people are likely to want with .mypyignore. When people really need something more custom and ad-hoc, they can always just construct the list themselves and put it on the command line.

@JukkaL
Copy link
Collaborator

JukkaL commented Oct 21, 2015

I like that!

@alunduil
Copy link
Contributor

+1

@gvanrossum
Copy link
Member

We've had mypy --package dotted.package.name for a while now (since 11/7 it seems; the usage message just doesn't mention it yet). Maybe that's sufficient? It also looks fairly simple to add support for processing multiple files, but it's a little tricky to figure out what the module names should be -- currently when a single file is given the module name is always taken to be __main__, but you can't have multiple modules with the same name, and there are some different cases:

  • mypy /path/to/folder/*.py where folder is just a folder full of top-level files: the module names could just be the basename of each file with .py stripped
  • mypy package/subpackage/{module1,module2,module3}.py where we really want the module names to be package.subpackage.module1 etc.

We could distinguish these with some other flag, or with a heuristic based on the shape of the filenames or their relationship to the current directory or even to $PYTHONPATH, but it would be nice if both cases just worked without any help. Maybe it would work to crawl up the path until a folder without a init.py is found?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants