diff --git a/docs/philosophy.md b/docs/philosophy.md
index 9204ec599..bc9106317 100644
--- a/docs/philosophy.md
+++ b/docs/philosophy.md
@@ -1,12 +1,12 @@
 # pandas-stubs Type Checking Philosophy
 
 The goal of the pandas-stubs project is to provide type stubs for the public API
-that represent the recommended ways of using pandas. This is opposed to the
+that represent the recommended ways of using pandas. This is opposed to the
 philosophy within the pandas source, as described [here](https://pandas.pydata.org/docs/development/contributing_codebase.html?highlight=typing#type-hints),
 which is to assist with the development of the pandas source code to ensure
 type safety within that source.
 
-Due to the methodology used by Microsoft to develop the original stubs, there are internal
+Due to the methodology used by Microsoft to develop the original stubs, there are internal
 classes, methods and functions that are annotated within the pandas-stubs project
 that are incorrect with respect to the pandas source, but that have no effect on type
 checking user code that calls the public API.
@@ -27,12 +27,12 @@ s = pd.Series([1, 2, 3])
 lt = s < 3
 ```
 
-In the pandas source, `lt` is a `Series` with a `dtype` of `bool`. In the pandas-stubs,
-the type of `lt` is `Series[bool]`. This allows further type checking to occur in other
+In the pandas source, `lt` is a `Series` with a `dtype` of `bool`. In the pandas-stubs,
+the type of `lt` is `Series[bool]`. This allows further type checking to occur in other
 pandas methods. Note that in the above example, `s` is typed as `Series[Any]` because its
 type cannot be statically inferred.
 
-This also allows type checking for operations on series that contain date/time data. Consider
+This also allows type checking for operations on series that contain date/time data. Consider
 the following example that creates two series of datetimes with corresponding arithmetic.
 
 ```python
@@ -74,22 +74,22 @@ interval of `Timestamp`s.
 A set of (most likely incomplete) tests for testing the type stubs is in the pandas-stubs
 repository in the `tests` directory. The tests are used with `mypy` and `pyright`
 to validate correct typing, and also with `pytest` to validate that the provided code
-actually executes. The recent decision for Python 3.11 to include `assert_type()`,
+actually executes. The recent decision for Python 3.11 to include `assert_type()`,
 which is supported by `typing_extensions` version 4.2 and beyond makes it easier
-to test to validate the return types of functions and methods. Future work
+to test to validate the return types of functions and methods. Future work
 is intended to expand the use of `assert_type()` in the test code.
 
 ## Narrow vs. Wide Arguments
 
-A consideration in creating stubs is too make the set of type annotations for
+A consideration in creating stubs is to make the set of type annotations for
 function arguments "just right", i.e., not too narrow and not too wide. A type annotation
 to an argument to a function or method is too narrow if it disallows valid arguments.
 A type annotation to an argument to a function or method is too wide if
 it allows invalid arguments. Testing for type annotations that are too narrow is rather
-straightforward. It is easy to create an example for which the type checker indicates
+straightforward. It is easy to create an example for which the type checker indicates
 the argument is incorrect, and add it to the set of tests in the pandas-stubs
-repository after fixing the appropriate stub. However, testing for when type annotations
+repository after fixing the appropriate stub. However, testing for when type annotations
 are too wide is a bit more complicated.
 In this case, the test will fail when using `pytest`, but it is also desirable to
 have type checkers report errors for code that is expected to fail type checking.
@@ -108,9 +108,9 @@ Here is an example that illustrates this concept, from `tests/test_interval.py`:
 In this particular example, the stubs consider that `i1` will have the type
 `pd.Interval[pd.Timestamp]`. It is incorrect code to add a `Timestamp` to a
 time-based interval. Without the `if TYPE_CHECKING_INVALID_USAGE` construct, the
-code would fail at runtime. Further, type checkers should report an error for this
-incorrect code. By placing the `# type: ignore[operator] # pyright: ignore[reportGeneralTypeIssues]`
-on the line, type checkers are told to ignore the type error. To ensure that the
+code would fail at runtime. Further, type checkers should report an error for this
+incorrect code. By placing the `# type: ignore[operator] # pyright: ignore[reportGeneralTypeIssues]`
+on the line, type checkers are told to ignore the type error. To ensure that the
 pandas-stubs annotations are not too wide (allow adding a `Timestamp` to a
 time-based interval), mypy and pyright are configured to report unused ignore
 statements.
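
For readers of this patch, the two testing ideas described in the touched text (checking return types with `assert_type()` and guarding deliberately invalid code) can be illustrated with a minimal, pandas-free sketch. Everything here is an assumption made for illustration: the `Interval` class and both `check_*` functions are hypothetical stand-ins, not pandas or pandas-stubs code, and `TYPE_CHECKING_INVALID_USAGE` is defined locally on the assumption that it behaves like `typing.TYPE_CHECKING`; the definition used in the pandas-stubs test suite may differ.

```python
from typing import TYPE_CHECKING

try:
    from typing import assert_type  # available in Python 3.11+
except ImportError:
    try:
        from typing_extensions import assert_type  # typing_extensions 4.2+
    except ImportError:
        def assert_type(val, typ):
            # Runtime fallback so the sketch runs anywhere; returns the value unchanged.
            return val

# Assumption: a flag that is True only for type checkers, never at runtime.
TYPE_CHECKING_INVALID_USAGE = TYPE_CHECKING


class Interval:
    """Hypothetical stand-in for an interval type; not the pandas class."""

    def __init__(self, left: float, right: float) -> None:
        self.left = left
        self.right = right

    def __add__(self, other: float) -> "Interval":
        # Reject anything that is not a number, mimicking a runtime failure.
        if not isinstance(other, (int, float)):
            raise TypeError("can only shift an Interval by a number")
        return Interval(self.left + other, self.right + other)


def check_valid() -> Interval:
    # Valid usage: the type checker verifies the inferred type is Interval,
    # and pytest verifies that the expression actually executes.
    return assert_type(Interval(0.0, 1.0) + 2.0, Interval)


def check_invalid() -> None:
    # Invalid usage: skipped at runtime because the flag is False, but still
    # seen by the type checker. If the annotation on __add__ were too wide,
    # the ignore comment would be unused and could be reported as such.
    if TYPE_CHECKING_INVALID_USAGE:
        Interval(0.0, 1.0) + "x"  # type: ignore[operator]
```

In this sketch, widening `other: float` to `other: object` would make the `# type: ignore[operator]` comment unnecessary. mypy's `warn_unused_ignores` option and pyright's `reportUnnecessaryTypeIgnoreComment` setting are the kind of configuration that flags such comments; whether the project uses exactly these settings is not stated in the patch.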