Remove depends_on and produces markers. #551

Merged 11 commits on Jan 21, 2024
8 changes: 0 additions & 8 deletions docs/source/_static/md/markers.md
@@ -6,10 +6,6 @@ $ pytask markers
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Marker ┃ Description ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
-│ pytask.mark.depends_on │ Add dependencies to a task. See this │
-│ │ tutorial for more information: │
-│ │ <a href="https://bit.ly/3JlxylS">https://bit.ly/3JlxylS</a>. │
-│ │ │
│ pytask.mark.persist │ Prevent execution of a task if all │
│ │ products exist and even if something has │
│ │ changed (dependencies, source file, │
@@ -21,10 +17,6 @@ $ pytask markers
│ │ another run will skip the task with │
│ │ success. │
│ │ │
-│ pytask.mark.produces │ Add products to a task. See this │
-│ │ tutorial for more information: │
-│ │ <a href="https://bit.ly/3JlxylS">https://bit.ly/3JlxylS</a>. │
-│ │ │
│ pytask.mark.skip │ Skip a task and all its dependent tasks.│
│ │ │
│ pytask.mark.skip_ancestor_failed │ Internal decorator applied to tasks if │
4 changes: 3 additions & 1 deletion docs/source/changes.md
@@ -5,10 +5,12 @@ chronological order. Releases follow [semantic versioning](https://semver.org/)
releases are available on [PyPI](https://pypi.org/project/pytask) and
[Anaconda.org](https://anaconda.org/conda-forge/pytask).

-## 0.4.6
+## 0.5.0 - 2024-xx-xx

 - {pull}`548` fixes the type hints for {meth}`~pytask.Task.execute` and
   {meth}`~pytask.TaskWithoutPath.execute`. Thanks to {user}`Ostheer`.
+- {pull}`551` removes the deprecated `@pytask.mark.depends_on` and
+  `@pytask.mark.produces`.
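
For code that still relies on the removed markers, one possible migration is sketched below. It uses two of the interfaces documented in the how-to guide: a default argument for the dependency and the special `produces` argument for the product. The task name, the argument name `path_to_input`, and the file names are hypothetical and only serve the illustration.

```python
from pathlib import Path


def task_uppercase_text(
    # Dependency: any argument that is not marked as a product.
    path_to_input: Path = Path("input.txt"),
    # Product: the special argument name `produces`, no type annotation required.
    produces: Path = Path("output.txt"),
) -> None:
    """Read the dependency and write its upper-cased content to the product."""
    produces.write_text(path_to_input.read_text().upper())
```

Because the paths are ordinary default arguments, the function can also be called directly with other paths, which keeps it easy to test outside of pytask.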

## 0.4.5 - 2024-01-09

26 changes: 13 additions & 13 deletions docs/source/how_to_guides/interfaces_for_dependencies_products.md
@@ -16,12 +16,12 @@ In general, pytask regards everything as a task dependency if it is not marked a
product. Thus, you can also think of the following examples as how to inject values into
a task. When we talk about products later, the same interfaces will be used.

-| | `def task(arg: ... = ...)` | `Annotated[..., value]` | `@task(kwargs=...)` | `@pytask.mark.depends_on(...)` |
-| --------------------------------------- | :------------------------: | :---------------------: | :-----------------: | :----------------------------: |
-| Not deprecated | ✅ | ✅ | ✅ | ❌ |
-| No type annotations required | ✅ | ❌ | ✅ | ✅ |
-| Flexible choice of argument name | ✅ | ✅ | ✅ | ❌ |
-| Supports third-party functions as tasks | ❌ | ❌ | ✅ | ❌ |
+| | `def task(arg: ... = ...)` | `Annotated[..., value]` | `@task(kwargs=...)` |
+| --------------------------------------- | :------------------------: | :---------------------: | :-----------------: |
+| Not deprecated | ✅ | ✅ | ✅ |
+| No type annotations required | ✅ | ❌ | ✅ |
+| Flexible choice of argument name | ✅ | ✅ | ✅ |
+| Supports third-party functions as tasks | ❌ | ❌ | ✅ |
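
The first two columns of the remaining table could look like the following sketch. It assumes that pytask accepts a plain `pathlib.Path` as the value inside `Annotated[..., value]`; the task names and `input.txt` are hypothetical. The `@task(kwargs=...)` column is left out here because it needs pytask's own decorator.

```python
from pathlib import Path
from typing import Annotated  # Python 3.9+; older versions can use typing_extensions


def task_default_argument(path: Path = Path("input.txt")) -> None:
    """Dependency passed as a default argument (no annotation strictly needed)."""
    print(path.read_text())


def task_annotated_value(path: Annotated[Path, Path("input.txt")]) -> None:
    """Dependency stored in the annotation; pytask is assumed to inject the value."""
    print(path.read_text())
```

The second form has no default, so calling it by hand requires passing the path explicitly; under pytask, the value would come from the annotation's metadata.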

(default-argument)=

@@ -58,13 +58,13 @@ dictionary. It applies to dependencies and products alike.

## Products

-| | `def task(arg: Annotated[..., Product] = ...)` | `Annotated[..., value, Product]` | `produces` | `@task(produces=...)` | `def task() -> Annotated[..., value]` | `@pytask.mark.produces(...)` |
-| --------------------------------------------------------- | :--------------------------------------------: | :------------------------------: | :--------: | :-------------------: | :-----------------------------------: | :--------------------------: |
-| Not deprecated | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
-| No type annotations required | ❌ | ❌ | ✅ | ✅ | ❌ | ✅ |
-| Flexible choice of argument name | ✅ | ✅ | ❌ | ✅ | ➖ | ❌ |
-| Supports third-party functions as tasks | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
-| Allows to pass custom node while preserving type of value | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| | `def task(arg: Annotated[..., Product] = ...)` | `Annotated[..., value, Product]` | `produces` | `@task(produces=...)` | `def task() -> Annotated[..., value]` |
+| --------------------------------------------------------- | :--------------------------------------------: | :------------------------------: | :--------: | :-------------------: | :-----------------------------------: |
+| Not deprecated | ✅ | ✅ | ✅ | ✅ | ✅ |
+| No type annotations required | ❌ | ❌ | ✅ | ✅ | ❌ |
+| Flexible choice of argument name | ✅ | ✅ | ❌ | ✅ | ➖ |
+| Supports third-party functions as tasks | ❌ | ❌ | ❌ | ✅ | ❌ |
+| Allows to pass custom node while preserving type of value | ❌ | ✅ | ✅ | ✅ | ✅ |
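
Two of the remaining product interfaces are sketched below: the `Product` annotation with a default argument, and the return annotation. The task names and file names are hypothetical. The fallback for `Product` exists only so that the sketch runs without pytask installed and is not pytask's real object. The comment on the return annotation reflects an assumption about how pytask stores returned strings.

```python
from pathlib import Path
from typing import Annotated  # Python 3.9+

try:
    from pytask import Product
except ImportError:
    # Stand-in so the sketch runs without pytask; not the real marker object.
    Product = object()


def task_with_product_annotation(
    path: Annotated[Path, Product] = Path("out_a.txt"),
) -> None:
    """Product marked via `Product`; the argument name can be chosen freely."""
    path.write_text("a")


def task_with_return_annotation() -> Annotated[str, Path("out_b.txt")]:
    """Product declared in the return annotation; pytask is assumed to write the
    returned string to the annotated path."""
    return "b"
```

The return-annotation form keeps the function free of any path handling, which is why the table marks the choice of argument name as not applicable for it.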

### `Product` annotation

26 changes: 2 additions & 24 deletions docs/source/reference_guides/api.md
@@ -63,36 +63,16 @@ The remaining exceptions convey specific errors.

## Marks

-pytask uses marks to attach additional information to task functions which is processed
-by the host or by plugins. The following marks are available by default.
+pytask uses marks to attach additional information to task functions that the host or
+plugins process. The following marks are available by default.

### Built-in marks

```{eval-rst}
-.. function:: pytask.mark.depends_on(objects: Any | Iterable[Any] | dict[Any, Any])
-
-Specify dependencies for a task.
-
-:type objects: Any | Iterable[Any] | dict[Any, Any]
-:param objects:
-Can be any valid Python object or an iterable of any Python objects. To be
-valid, it must be parsed by some hook implementation for the
-:func:`_pytask.hookspecs.pytask_collect_node` entry-point.

.. function:: pytask.mark.persist()

A marker for a task which should be persisted.

-.. function:: pytask.mark.produces(objects: Any | Iterable[Any] | dict[Any, Any])
-
-Specify products of a task.
-
-:type objects: Any | Iterable[Any] | dict[Any, Any]
-:param objects:
-Can be any valid Python object or an iterable of any Python objects. To be
-valid, it must be parsed by some hook implementation for the
-:func:`_pytask.hookspecs.pytask_collect_node` entry-point.

.. function:: pytask.mark.skipif(condition: bool, *, reason: str)

Skip a task based on a condition and provide a necessary reason.
@@ -251,10 +231,8 @@ Nodes are the interface for different kinds of dependencies or products.
To parse dependencies and products from nodes, use the following functions.

```{eval-rst}
-.. autofunction:: pytask.depends_on
 .. autofunction:: pytask.parse_dependencies_from_task_function
 .. autofunction:: pytask.parse_products_from_task_function
-.. autofunction:: pytask.produces
```

## Tasks
2 changes: 1 addition & 1 deletion docs/source/tutorials/configuration.md
@@ -21,7 +21,7 @@ configuration file.

```toml
[tool.pytask.ini_options]
-paths = "src"
+paths = ["src"]
```

## The location
175 changes: 1 addition & 174 deletions docs/source/tutorials/defining_dependencies_products.md
@@ -11,11 +11,6 @@ You find a tutorial on type hints {doc}`here <../type_hints>`.

If you want to avoid type annotations for now, look at the tab named `produces`.

-```{warning}
-The `Decorators` tab documents the deprecated approach that should not be used anymore
-and will be removed in version v0.5.
-```

```{seealso}
In this tutorial, we only deal with local files. If you want to use pytask with files
online, S3, GCP, Azure, etc., read the
@@ -89,26 +84,6 @@ passed to this argument is automatically treated as a task product. Here, we pas
path as the default argument.

````

-````{tab-item} Decorators
-:sync: decorators
-
-```{warning}
-This approach is deprecated and will be removed in v0.5
-```
-
-```{literalinclude} ../../../docs_src/tutorials/defining_dependencies_products_products_decorators.py
-:emphasize-lines: 9, 10
-```
-
-The {func}`@pytask.mark.produces <pytask.mark.produces>` marker attaches a product to a
-task. After the task has finished, pytask will check whether the file exists.
-
-Add `produces` as an argument of the task function to get access to the same path inside
-the task function.
-
-````

`````

```{tip}
@@ -170,24 +145,6 @@ pytask assumes that all function arguments that are not passed to the argument
:emphasize-lines: 9
```

````

-````{tab-item} Decorators
-:sync: decorators
-
-```{warning}
-This approach is deprecated and will be removed in v0.5
-```
-
-Equivalent to products, you can use the
-{func}`@pytask.mark.depends_on <pytask.mark.depends_on>` decorator to specify that
-`data.pkl` is a dependency of the task. Use `depends_on` as a function argument to
-access the dependency path inside the function and load the data.
-
-```{literalinclude} ../../../docs_src/tutorials/defining_dependencies_products_dependencies_decorators.py
-:emphasize-lines: 9, 11
-```
-
-````
`````

@@ -228,25 +185,6 @@ are assumed to point to a location relative to the task module.
:emphasize-lines: 4
```

````

-````{tab-item} Decorators
-:sync: decorators
-
-```{warning}
-This approach is deprecated and will be removed in v0.5
-```
-
-You can also use absolute and relative paths as strings that obey the same rules as the
-{class}`pathlib.Path`.
-
-```{literalinclude} ../../../docs_src/tutorials/defining_dependencies_products_relative_decorators.py
-:emphasize-lines: 6
-```
-
-If you use `depends_on` or `produces` as arguments for the task function, you will have
-access to the paths of the targets as {class}`pathlib.Path`.
-
-````
`````

@@ -286,7 +224,7 @@ structures if needed.

````

-````{tab-item} prodouces
+````{tab-item} produces
:sync: produces

If your task has multiple products, group them in one container like a dictionary
@@ -300,117 +238,6 @@ You can do the same with dependencies.
```{literalinclude} ../../../docs_src/tutorials/defining_dependencies_products_multiple2_produces.py
```
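
Grouping several products in one dictionary, as this tab describes, might look like the sketch below. The task name, the labels, and the file names are hypothetical; a real project would likely point the paths into a build directory.

```python
from pathlib import Path

# Hypothetical product paths, labeled so the task can refer to them by name.
_PRODUCTS = {"first": Path("data_0.txt"), "second": Path("data_1.txt")}


def task_create_two_files(produces: dict[str, Path] = _PRODUCTS) -> None:
    """Write one file per entry of the products dictionary."""
    for label, path in produces.items():
        path.write_text(label)
```

Labels make the task body independent of the order in which the products are listed.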

````

-````{tab-item} Decorators
-:sync: decorators
-
-```{warning}
-This approach is deprecated and will be removed in v0.5
-```
-
-The easiest way to attach multiple dependencies or products to a task is to pass a
-{class}`dict` (highly recommended), {class}`list`, or another iterator to the marker
-containing the paths.
-
-To assign labels to dependencies or products, pass a dictionary. For example,
-
-```python
-from typing import Dict
-
-
-@pytask.mark.produces({"first": BLD / "data_0.pkl", "second": BLD / "data_1.pkl"})
-def task_create_random_data(produces: Dict[str, Path]) -> None:
-    ...
-```
-
-Then, use `produces` inside the task function.
-
-```pycon
->>> produces["first"]
-BLD / "data_0.pkl"
-
->>> produces["second"]
-BLD / "data_1.pkl"
-```
-
-You can also use lists and other iterables.
-
-```python
-@pytask.mark.produces([BLD / "data_0.pkl", BLD / "data_1.pkl"])
-def task_create_random_data(produces):
-    ...
-```
-
-Inside the function, the arguments `depends_on` or `produces` become a dictionary where
-keys are the positions in the list.
-
-```pycon
->>> produces
-{0: BLD / "data_0.pkl", 1: BLD / "data_1.pkl"}
-```
-
-Why does pytask recommend dictionaries and convert lists, tuples, or other
-iterators to dictionaries? First, dictionaries with positions as keys behave very
-similarly to lists.
-
-Secondly, dictionary keys are more descriptive and do not assume a fixed
-ordering. Both attributes are especially desirable in complex projects.

-**Multiple decorators**
-
-pytask merges multiple decorators of one kind into a single dictionary. This might help
-you to group dependencies and apply them to multiple tasks.
-
-```python
-common_dependencies = pytask.mark.depends_on(
-    {"first_text": "text_1.txt", "second_text": "text_2.txt"}
-)
-
-
-@common_dependencies
-@pytask.mark.depends_on("text_3.txt")
-def task_example(depends_on):
-    ...
-```
-
-Inside the task, `depends_on` will be
-
-```pycon
->>> depends_on
-{"first_text": ... / "text_1.txt", "second_text": "text_2.txt", 0: "text_3.txt"}
-```

-**Nested dependencies and products**
-
-Dependencies and products can be nested containers consisting of tuples, lists, and
-dictionaries. It is beneficial if you want more structure and nesting.
-
-Here is an example of a task that fits some model on data. It depends on a module
-containing the code for the model, which is not actively used but ensures that the task
-is rerun when the model is changed. And it depends on the data.
-
-```python
-@pytask.mark.depends_on(
-    {
-        "model": [SRC / "models" / "model.py"],
-        "data": {"a": SRC / "data" / "a.pkl", "b": SRC / "data" / "b.pkl"},
-    }
-)
-@pytask.mark.produces(BLD / "models" / "fitted_model.pkl")
-def task_fit_model(depends_on, produces):
-    ...
-```
-
-`depends_on` within the function will be
-
-```python
-{
-    "model": [SRC / "models" / "model.py"],
-    "data": {"a": SRC / "data" / "a.pkl", "b": SRC / "data" / "b.pkl"},
-}
-```
-
-````
`````
