diff --git a/README.rst b/README.rst index 383a72fc..6081d52d 100644 --- a/README.rst +++ b/README.rst @@ -66,16 +66,16 @@ projects. Its features include: `_ known from pytest. -- **Easily extensible with plugins**. pytask's architecture is based on `pluggy - `_, a plugin management framework, so that - you can adjust pytask to your needs. Plugins are available for `parallelization +- **Easily extensible with plugins**. pytask is built on top of `pluggy + `_, a plugin management framework, which + allows you to adjust pytask to your needs. Plugins are available for `parallelization `_, `LaTeX `_, `R `_, and `Stata - `_ and `many more - `_. Read `here + `_ and more can be found `here + `_. Read in `this tutorial `_ how - you can use plugins. + to use and create plugins. .. end-features diff --git a/docs/source/_static/images/how-to-capture-output.png b/docs/source/_static/images/how-to-capture-output.png new file mode 100644 index 00000000..bc04e7d4 Binary files /dev/null and b/docs/source/_static/images/how-to-capture-output.png differ diff --git a/docs/source/_static/images/how-to-debug-pdb.png b/docs/source/_static/images/how-to-debug-pdb.png new file mode 100644 index 00000000..bb91aa01 Binary files /dev/null and b/docs/source/_static/images/how-to-debug-pdb.png differ diff --git a/docs/source/_static/images/how-to-debug-show-locals.png b/docs/source/_static/images/how-to-debug-show-locals.png new file mode 100644 index 00000000..b66bfe57 Binary files /dev/null and b/docs/source/_static/images/how-to-debug-show-locals.png differ diff --git a/docs/source/_static/images/how-to-debug-trace.png b/docs/source/_static/images/how-to-debug-trace.png new file mode 100644 index 00000000..0f034408 Binary files /dev/null and b/docs/source/_static/images/how-to-debug-trace.png differ diff --git a/docs/source/_static/images/how-to-write-a-task.png b/docs/source/_static/images/how-to-write-a-task.png new file mode 100644 index 00000000..122fa5a2 Binary files /dev/null and 
b/docs/source/_static/images/how-to-write-a-task.png differ diff --git a/docs/source/_static/images/persist-executed.png b/docs/source/_static/images/persist-executed.png new file mode 100644 index 00000000..fc0274d0 Binary files /dev/null and b/docs/source/_static/images/persist-executed.png differ diff --git a/docs/source/_static/images/persist-persisted.png b/docs/source/_static/images/persist-persisted.png new file mode 100644 index 00000000..1c342433 Binary files /dev/null and b/docs/source/_static/images/persist-persisted.png differ diff --git a/docs/source/_static/images/persist-skipped-successfully.png b/docs/source/_static/images/persist-skipped-successfully.png new file mode 100644 index 00000000..e1400f17 Binary files /dev/null and b/docs/source/_static/images/persist-skipped-successfully.png differ diff --git a/docs/source/_static/images/pytask-collect-nodes.png b/docs/source/_static/images/pytask-collect-nodes.png new file mode 100644 index 00000000..058f7772 Binary files /dev/null and b/docs/source/_static/images/pytask-collect-nodes.png differ diff --git a/docs/source/_static/images/pytask-collect.png b/docs/source/_static/images/pytask-collect.png new file mode 100644 index 00000000..b72ee54d Binary files /dev/null and b/docs/source/_static/images/pytask-collect.png differ diff --git a/docs/source/_static/images/pytask-profile.png b/docs/source/_static/images/pytask-profile.png new file mode 100644 index 00000000..72508fad Binary files /dev/null and b/docs/source/_static/images/pytask-profile.png differ diff --git a/docs/source/changes.rst b/docs/source/changes.rst index a3672e78..f79b2fe1 100644 --- a/docs/source/changes.rst +++ b/docs/source/changes.rst @@ -11,6 +11,7 @@ all releases are available on `PyPI `_ and ------------------ - :gh:`191` adds a guide on how to profile pytask to the developer's guide. +- :gh:`193` adds more figures to the documentation. 
0.1.5 - 2022-01-10 diff --git a/docs/source/how_to_guides/how_to_write_a_plugin.rst b/docs/source/how_to_guides/how_to_write_a_plugin.rst index fd75d832..7348c9ee 100644 --- a/docs/source/how_to_guides/how_to_write_a_plugin.rst +++ b/docs/source/how_to_guides/how_to_write_a_plugin.rst @@ -16,8 +16,8 @@ steps. - Check whether there exist plugins which offer similar functionality. For example, many plugins provide convenient interfaces to run another program with inputs via the command line. Naturally, there is a lot of overlap in the structure of the program and - even the the test battery. Finding the right plugin as a template may save you a lot - of time. + even the test battery. Finding the right plugin as a template may save you a lot of + time. - Make a list of hooks you want to implement. Think about how this plugin relates to functionality defined in pytask and other plugins. Maybe skim the documentation on @@ -38,25 +38,38 @@ This section explains some steps which are required for all plugins. Set up the setuptools entry-point ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -pytask discovers plugins via ``setuptools`` entry-points. This is specified in -``setup.py``. See the following example. +pytask discovers plugins via ``setuptools`` entry-points. Following the approach +advocated for by `setuptools_scm `_, the +entry-point is specified in ``setup.cfg``. -.. code-block:: python +.. code-block:: cfg + + # Content of setup.cfg + + [metadata] + name = pytask-plugin + + [options.packages.find] + where = src + + [options.entry_points] + pytask = + pytask_plugin = pytask_plugin.plugin + +For ``setuptools_scm`` you also need a ``pyproject.toml`` with the following content. + +.. 
code-block:: toml - # Content of setup.py + # Content of pyproject.toml - from setuptools import find_packages - from setuptools import setup + [build-system] + requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.0"] - setup( - name="pytask-plugin", - version="0.0.1", - entry_points={"pytask": ["pytask_plugin = pytask_plugin.plugin"]}, - # PyPI classifier for pytask plugins - classifiers=["Framework :: pytask"], - ) + [tool.setuptools_scm] + write_to = "src/pytask_plugin/_version.py" -For an example with ``setuptools_scm`` and ``setup.cfg`` see the `pytask-parallel repo +For a complete example with ``setuptools_scm`` and ``setup.cfg`` see the +`pytask-parallel repo `_. The entry-point for pytask is called ``"pytask"`` and points to a module called diff --git a/docs/source/tutorials/how_to_capture_output.rst b/docs/source/tutorials/how_to_capture_output.rst index 54e18e49..bc5f52c7 100644 --- a/docs/source/tutorials/how_to_capture_output.rst +++ b/docs/source/tutorials/how_to_capture_output.rst @@ -46,11 +46,11 @@ You can influence output capturing mechanisms from the command line: .. code-block:: console - pytask -s # disable all capturing - pytask --capture=sys # replace sys.stdout/stderr with in-mem files - pytask --capture=fd # also point filedescriptors 1 and 2 to temp file - pytask --capture=tee-sys # combines 'sys' and '-s', capturing sys.stdout/stderr - # and passing it along to the actual sys.stdout/stderr + $ pytask -s # disable all capturing + $ pytask --capture=sys # replace sys.stdout/stderr with in-mem files + $ pytask --capture=fd # also point filedescriptors 1 and 2 to temp file + $ pytask --capture=tee-sys # combines 'sys' and '-s', capturing sys.stdout/stderr + # and passing it along to the actual sys.stdout/stderr Using print statements for debugging @@ -75,24 +75,4 @@ print statements for debugging: and running this module will show you precisely the output of the failing function and hide the other one: -.. 
code-block:: console - - $ pytask -s - ========================= Start pytask session ========================= - Platform: win32 -- Python 3.x.x, pytask 0.x.x, pluggy 0.13.x - Root: . - Collected 2 task(s). - - F. - =============================== Failures =============================== - _________________ Task task_capture.py::task_func2 failed ______________ - - Traceback (most recent call last): - File "task_capture.py", line 7, in task_func2 - assert False - AssertionError - - ---------------------- Captured stdout during call --------------------- - Debug statement. - - ==================== 1 succeeded, 1 failed in 0.01s ==================== +.. image:: /_static/images/how-to-capture-output.png diff --git a/docs/source/tutorials/how_to_collect_tasks.rst b/docs/source/tutorials/how_to_collect_tasks.rst index 07a87164..48a1abf0 100644 --- a/docs/source/tutorials/how_to_collect_tasks.rst +++ b/docs/source/tutorials/how_to_collect_tasks.rst @@ -21,36 +21,12 @@ For example, let us take the following task Now, running :program:`pytask collect` will produce the following output. -.. code-block:: console - - $ pytask collect - ========================= Start pytask session ========================= - Platform: linux -- Python 3.x.y, pytask 0.x.y, pluggy 0.x.y - Root: xxx - Collected 1 task(s). - - - - - ======================================================================== +.. image:: /_static/images/pytask-collect.png If you want to have more information regarding dependencies and products of the task, append the ``--nodes`` flag. -.. code-block:: console - - $ pytask collect - ========================= Start pytask session ========================= - Platform: linux -- Python 3.x.y, pytask 0.x.y, pluggy 0.x.y - Root: xxx - Collected 1 task(s). - - - - - - - ======================================================================== +.. 
image:: /_static/images/pytask-collect-nodes.png To restrict the set of tasks you are looking at, use markers, expression and ignore patterns as usual. @@ -59,4 +35,7 @@ patterns as usual. Further reading --------------- -- :program:`pytask collect` in :doc:`../reference_guides/command_line_interface`. +- The documentation on the command line interface of :program:`pytask collect` can be + found :doc:`here <../reference_guides/command_line_interface>`. +- Read :doc:`here ` about selecting tasks. +- Paths can be ignored with :confval:`ignore`. diff --git a/docs/source/tutorials/how_to_debug.rst b/docs/source/tutorials/how_to_debug.rst index a2f229e3..290a3b87 100644 --- a/docs/source/tutorials/how_to_debug.rst +++ b/docs/source/tutorials/how_to_debug.rst @@ -12,24 +12,19 @@ Tracebacks ---------- You can enrich the display of tracebacks by showing local variables in each stack frame. -Just execute pytask with +Just execute pytask with :confval:`show_locals`, meaning ``pytask --show-locals``. -.. code-block:: console - - $ pytask --show-locals +.. image:: /_static/images/how-to-debug-show-locals.png Debugging --------- -Running - -.. code-block:: console +Using :confval:`pdb` enables the post-mortem debugger. Whenever an exception is raised +inside a task, the prompt will enter the debugger enabling you to find out the cause of +the exception. - $ pytask --pdb - -enables the post-mortem debugger. Whenever an exception is raised inside a task, the -prompt will enter the debugger enabling you to discover the source of the exception. +.. image:: /_static/images/how-to-debug-pdb.png .. seealso:: @@ -46,11 +41,9 @@ prompt will enter the debugger enabling you to discover the source of the except Tracing ------- -If you want to enter the debugger at the start of every task, use - -.. code-block:: console +If you want to enter the debugger at the start of every task, use :confval:`trace`. - $ pytask --trace +.. 
image:: /_static/images/how-to-debug-trace.png Custom debugger diff --git a/docs/source/tutorials/how_to_make_tasks_persist.rst b/docs/source/tutorials/how_to_make_tasks_persist.rst index aaf7c580..5e32aaaa 100644 --- a/docs/source/tutorials/how_to_make_tasks_persist.rst +++ b/docs/source/tutorials/how_to_make_tasks_persist.rst @@ -1,16 +1,23 @@ How to make tasks persist ========================= -Sometimes you want to skip the execution. It means that if all dependencies -and products exist, the task will not be executed even though a dependency, the task's -source file or a product has changed. Instead, the state of the dependencies, the source -file and the products is updated in the database such that the next execution will skip -the task successfully. +Sometimes you want to skip the execution of a task and pretend like nothing has changed. + +A common scenario is that you have a long running task which will be executed again if +you would format the task's source file with `black `_. + +In this case, you can apply the ``@pytask.mark.persist`` decorator to the task which +will skip its execution as long as all products exist. + +Internally, the state of the dependencies, the source file and the products is updated +in the database such that the next execution will skip the task successfully. + When is this useful? -------------------- -- You ran a formatter like Black on the files in your project. +- You ran a formatter like Black on the files in your project and want to prevent the + longest running tasks from being rerun. - You extend a parametrization, but do not want to rerun all tasks. @@ -30,12 +37,11 @@ How to do it? To create a persisting task, apply the correct decorator and, et voilà, it is done. -Let us take the second scenario as an example. First, we define the tasks, the -dependency and the product and save everything in the same folder. +To see the whole process, first, we create some task and its dependency. .. 
code-block:: python - # Content of task_file.py + # Content of task_module.py import pytask @@ -47,45 +53,25 @@ dependency and the product and save everything in the same folder. produces.write_text("**" + depends_on.read_text() + "**") -.. code-block:: +.. code-block:: md - # Content of input.md. Do not copy this line. + Here is the text. +If you execute the task with pytask, the task will be executed since the product is +missing. -.. code-block:: - - # Content of output.md. Do not copy this line. - - **Here is the text.** - - -If you run pytask in this folder, you get the following output. - -.. code-block:: console - - $ pytask demo - ========================= Start pytask session ========================= - Platform: win32 -- Python 3.8.5, pytask 0.0.6, pluggy 0.13.1 - Root: xxx/demo - Collected 1 task(s). - - p - ====================== 1 succeeded in 1 second(s) ====================== - -The green p signals that the task persisted. Another execution will show the following. +.. image:: /_static/images/persist-executed.png -.. code-block:: console +After that, we change the source file of the task accidentally by formatting the file +with black. Without the ``@pytask.mark.persist`` decorator the task would run again +since it has changed. With the decorator, the execution is skipped which is signaled by +a green p. - $ pytask demo - ========================= Start pytask session ========================= - Platform: win32 -- Python 3.8.5, pytask 0.0.6, pluggy 0.13.1 - Root: xxx/demo - Collected 1 task(s). +.. image:: /_static/images/persist-persisted.png - s - ====================== 1 succeeded in 1 second(s) ====================== +If we now run the task again, it is skipped because nothing has changed and not because +it is marked with ``@pytask.mark.persist``. -Now, the task is skipped successfully because nothing has changed compared to the -previous run. +.. 
image:: /_static/images/persist-skipped-successfully.png diff --git a/docs/source/tutorials/how_to_profile_tasks.rst b/docs/source/tutorials/how_to_profile_tasks.rst index f7ef033e..bc2bd9d1 100644 --- a/docs/source/tutorials/how_to_profile_tasks.rst +++ b/docs/source/tutorials/how_to_profile_tasks.rst @@ -7,3 +7,7 @@ display the information, enter .. code-block:: console $ pytask profile + +Here is an example + +.. image:: /_static/images/pytask-profile.png diff --git a/docs/source/tutorials/how_to_use_plugins.rst b/docs/source/tutorials/how_to_use_plugins.rst index 20b04452..e449bfee 100644 --- a/docs/source/tutorials/how_to_use_plugins.rst +++ b/docs/source/tutorials/how_to_use_plugins.rst @@ -26,8 +26,8 @@ Plugins can be found in many places. How to use plugins ------------------ -To use a plugin, simply follow the installation instructions. A plugin will enable -itself by using pytask's entry-point. +To use a plugin, simply follow the installation instructions and the accompanying +documentation. A plugin will usually enable itself by using pytask's entry-point. How to implement your own plugin diff --git a/docs/source/tutorials/how_to_write_a_task.rst b/docs/source/tutorials/how_to_write_a_task.rst index d1b211f5..938cd58c 100644 --- a/docs/source/tutorials/how_to_write_a_task.rst +++ b/docs/source/tutorials/how_to_write_a_task.rst @@ -4,9 +4,9 @@ How to write a task Starting from the project structure in the :doc:`previous tutorial `, this tutorial teaches you how to write your first task. -The task will be defined in ``src/task_data_preparation.py`` and it will generate -artificial data which will be stored in ``bld/data.pkl``. We will call the function in -the module :func:`task_create_random_data`. +The task will be defined in ``src/my_project/task_data_preparation.py`` and it will +generate artificial data which will be stored in ``bld/data.pkl``. We will call the +function in the module :func:`task_create_random_data`. .. 
code-block:: @@ -33,7 +33,7 @@ Here, we define the function import pytask import numpy as np - import pandas as np + import pandas as pd from my_project.config import BLD @@ -57,31 +57,15 @@ To let pytask track the product of the task, you need to use the You learn more about adding dependencies and products to a task in the next :doc:`tutorial `. -To execute the task, type the following command in your shell. +Now, execute pytask which will automatically collect tasks in the current directory and +subsequent directories. -.. code-block:: console - - $ pytask task_data_preparation.py - ========================= Start pytask session ========================= - Platform: linux -- Python 3.x.y, pytask 0.x.y, pluggy 0.x.y - Root: xxx - Collected 1 task(s). - - . - ======================= 1 succeeded in 1 second ======================== - -Executing - -.. code-block:: console - - $ pytask - -would collect all tasks in the current working directory and in all subsequent folders. +.. image:: /_static/images/how-to-write-a-task.png .. important:: - By default, pytask assumes that tasks are functions in modules whose names are both - prefixed with ``task_``. + By default, pytask assumes that tasks are functions and both, the function name and + the module name, must be prefixed with ``task_``. Use the configuration value :confval:`task_files` if you prefer a different naming scheme for the task modules. diff --git a/src/_pytask/profile.py b/src/_pytask/profile.py index 8091b068..643607c2 100644 --- a/src/_pytask/profile.py +++ b/src/_pytask/profile.py @@ -233,7 +233,7 @@ def pytask_profile_add_info_on_task( def _to_human_readable_size(bytes_: int, units: Optional[List[str]] = None) -> str: """Convert bytes to a human readable size.""" - units = [" bytes", "KB", "MB", "GB", "TB"] if units is None else units + units = [" bytes", " KB", " MB", " GB", " TB"] if units is None else units return ( str(bytes_) + units[0] if bytes_ < 1024