Commit 938b600

Merge pull request #7 from hpcflow/feat/beginners-tutorial
Beginners tutorial
2 parents 7f44dfe + 50da06e commit 938b600

12 files changed

+362
-0
lines changed

docs/source/conf.py

Lines changed: 3 additions & 0 deletions
@@ -203,6 +203,7 @@ def prepare_task_schema_action_info(app: BaseApp):
203203

204204
with open("config.jsonc") as fp:
205205
jsonc_str = fp.read()
206+
# Strip out comments denoted by // to leave a valid JSON file
206207
json_str = re.sub(
207208
r'\/\/(?=([^"]*"[^"]*")*[^"]*$).*', "", jsonc_str, flags=re.MULTILINE
208209
)
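The comment-stripping step above can be tried in isolation. The following is a minimal sketch that reuses the same regular expression; the helper name `strip_jsonc_comments` and the sample input are hypothetical and are not part of `conf.py`.

```python
import json
import re

# Same pattern as in conf.py. The lookahead requires the text after `//`
# (up to a line end) to contain only balanced pairs of double quotes,
# which rules out a `//` that sits inside a string such as a URL.
COMMENT_PATTERN = r'\/\/(?=([^"]*"[^"]*")*[^"]*$).*'


def strip_jsonc_comments(jsonc_str: str) -> str:
    """Remove // comments so the remaining text can be parsed as JSON."""
    return re.sub(COMMENT_PATTERN, "", jsonc_str, flags=re.MULTILINE)


# Hypothetical sample input, not the project's real config.jsonc.
sample = """{
  // a full-line comment
  "machine": "laptop", // a trailing comment
  "url": "http://example.com"
}"""

config = json.loads(strip_jsonc_comments(sample))
print(config)
```

In this sample the two comments are removed, while the `//` inside the URL string is kept because an odd number of quotes follows it.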
@@ -226,6 +227,8 @@ def prepare_task_schema_action_info(app: BaseApp):
226227

227228
# distribution name (i.e. name on PyPI):
228229
with open("../../pyproject.toml") as fp:
230+
dist_name = tomlkit.load(fp)["tool"]["poetry"]["name"]
231+
supported_python_versions = tomlkit.load(fp)["tool"]["poetry"]["dependencies"]["python"]
229232
pyproject_config = tomlkit.load(fp)
230233
dist_name = pyproject_config["tool"]["poetry"]["name"]
231234
supported_python = pyproject_config["tool"]["poetry"]["dependencies"]["python"]

docs/source/user/tutorials/index.rst

Lines changed: 2 additions & 0 deletions
@@ -4,3 +4,5 @@ Tutorials
44
.. toctree::
55
:maxdepth: 1
66

7+
Beginner: Install MatFlow on your local machine <install-locally>
8+
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
1+
.. jinja:: first_ctx
2+
3+
################################################
4+
Tutorial: Install {{ app_name }} on your local machine
5+
################################################
6+
7+
This tutorial will guide you through installing {{ app_name }} on your local machine (laptop or desktop), then creating and running some example workflows.
8+
This tutorial is intended for users who are new to {{ app_name }} and want to understand the setup and terminology.
9+
Most workflows used in your research will be too large to run on your local machine,
10+
but this tutorial will help you understand the basics of how {{ app_name }} works before you move to setting it up on a cluster.
11+
12+
Step 1: Set up a Python environment
13+
====================================
14+
15+
The first step is to set up a Python environment on your local machine.
16+
17+
**If you have not already installed Python**, you can download the latest version of Python from the `Python website <https://www.python.org/downloads/>`_.
18+
Follow the instructions on the website for your operating system.
19+
20+
**If you have already installed Python**, you can check the version of Python installed on your machine by running
21+
``python --version``.
22+
23+
Check that your version matches one of the ones supported by {{ app_name }}.
24+
You can find the supported Python versions in the :ref:`installation instructions <def_python_versions>`.
25+
If your version is not supported, you may need to update to a newer version of Python.
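The version check can also be done from inside Python. This is a minimal sketch; the bounds `MIN_VERSION` and `MAX_VERSION` are hypothetical placeholders, so substitute the range given in the installation instructions.

```python
import sys

# Hypothetical bounds for illustration only; take the real range from the
# installation instructions.
MIN_VERSION = (3, 9)   # inclusive
MAX_VERSION = (3, 13)  # exclusive


def is_supported(version: tuple = sys.version_info[:2]) -> bool:
    """Return True if a (major, minor) version lies in the assumed range."""
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION


print("Running Python", ".".join(map(str, sys.version_info[:3])))
print("Within the assumed supported range:", is_supported())
```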
26+
27+
Next, you will need to set up a virtual environment to install {{ app_name }} and its dependencies.
28+
A virtual environment is a self-contained directory that contains a particular version of Python together with all the libraries and dependencies you install.
29+
This allows you to install packages without affecting the system Python installation or other projects,
30+
and when you run a command inside that environment you are certain which versions are being used.
31+
32+
To create a virtual environment, you can use the `venv <https://docs.python.org/3/library/venv.html>`_ module that comes with Python.
33+
Follow the instructions in the `Python Packaging Guide <https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/#create-and-use-virtual-environments>`_ to create and activate a virtual environment.
34+
The convention is to call your environment ``.venv``, but you can call it whatever you like.
35+
We recommend calling it ``{{ app_module }}-env`` to make it clear that this environment is for {{ app_name }}.
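If you prefer to create the environment from Python rather than from the command line, the standard-library `venv` module exposes the same functionality. The sketch below wraps it in a hypothetical helper, `create_env`; the directory name `matflow-env` in the usage note is only an example.

```python
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # venv.create builds the directory layout and, if requested, installs pip.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling `create_env("matflow-env")` creates the environment in the current directory. You still need to activate it from your shell afterwards, as described in the Python Packaging Guide.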
36+
37+
When the environment is activated, you should see the name of the virtual environment in brackets in your terminal prompt.
38+
Whenever you are working with Python in the terminal, you can check whether it is using your system installation of Python or a virtual environment by running ``which python`` (or ``where python`` in the Windows Command Prompt).
39+
This will print out the path to the Python executable it is calling, so currently the path should be inside the virtual environment folder you just created.
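The same check can be made from within Python, which works on any operating system. This is a minimal sketch for environments created with `venv`; the function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```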
40+
41+
Step 2: Install {{ app_name }}
42+
=======================
43+
44+
Once you have created and activated a Python environment (check for the environment name in brackets in your prompt), you can install {{ app_name }} using pip by running
45+
``pip install {{ dist_name }}``.
46+
47+
This will install the latest version of {{ app_name }} from the Python Package Index (PyPI), and all the dependencies it needs.
48+
Once it has finished, check that {{ app_name }} has been installed correctly by running
49+
``{{ app_module }} --version``.
50+
51+
This should print the version of {{ app_name }} that you have installed.
52+
If you see an error message saying it doesn't recognise "{{ app_module }}" as a command name, check that you have activated the correct virtual environment and that you have installed {{ app_name }} correctly.
53+
54+
Step 3: Configure {{ app_name }} for your machine
55+
=================================================
56+
57+
Now that you have installed {{ app_name }}, you need to set it up for your machine.
58+
{{ app_name }} uses a configuration file to store information about the machine you are running on, such as the number of cores available and the locations of important folders.
59+
This will be stored in your user home directory so that it can be read by {{ app_name }} no matter what project you are working on, or what folder you are working in.
60+
61+
The configuration file is called ``config.yml`` and is stored in the ``~/.{{ app_module }}-new`` directory (``~`` is a shortcut for your user home directory, and the ``.`` at the start of the directory name indicates that it is hidden).
62+
When you first install {{ app_name }}, the directory and file will not exist.
63+
You can either create them yourself or run ``{{ app_module }} init`` to create the ``~/.{{ app_module }}-new`` directory and a ``config.yml`` file inside it with the minimum default settings.
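To confirm whether the configuration file already exists, you can look for it from Python. This is a minimal sketch that assumes the directory is `~/.matflow-new`, the path that appears in the example configuration in this commit; the helper `find_config` is hypothetical.

```python
from pathlib import Path

# Assumed location, based on the paths mentioned in this tutorial.
DEFAULT_CONFIG_DIR = Path.home() / ".matflow-new"


def find_config(config_dir: Path = DEFAULT_CONFIG_DIR):
    """Return the path to config.yml if it exists, otherwise None."""
    config_file = config_dir / "config.yml"
    return config_file if config_file.is_file() else None


print(find_config() or "No config.yml found yet")
```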
64+
65+
Step 4: Define workflow
66+
========================
67+
68+
Now that you have installed {{ app_name }} and set up the configuration file, you can start defining :ref:`workflows <def_workflow>`.
69+
In {{ app_name }}, a workflow is defined in a YAML file: a text file that describes the steps in the workflow and the parameters for each step.
70+
The workflow file is stored in the directory where you want to run the workflow.
71+
72+
Step 5: Run the workflow
73+
========================
74+
75+
Once you have defined the workflow, you can run it using the command
76+
``{{ app_module }} go <workflow_file>``.
77+
78+
Step 6: Monitor the workflow
79+
============================
80+
81+
You can monitor the progress of the workflow by running
82+
``{{ app_module }} show``.
83+
This will show you the status of each step in the workflow, including whether it is running, completed, or failed.
84+
You can also view the log files generated during the run by running
85+
``{{ app_module }} logs <workflow_file>``.
86+
This will show you the log files for each step in the workflow, including any error messages or warnings that were generated during the run.
87+
88+
89+
Step 7: View the results
90+
========================
91+
92+
Once the workflow has finished running, you can view the results in the output directory specified in the workflow file.
93+
The output directory will contain the results of each step in the workflow, as well as any log files generated during the run.
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
template_components:
  task_schemas:
  - objective: process_some_data
    inputs:
    - parameter: input_data
    outputs:
    - parameter: parsed_output
    actions:
    - input_file_generators:
      - input_file: my_input_file
        from_inputs:
        - input_data
        script: <<script:/path/to/generate_input_file.py>>
      environments:
      - scope:
          type: any
        environment: python_env
      script_exe: python_script
      script: <<script:/path/to/process_input_file.py>>
      save_files:
      - processed_file
      output_file_parsers:
        parsed_output:
          from_files:
          - my_input_file
          - processed_file
          script: <<script:/path/to/parse_output.py>>
          save_files:
          - parsed_output

  - objective: process_data_without_input_file_generator
    inputs:
    - parameter: input_data
    - parameter: path
    actions:
    - script: <<script:/path/to/generate_input_file.py>>
      script_data_in: direct
      script_exe: python_script
      save_files:
      - my_input_file
      environments:
      - scope:
          type: any
        environment: python_env
    - script: <<script:/path/to/process_input_file.py>>
      script_exe: python_script
      environments:
      - scope:
          type: any
        environment: python_env
      save_files:
      - processed_file

  command_files:
  - label: my_input_file
    name:
      name: input_file.json
  - label: processed_file
    name:
      name: processed_file.json
  - label: parsed_output
    name:
      name: parsed_output.json


tasks:
- schema: process_some_data
  inputs:
    input_data: [1, 2, 3, 4]
- schema: process_data_without_input_file_generator
  inputs:
    input_data: [1, 2, 3, 4]
    path: input_file.json
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
configs:
  default:
    invocation:
      environment_setup:
      match: {}
    config:
      machine: YOUR-MACHINE-NAME
      log_file_path: logs/<<app_name>>_v<<app_version>>.log
      environment_sources: [~/.matflow-new/envs_local.yaml]
      task_schema_sources: []
      command_file_sources: []
      parameter_sources: []
      default_scheduler: direct
      default_shell: bash
      schedulers:
        direct:
          defaults: {}
      shells:
        bash:
          defaults: {}
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
name: temp_python_env
# Any setup steps e.g. loading a module, activating a virtual environment can go here
setup: source venv/bin/activate
# There might be multiple executables in your environment
# e.g. python, abaqus, etc
executables:
  # It's probably a good idea to stick with `python_script` for any python
  # executables for compatibility with existing tasks which you
  # might want to call in your workflow which will expect this label
  - label: python_script
    instances:
      - command: python <<script_name>> <<args>>
        num_cores: 1
        parallel_mode: null
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
import json
def generate_input_file(path: str, input_data: list):
    """Generate an input file"""
    with open(path, "w") as f:
        json.dump(input_data, f, indent=2)
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
def greet(greeting: str, name: str):
    """Return a greeting"""
    return {"string_to_print": f"{greeting}, {name}!"}
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
template_components:
  task_schemas:
  - objective: s1
    inputs:
    - parameter: p1
    outputs:
    - parameter: p2
    actions:
    - commands:
      - command: echo $(( <<parameter:p1>> + 1 )) # This is printed to stdout
      - command: echo $(( <<parameter:p1>> + 1 )) # This is captured as p2
        stdout: <<int(parameter:p2)>>
  - objective: s2
    inputs:
    - parameter: p2
      group: my_group
    outputs:
    - parameter: p3
    actions:
    - commands:
      - command: echo <<parameter:p2>> # This one is printed to stdout
      - command: echo $(( <<sum(parameter:p2)>> )) # This is captured as p3
        stdout: <<parameter:p3>>
tasks:
- schema: s1
  sequences:
  - path: inputs.p1
    values: [1, 2]
  groups:
  - name: my_group
- schema: s2
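The arithmetic performed by this workflow can be traced by hand. The sketch below mirrors only the shell arithmetic in the two schemas (`p1 + 1` in `s1`, and the sum over the grouped `p2` values in `s2`); it is not how the tool itself executes the tasks.

```python
# Values given to the s1 task through its `inputs.p1` sequence.
p1_values = [1, 2]

# s1 runs once per value and captures `p1 + 1` as p2.
p2_values = [p1 + 1 for p1 in p1_values]

# s2 receives the grouped p2 values and captures their sum as p3.
p3 = sum(p2_values)

print(p2_values)  # [2, 3]
print(p3)  # 5
```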
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
template_components:
  task_schemas:
  - objective: greet
    inputs:
    - parameter: name
      default_value: World
    - parameter: greeting
      default_value: Hello
    actions:
    - commands:
      - command: echo "<<parameter:greeting>>, <<parameter:name>>!" > printed_string.txt

  - objective: python_greet
    inputs:
    - parameter: name
      default_value: World
    - parameter: greeting
      default_value: Hello
    outputs:
    - parameter: string_to_print
    actions:
    - script: <<script:/path/to/greet.py>>
      script_data_in: direct
      script_data_out: direct
      script_exe: python_script
      environments:
      - scope:
          type: any
        environment: python_env

  - objective: print
    inputs:
    - parameter: string_to_print
    actions:
    - commands:
      - command: echo "<<parameter:string_to_print>>" > printed_string.txt

  # This schema uses the environment `temp_python_env`
  # which loads a python venv.
  # This is shown in `envs.yaml` in this repo.
  - objective: which_python
    actions:
    - commands:
      - command: which python
      environments:
      - scope:
          type: any
        environment: temp_python_env

# Workflow
tasks:
- schema: greet
- schema: greet
  inputs:
    greeting: What's up
    name: doc
- schema: python_greet
  inputs:
    greeting: Howdy
    name: partner
- schema: print
- schema: print
  inputs:
    string_to_print: another string to print!
- schema: print
  # Explicitly reference output parameter from a task
  input_sources:
    string_to_print: task.python_greet
- schema: print
  input_sources:
    # Note that local variable will appear first, regardless of its position in the list
    string_to_print: [task.python_greet, local]
  inputs:
    string_to_print: Yet another string to print!
- schema: which_python
- schema: greet
  sequences:
  - path: inputs.greeting
    values:
    - hey
    - see ya later
    - in a while
    nesting_order: 0
  - path: inputs.name
    values:
    - you
    - alligator
    - crocodile
    nesting_order: 1
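The final `greet` task defines two sequences with different `nesting_order` values. The sketch below shows one way to picture the result, assuming that sequences with different nesting orders are combined as nested loops, with the lower `nesting_order` as the outer loop. That combination rule is an assumption made here for illustration and is not stated in the workflow file.

```python
from itertools import product

greetings = ["hey", "see ya later", "in a while"]  # nesting_order: 0 (assumed outer loop)
names = ["you", "alligator", "crocodile"]  # nesting_order: 1 (assumed inner loop)

# Under the nested-loop assumption, every greeting is paired with every name.
combinations = list(product(greetings, names))
for greeting, name in combinations:
    print(f"{greeting}, {name}!")
```

Under this assumption the task would expand to nine greeting and name pairs.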
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
import json
def parse_output(my_input_file: str, processed_file: str):
    """Do some post-processing of data files.

    In this instance, we're just making a dictionary containing both the input
    and output data.
    """
    with open(my_input_file, "r") as f:
        input_data = json.load(f)
    with open(processed_file, "r") as f:
        processed_data = json.load(f)

    combined_data = {"input_data": input_data, "output_data": processed_data}
    # Save file so we can look at the data
    with open("parsed_output.json", "w") as f:
        json.dump(combined_data, f, indent=2)

    return {"parsed_output": combined_data}
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
import json
def process_input_file():
    """Process an input file.

    This could be a materials science simulation for example.
    """
    with open("input_file.json", "r") as f:
        data = json.load(f)
    data = [item * 2 for item in data]
    with open("processed_file.json", "w") as f:
        json.dump(data, f, indent=2)
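To see how the three scripts fit together, the sketch below chains simplified versions of them in a temporary directory, outside of any workflow. The function `run_pipeline` is hypothetical and adapts the scripts to take an explicit working directory; the file names match those used in the scripts.

```python
import json
import tempfile
from pathlib import Path


def run_pipeline(work_dir: Path, input_data: list) -> dict:
    """Mimic the three scripts in sequence inside work_dir."""
    input_file = work_dir / "input_file.json"
    processed_file = work_dir / "processed_file.json"

    # Step 1 (generate_input_file.py): write the inputs to a JSON file.
    input_file.write_text(json.dumps(input_data, indent=2))

    # Step 2 (process_input_file.py): double every item.
    data = json.loads(input_file.read_text())
    processed_file.write_text(json.dumps([item * 2 for item in data], indent=2))

    # Step 3 (parse_output.py): combine inputs and outputs in one dictionary.
    return {
        "input_data": json.loads(input_file.read_text()),
        "output_data": json.loads(processed_file.read_text()),
    }


with tempfile.TemporaryDirectory() as tmp:
    result = run_pipeline(Path(tmp), [1, 2, 3, 4])

print(result)
```

With the input `[1, 2, 3, 4]` used in the example workflow, the processed data is `[2, 4, 6, 8]`.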
