Document schema validation order #982


Merged

49 commits
8bc3387
docs: document schema validation order
matthias-pichler Aug 21, 2024
d33c030
Update dsl.md
matthias-pichler Aug 29, 2024
bac10de
Update dsl.md
matthias-pichler Aug 29, 2024
d5cf2e1
Update dsl.md
matthias-pichler Aug 29, 2024
7c5f93c
Update dsl.md
matthias-pichler Aug 29, 2024
e446cba
Update dsl.md
matthias-pichler Aug 29, 2024
cc76e44
Update dsl.md
matthias-pichler Aug 29, 2024
1f2e32a
Update dsl.md
matthias-pichler Aug 29, 2024
672f3e1
Update dsl.md
matthias-pichler Aug 29, 2024
82f22b8
Update dsl.md
matthias-pichler Aug 29, 2024
bb21b64
Update dsl.md
matthias-pichler Aug 29, 2024
f99660e
- Refactor the OAuth2 authentication policy
cdavernas Aug 14, 2024
1f99242
Fixes examples
cdavernas Aug 14, 2024
6c5078e
Update dsl-reference.md
cdavernas Aug 14, 2024
9c35228
Update dsl-reference.md
cdavernas Aug 14, 2024
14f99f3
Added examples
cdavernas Aug 21, 2024
565de72
Fixed `call`, `raise` and `try` features
cdavernas Aug 21, 2024
5155668
Adds managing-github-issues and managing-ev-charging-stations use cases
cdavernas Aug 22, 2024
a8123e6
- Documented the difference between an event-driven schedule and a st…
cdavernas Aug 21, 2024
3ada408
Added an event-driven subsection to scheduling to explain expected wo…
cdavernas Aug 21, 2024
7bfaf5a
[NO-ISSUE] Fix: Change CI to always kick validation
ricardozanini Aug 21, 2024
e8449e3
Update dsl.md
cdavernas Aug 21, 2024
aaa81c7
Update dsl.md
cdavernas Aug 21, 2024
aa35fbd
Update dsl.md
cdavernas Aug 21, 2024
5e112d3
Update dsl.md
cdavernas Aug 21, 2024
92d845c
Update dsl.md
cdavernas Aug 21, 2024
8ef3540
Completes the EV charging station use case
cdavernas Aug 25, 2024
805ce49
- Added a new README.md to the examples directory
cdavernas Aug 25, 2024
82cf7ab
Fixes the schema and docs to make the `error.instance` property optio…
cdavernas Aug 25, 2024
c8e094a
Fixed the examples
cdavernas Aug 25, 2024
f8fd119
Fixed examples
cdavernas Aug 25, 2024
155a76f
Fixed examples
cdavernas Aug 25, 2024
0bbc40a
Fixes the `httpCall.with.body` to not restrict its type to `object`
cdavernas Aug 25, 2024
70d5d1d
Added new examples
cdavernas Aug 25, 2024
dc371ca
Fix examples
cdavernas Aug 25, 2024
7ae6cbe
Fixed the schema to allow ISO 8601 duration expressions
cdavernas Aug 25, 2024
2b5d4cf
Added a new use case
cdavernas Aug 26, 2024
a27612b
Update examples/README.md
cdavernas Aug 26, 2024
e057780
Update examples/README.md
cdavernas Aug 26, 2024
ea13e79
Update examples/README.md
cdavernas Aug 26, 2024
f7cab2b
Update use-cases/README.md
cdavernas Aug 26, 2024
dbcf908
Update use-cases/automated-data-backup/README.md
cdavernas Aug 26, 2024
60d8dc1
Update use-cases/README.md
cdavernas Aug 26, 2024
ec84165
Update use-cases/managing-ev-charging-stations/README.md
cdavernas Aug 26, 2024
35f3878
Update use-cases/managing-github-issues/README.md
cdavernas Aug 26, 2024
a1c1f8b
Update use-cases/managing-github-issues/README.md
cdavernas Aug 26, 2024
1965a84
Update use-cases/multi-agent-ai-content-generation/README.md
cdavernas Aug 26, 2024
d48bc08
Merge branch 'main' into schema-validation-order
matthias-pichler Aug 29, 2024
7bd19f4
Merge branch 'main' into schema-validation-order
matthias-pichler Aug 30, 2024
32 changes: 18 additions & 14 deletions dsl-reference.md
@@ -1589,17 +1589,17 @@ Represents the definition of the parameters that control the randomness or variability

### Input

Documents the structure - and optionally configures the filtering of - workflow/task input data.
Documents the structure - and optionally configures the transformation of - workflow/task input data.

It's crucial for authors to document the schema of input data whenever feasible. This documentation empowers consuming applications to provide contextual auto-suggestions when handling runtime expressions.

When set, runtimes must validate input data against the defined schema, unless defined otherwise.
When set, runtimes must validate raw input data against the defined schema before applying transformations, unless defined otherwise.

#### Properties

| Property | Type | Required | Description |
|----------|:----:|:--------:|-------------|
| schema | [`schema`](#schema) | `no` | The [`schema`](#schema) used to describe and validate input data.<br>*Even though the schema is not required, it is strongly encouraged to document it, whenever feasible.* |
| schema | [`schema`](#schema) | `no` | The [`schema`](#schema) used to describe and validate raw input data.<br>*Even though the schema is not required, it is strongly encouraged to document it, whenever feasible.* |
| from | `string`<br>`object` | `no` | A [runtime expression](dsl.md#runtime-expressions), if any, used to filter and/or mutate the workflow/task input. |

#### Examples
@@ -1610,9 +1610,16 @@
```yaml
schema:
  document:
    type: object
    properties:
-      petId:
-        type: string
-    required: [ petId ]
+      order:
+        type: object
+        required: [ pet ]
+        properties:
+          pet:
+            type: object
+            required: [ id ]
+            properties:
+              id:
+                type: string
from: .order.pet
```
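
To make the new ordering concrete, consider some hypothetical data (illustrative only, not part of the spec): validation runs against the full raw `order` object first, and only then does `from: .order.pet` select the pet that is handed onward.

```yaml
# Hypothetical raw input, validated against the schema above BEFORE transformation:
order:
  pet:
    id: pet-1

# After `from: .order.pet` evaluates, the transformed input passed onward is:
# { "id": "pet-1" }
```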

@@ -1622,7 +1629,7 @@ Documents the structure - and optionally configures the transformations of - workflow/task output data.

It's crucial for authors to document the schema of output data whenever feasible. This documentation empowers consuming applications to provide contextual auto-suggestions when handling runtime expressions.

When set, runtimes must validate output data against the defined schema, unless defined otherwise.
When set, runtimes must validate output data against the defined schema after applying transformations, unless defined otherwise.

#### Properties

@@ -1645,16 +1652,13 @@ output:
```yaml
      required: [ petId ]
  as:
    petId: '${ .pet.id }'
-export:
-  as:
-    '.petList += [ $task.output ]'
```
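
Note that the order here is the reverse of input handling: the `as` transformation runs first, and the schema then validates its result. With hypothetical data:

```yaml
# Hypothetical raw task output (before `output.as`):
pet:
  id: pet-1
  name: Snuffles
# `as` maps it to { "petId": "pet-1" }, and `output.schema` (which requires
# `petId`) validates that transformed result, not the raw output above.
```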

### Export

Certain tasks need to set the workflow context to save the task output for later usage. Users set the content of the context through a runtime expression. The result of the expression is the new value of the context. The expression is evaluated against the existing context.
Certain tasks need to set the workflow context to save the task output for later usage. Users set the content of the context through a runtime expression. The result of the expression is the new value of the context. The expression is evaluated against the transformed task output.

Optionally, the context might have an associated schema.
Optionally, the context might have an associated schema, against which the result of the expression is validated.

#### Properties

@@ -1668,13 +1672,13 @@ Optionally, the context might have an associated schema.
Merge the task output into the current context.

```yaml
-as: '.+$output'
+as: '$context+.'
```
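
For illustration, assuming jq semantics for `+` on objects (keys from the right-hand operand win) and hypothetical data:

```yaml
# If $context is { "petList": [] } and the transformed task output (.) is
# { "count": 1 }, then '$context+.' evaluates to:
# { "petList": [], "count": 1 }
```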

Replace the context with the task output.

```yaml
-as: $output
+as: '.'
```
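
The exported context can also be validated. A minimal sketch, assuming jq-style expressions; the `petList` field is hypothetical:

```yaml
export:
  schema:
    document:
      type: object
      required: [ petList ]
  # Appends the transformed task output to a (hypothetical) petList array;
  # the resulting context is what export.schema validates.
  as: '$context + { petList: (($context.petList // []) + [ . ]) }'
```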

### Schema
Expand Down
78 changes: 57 additions & 21 deletions dsl.md
@@ -181,98 +181,134 @@ Once the task has been executed, different things can happen:

### Data Flow

In Serverless Workflow DSL, data flow management is crucial to ensure that the right data is passed between tasks and to the workflow itself.

Here's how data flows through a workflow based on various transformation stages:

1. **Transform Workflow Input**
1. **Validate Workflow Input**
Before the workflow starts, the input data provided to the workflow can be validated against the `input.schema` property to ensure it conforms to the expected structure.
The execution only proceeds if the input is valid. Otherwise, it will fault with a [ValidationError (https://serverlessworkflow.io/spec/1.0.0/errors/validation)](dsl-reference.md#error).

2. **Transform Workflow Input**
Before the workflow starts, the input data provided to the workflow can be transformed to ensure only relevant data in the expected format is passed into the workflow context. This can be done using the top-level `input.from` expression. It evaluates the raw workflow input and defaults to the identity expression, which leaves the input unchanged. This step allows the workflow to start with a clean and focused dataset, reducing potential overhead and complexity in subsequent tasks. The result of this expression will be set as the initial value for the `$context` runtime expression argument and be passed to the first task.

*Example: If the workflow receives a JSON object as input, a transformation can be applied to remove unnecessary fields and retain only those that are required for the workflow's execution.*

2. **Transform First Task Input**
The input data for the first task can be transformed to match the specific requirements of that task. This ensures that the first task receives only the data required to perform its operations. This can be done using the task's `input.from` expression. It evaluates the transformed workflow input and defaults to the identity expression, which leaves the input unchanged. The result of this expression will be set as the `$input` runtime expression argument and be passed to the task. This transformed input will be evaluated against any runtime expressions used within the task definition.
After workflow input validation and transformation, the transformed input is passed as the raw input to the first task.

3. **Validate Task Input**
Before a task executes, its raw input can be validated against the `input.schema` property to ensure it conforms to the expected structure.
The execution only proceeds if the input is valid. Otherwise, it will fault with a [ValidationError (https://serverlessworkflow.io/spec/1.0.0/errors/validation)](dsl-reference.md#error).

*Example: If the first task is a function call that only needs a subset of the workflow input, a transformation can be applied to provide only those fields needed for the function to execute.*
4. **Transform Task Input**
The input data for the task can be transformed to match the specific requirements of that task. This ensures that the task receives only the data required to perform its operations. This can be done using the task's `input.from` expression. It evaluates the raw task input (i.e., the transformed workflow input for the first task or the transformed output of the previous task) and defaults to the identity expression, which leaves the input unchanged. The result of this expression will be set as the `$input` runtime expression argument and be passed to the task. This transformed input will be evaluated against any runtime expressions used within the task definition.

3. **Transform First Task Output**
After completing the first task, its output can be transformed before passing it to the next task or storing it in the workflow context. Transformations are applied using the `output.as` runtime expression. It evaluates the raw task output and defaults to the identity expression, which leaves the output unchanged. Its result will be input for the next task. To update the context, one uses the `export.as` runtime expression. It evaluates the raw output and defaults to the expression that returns the existing context. The result of this runtime expression replaces the workflow's current context and the content of the `$context` runtime expression argument. This helps manage the data flow and keep the context clean by removing any unnecessary data produced by the task.
*Example: If the task is a function call that only needs a subset of the workflow input, a transformation can be applied to provide only those fields needed for the function to execute.*

*Example: If the first task returns a large dataset, a transformation can be applied to retain only the relevant results needed for subsequent tasks.*
5. **Transform Task Output**
After completing the task, its output can be transformed before passing it to the next task or storing it in the workflow context. Transformations are applied using the `output.as` runtime expression. It evaluates the raw task output and defaults to the identity expression, which leaves the output unchanged. Its result will be input for the next task.

4. **Transform Last Task Input**
Before the last task in the workflow executes, its input data can be transformed to ensure it receives only the necessary information. This can be done using the task's `input.from` expression. It evaluates the transformed workflow input and defaults to the identity expression, which leaves the input unchanged. The result of this expression will be set as the `$input` runtime expression argument and be passed to the task. This transformed input will be evaluated against any runtime expressions used within the task definition. This step is crucial for ensuring the final task has all the required data to complete the workflow successfully.
*Example: If the task returns a large dataset, a transformation can be applied to retain only the relevant results needed for subsequent tasks.*

*Example: If the last task involves generating a report, the input transformation can ensure that only the data required for the report generation is passed to the task.*
6. **Validate Task Output**
After `output.as` is evaluated, the transformed task output is validated against the `output.schema` property to ensure it conforms to the expected structure. The execution only proceeds if the output is valid. Otherwise, it will fault with a [ValidationError (https://serverlessworkflow.io/spec/1.0.0/errors/validation)](dsl-reference.md#error).

5. **Transform Last Task Output**
After the last task completes, its output can be transformed before it is considered the workflow output. Transformations are applied using the `output.as` runtime expression. It evaluates the raw task output and defaults to the identity expression, which leaves the output unchanged. Its result will be passed to the workflow `output.as` runtime expression. This ensures that the workflow produces a clean and relevant output, free from any extraneous data that might have been generated during the task execution.
7. **Update Workflow Context**
To update the context, one uses the `export.as` runtime expression. It evaluates the transformed task output and defaults to the expression that returns the existing context. The result of this runtime expression replaces the workflow's current context and the content of the `$context` runtime expression argument. This helps manage the data flow and keep the context clean by removing any unnecessary data produced by the task.

*Example: If the last task outputs various statistics, a transformation can be applied to retain only the key metrics that are relevant to the stakeholders.*
8. **Validate Exported Context**
After the context is updated, the exported context is validated against the `export.schema` property to ensure it conforms to the expected structure. The execution only proceeds if the exported context is valid. Otherwise, it will fault with a [ValidationError (https://serverlessworkflow.io/spec/1.0.0/errors/validation)](dsl-reference.md#error).

6. **Transform Workflow Output**
Finally, the overall workflow output can be transformed before it is returned to the caller or stored. Transformations are applied using the `output.as` runtime expression. It evaluates the last task's output and defaults to the identity expression, which leaves the output unchanged. This step ensures that the final output of the workflow is concise and relevant, containing only the necessary information that needs to be communicated or recorded.
9. **Continue Workflow**
After the context is updated, the workflow continues to the next task in the sequence. The transformed output of the previous task is passed as the raw input to the next task, and the data flow cycle repeats.
If no more tasks are defined, the transformed output is passed to the workflow output transformation step.

10. **Transform Workflow Output**
Finally, the overall workflow output can be transformed before it is returned to the caller or stored. Transformations are applied using the `output.as` runtime expression. It evaluates the last task's transformed output and defaults to the identity expression, which leaves the output unchanged. This step ensures that the final output of the workflow is concise and relevant, containing only the necessary information that needs to be communicated or recorded.

*Example: If the workflow's final output is a summary report, a transformation can ensure that the report contains only the most important summaries and conclusions, excluding any intermediate data.*

11. **Validate Workflow Output**
After `output.as` is evaluated, the transformed workflow output is validated against the `output.schema` property to ensure it conforms to the expected structure. The execution only proceeds if the output is valid. Otherwise, it will fault with a [ValidationError (https://serverlessworkflow.io/spec/1.0.0/errors/validation)](dsl-reference.md#error).

By applying transformations at these strategic points, Serverless Workflow DSL ensures that data flows through the workflow in a controlled and efficient manner, maintaining clarity and relevance at each execution stage. This approach helps manage complex workflows and ensures that each task operates with the precise data required, leading to more predictable and reliable workflow outcomes.
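
When any of the validation stages above (steps 1, 3, 6, 8, or 11) fails, execution faults with the referenced ValidationError. Since DSL errors follow the Problem Details format, a runtime might raise something shaped like the sketch below; the `title`, `detail`, and `instance` values are purely illustrative, and `instance` is optional:

```yaml
type: https://serverlessworkflow.io/spec/1.0.0/errors/validation
status: 400
title: Validation Failed                                     # illustrative
detail: Workflow input is missing required property 'order'  # illustrative
instance: /input                                             # illustrative JSON pointer; optional
```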

Visually, this can be represented as follows:

```mermaid
flowchart TD

subgraph Legend
legend_data{{Data}}
legend_schema[\Schema/]
legend_transformation[Transformation]
legend_arg([Runtime Argument])
end

initial_context_arg([<code>$context</code>])
context_arg([<code>$context</code>])
input_arg([<code>$input</code>])
output_arg([<code>$output</code>])

workflow_raw_input{{Raw Workflow Input}}
workflow_input_schema[\Workflow: <code>input.schema</code>/]
workflow_input_from[Workflow: <code>input.from</code>]
workflow_transformed_input{{Transformed Workflow Input}}

task_raw_input{{Raw Task Input}}
task_input_schema[\Task: <code>input.schema</code>/]
task_input_from[Task: <code>input.from</code>]
task_transformed_input{{Transformed Task Input}}
task_definition[Task definition]
task_raw_output{{Raw Task output}}
task_output_as[Task: <code>output.as</code>]
task_transformed_output{{Transformed Task output}}
task_output_schema[\Task: <code>output.schema</code>/]
task_export_as[Task: <code>export.as</code>]
task_export_schema[\Task: <code>export.schema</code>/]

new_context{{New execution context}}

workflow_raw_output{{Raw Workflow Output}}
workflow_output_as[Workflow: <code>output.as</code>]
workflow_transformed_output{{Transformed Workflow Output}}
workflow_output_schema[\Workflow: <code>output.schema</code>/]

workflow_raw_input --> workflow_input_from
workflow_raw_input -- Validated by --> workflow_input_schema
workflow_input_schema -- Passed to --> workflow_input_from
workflow_input_from -- Produces --> workflow_transformed_input
workflow_transformed_input -- Set as --> initial_context_arg
workflow_transformed_input -- Passed to --> task_raw_input

subgraph Task

task_raw_input -- Passed to --> task_input_from
task_raw_input -- Validated by --> task_input_schema
task_input_schema -- Passed to --> task_input_from
task_input_from -- Produces --> task_transformed_input
task_transformed_input -- Set as --> input_arg
task_transformed_input -- Passed to --> task_definition

task_definition -- Execution produces --> task_raw_output
task_raw_output -- Passed to --> task_output_as
task_output_as -- Produces --> task_transformed_output
task_output_as -- Set as --> output_arg
task_transformed_output -- Passed to --> task_export_as
task_transformed_output -- Set as --> output_arg
task_transformed_output -- Validated by --> task_output_schema
task_output_schema -- Passed to --> task_export_as
task_export_as -- Produces --> new_context
new_context -- Validated by --> task_export_schema
end

task_transformed_output -- Passed as raw input to --> next_task

subgraph next_task [Next Task]
end

task_export_as -- Result set as --> context_arg
new_context -- Set as --> context_arg

next_task -- Transformed output becomes --> workflow_raw_output
workflow_raw_output -- Passed to --> workflow_output_as
workflow_output_as -- Produces --> workflow_transformed_output
workflow_transformed_output -- Validated by --> workflow_output_schema
```
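
To tie the stages together, here is a minimal end-to-end sketch. It is illustrative only: the `getPet` task, the petstore endpoint, and all data shapes are hypothetical, and the expressions assume the default jq dialect. Step 9 (continue) is simply the hand-off that would repeat steps 3 through 8 for each subsequent task.

```yaml
document:
  dsl: '1.0.0'
  namespace: examples                  # hypothetical
  name: data-flow-order                # hypothetical
  version: '0.1.0'
input:
  schema:                              # 1. validates the RAW workflow input
    document:
      type: object
      required: [ order ]
  from: .order                         # 2. transforms it; the result seeds $context and feeds the first task
do:
  - getPet:
      input:
        schema:                        # 3. validates the raw task input
          document:
            type: object
            required: [ petId ]
        from: '{ id: .petId }'         # 4. transforms it; the result is set as $input
      call: http
      with:
        method: get
        endpoint: https://petstore.example.com/pets/{id}   # hypothetical endpoint
      output:
        as: '{ pet: . }'               # 5. transforms the raw call result; set as $output
        schema:                        # 6. validates the TRANSFORMED task output
          document:
            type: object
            required: [ pet ]
      export:
        as: '$context + { lastPetId: .pet.id }'   # 7. new context, evaluated against the transformed output
        schema:                        # 8. validates the exported context
          document:
            type: object
            required: [ lastPetId ]
output:
  as: .pet                             # 10. transforms the last task's transformed output
  schema:                              # 11. validates the TRANSFORMED workflow output
    document:
      type: object
      required: [ id ]
```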

### Runtime Expressions