Skip to content

Expressions vs. string interpolation (<strike>paths</strike>) #285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jorgenj opened this issue Mar 5, 2021 · 80 comments
Closed

Expressions vs. string interpolation (<strike>paths</strike>) #285

jorgenj opened this issue Mar 5, 2021 · 80 comments
Labels
area: spec Changes in the Specification

Comments

@jorgenj
Copy link
Contributor

jorgenj commented Mar 5, 2021

What would you like to be added:

There are really 3 concerns at hand in this issue:

  1. In situations where a field expects an expression (ie; actionDataFilter.result) is it necessary for the expression text to be surrounded by ${}.
  2. In situations where the expected type of result of an expression is unknown, a user should be able to clearly indicate whether they expect string interpolation or json data to be returned (for example; specifying arguments to a function invocation)
  3. Some data-filter fields return data, and others return paths where other data should be merged (actionDataFilter.result vs. actionDataFilter.toStateData). These different concepts should be clarified, and in particular the "path" type fields should have some examples talking about the complex ways in which it's expected to work.

Fix all the places where an 'expression' is used to specify a location in a json structure where data should be inserted/merged.

Why is this needed:

Regarding string interpolation vs. workflow expressions, if both use-cases are basically a jq expression in a string surrounded by ${}, then there are situations where it's unclear which is intended. Imho, for fields where the purpose of the field is to accept an expression, the DSL should not require the ${}, and that ${} should be reserved to indicate string interpolation is expected.

Recent changes to actionDataFilter confusingly use 'expression' for 'toStateData' attribute, and there are presumably many other places in the spec that also need to be fixed.

Expressions (denoted by a string with ${...}) typically extract or transform some values. On the other hand, a property like actionDataFilter.toStateData is expecting the user to indicate a 'path' or location in the state data where the results should be inserted/merged. It doesn't make sense to call this an 'expression', or to have the user style it like an expression. The proposal is to fix the descriptions, and remove the surrounding ${} from all of the examples.

@jorgenj jorgenj mentioned this issue Mar 5, 2021
5 tasks
@tsurdilo tsurdilo added the area: spec Changes in the Specification label Mar 5, 2021
@tsurdilo
Copy link
Contributor

tsurdilo commented Mar 7, 2021

i dont really get this at all, but others maybe do so please chime in. for me its much less confusing just to say "if a string is supposed to be evaluated as an expression by the runtime put it inside ${ ... }".

@jorgenj
Copy link
Contributor Author

jorgenj commented Mar 8, 2021

See my comments here on 'interpolation':

#286 (comment)

@jorgenj
Copy link
Contributor Author

jorgenj commented Apr 14, 2021

Reposting this here for clarity:

Consider this example from bash:

export FOO=BAR
export ${FOO}=BAR

env

What do you expect the output of env to include?

Now consider the following example with how the spec currently handles this:

toStateData: ${ .foo }

Given the following action results:

baz: 1

Given the existing state data of:

foo: .bar

Please take a moment to consider how a user might think that they should be getting the following output because the expression ${.foo} on state data in other contexts would produce a string with the value .bar:

foo: .bar
bar: 
  baz: 1

Overall, I think there's something wrong about the way the spec is using the '${}'. In any other language I've seen, something like that is used for variable interpolation embedded in a string. In fact I think I saw examples where that's how it's being used (for example : Some string ${.value1} with ${.value2} used as a parameter for a function call).

I would propose the following be considered:

  • In situations where a string value should be produced, we support embedded interpolation of expressions via ${}, where the result will be a string representation of whatever is returned by the embedded expression.
    • toStateData is one such case, the result should be a string indicating a location in the stateData where the results should be inserted. An example value should be .foo, but we could allow interpolation like .foo.${.someOtherValue} which might produce the actual string .foo.bar. If that were the case, then the action results would be inserted at .foo.bar.
  • In situations where a json result should be produced, the user must specify a string containing the text of a jq expression to be evaluated, but it should not include the ${} because we're not producing a string and they're not doing variable interpolation.
    • ActionDataFilter would be an example here, it should produce a json value. The user should indicate that via some jq syntax like .foo (note that there's no ${} because we're not doing interpolation)

@tsurdilo & @neuroglia Could you please help me understand your point of view on this?

@jorgenj
Copy link
Contributor Author

jorgenj commented Apr 16, 2021

Here's an snippet from the brigadier example that I think illustrates why this is confusing the way ${} are being used:

- name: NextEventState
  type: event
  onEvents:
  - eventRefs:
    - nextEvent
    eventDataFilter:
      toStateData: "${ .nextEvent }"
    actions:
    - name: consoleLogAction
      functionRef:
        refName: consoleLogFunction
        arguments:
          log: "fired ${ .nextEvent.data.type } caused by ${ .nextEvent.data.cause.event }"

Imagine you're a user reading this spec for the first time and you see this example. Clearly the last line is trying to do some variable interpolation, but then...what's this toStateData supposed to return if these ${} indicate interpolation? A new user might guess that it's going to use the value returned from .nextEvent as the location in stateData to insert the filtered event data?

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

Not getting a lot of response here, so I'll try again with an example workflow definition where the expression language feature is very ambiguous on how it should work.

Given the following workflow definition

...
states:
- name: MyInjectState
  type: inject
  data: 
      a: 
         sub1: "foo"
      b:
      - "bar"
      - "baz"
      c: aString
  transition: MyOperationState
- name: MyOperationState
  type: operation
  actions:
  - functionRef
     refName: myFunction
     arguments:
       f1: ${.a}
       f2: ${.b}
       f3: interpolating ${.a}, ${.b} and ${.c}
  end: true

Can someone please explain what they think the payload to myFunction will contain?

@tsurdilo
Copy link
Contributor

Can you please give some explanation for the example:

  1. StateDataFilter has input/output properties which are type String and have to be expressions.
  2. are f1, f2, f3 example expressions for the stateDataFilter "output" property?

The result would always be the result of the expression applied to the associated input data

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

Shoot, yah, this is a bad example. My intent is to show an example with the ambiguous examples side-by-side but this was the wrong way to express that. Reworking the example.

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

@tsurdilo I updated the example to something that's hopefully more correct syntax.

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

Here's what I would expect as the content of the arguments sent to the function:

{
  "f1": { "sub1": "foo" },
  "f2": [ "bar", "baz" ],
  "f3": "interpolating { \"sub1\": \"foo\" },  [ \"bar\", \"baz\" ], and \"aString\""
}

Is this what others would expect?

@cdavernas
Copy link
Member

@jorgenj it's indeed the logical output that'd be produced with something equivalent in, say, js.

In jq, though, it would result in an error, because as discussed today, jq only supports explicit strings in such cases. Even concatenating a number fails without explicit cast. There probably are methods to do so, I guess: I'm no jq expert.

I agree however that this is confusing and might lead to unexpected/inconsistent behaviors across runtime.

IMHO, interpolation should be delegated to expression language, or be defined explicitly (aka with some kind of markup), or be removed entirely as a feature supported by the spec.

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

Thanks for your input Charles. I think I agree about the notion of potentially removing interpolation for now, I think there's some edge-cases that aren't clearly defined that will be confusing and like you hinted, it seems like a pretty complex feature for runtimes to implement.

@tsurdilo
Copy link
Contributor

tsurdilo commented May 10, 2021

@jorgenj @neuroglia the results are what the expression returns, no magic.

I would just use jq so there are no issues:

"interpolating ${.b | tostring} and ${.a | tostring} and ${.c | tostring}"

Idk why string concatenation would be too difficult to implement :(

if result of expression is not a string and you want to concatenate it with a string, that would be same as in any programming language (would cause an error)

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

@tsurdilo

Great, so given that example, here's the equivalent jq as you suggested:

jq '{ f1: .a, f2: .b, f3: ("interpolating "+ (.b | tostring) +" and "+(.a | tostring) + " and " + (.c | tostring)) }'

In this case, some of the fields got the raw json value assigned, and in others interpolation was done.

How is a user supposed to understand when interpolation will occur, vs. the raw value being used by the workflow?

Put another way, for f1 the user gave the expression "${.a}" and got an object, not the tostring of that object. What if they really wanted the tostring of that and not the actual object?

@tsurdilo
Copy link
Contributor

tsurdilo commented May 10, 2021

so in your example:

f1 is
{
"sub1": "foo"
}

f2 is:
[
"bar",
"baz"
]

and

f3:

"interpolating {"sub1":"foo"}, ["bar","baz"] and aString" (with escaped quotes idk how to write it here)

@tsurdilo
Copy link
Contributor

Great, so given that example, here's the equivalent jq as you suggested:

jq '{ f1: .a, f2: .b, f3: ("interpolating "+ (.b | tostring) +" and "+(.a | tostring) + " and " + (.c | tostring)) }'

In this case, some of the fields got the raw json value assigned, and in others interpolation was done.

How is a user supposed to understand when interpolation will occur, vs. the raw value being used by the workflow?

I think your expression returns a JSON value so its result would be JSON, so your parameter would have the JSON type

when you mix text and expressions its string concatenation so results of expressions should be string type ( | tostring)

hope this helps

@tsurdilo
Copy link
Contributor

im pretty sure you can use regex to see if on runtime end you have text mixed in with expressions (${ .... }) and then
just concatenate the static text with results of expressions.
maybe there is some problem i am not seeing

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

Well, one concern here is that there seem to be pitfalls here for users of the workflow language:

  • " ${ .foo }" - is this interpolation, since it's technically concatenating a string with an expression? The user might have done this on accident, or did they?
  • "${ .foo }" - Some users might be confused if this is interpolation or not, since this could be seen as the concatenation of empty string and an expression (ie; in jq "" + (.foo | tostring))

It also seems like there's a situation where a user might want to force interpolation, as in they expect the expression ${.a} to be the tostring of the object at .a.

@tsurdilo
Copy link
Contributor

we have to add expressions in string type properties i dont think there is another way.
these types of user errors i think can be caught during testing
i think for your first case (with leading space it would cause the return to be string type and could cause error if expression result is not. but users dont have do "force" they can just pipe it throgh jq tostring if they wish

i would suggest for users that are non techical to use just expression functiosn and have devels set up expressiosn for them in function definitions so they can just use them by name and not have to write them themselves

@tsurdilo
Copy link
Contributor

i honestly think that our DSL is by far superior to like other DSLs in this regard , and if you look at like bpmn its much much worse where you non technical users would have to write programming language code and all kinds of stuff can go wrong there. i think thats a problem in general with DSL that you kinda have to use an expression language. thats why we want to move to true DSL to get to a point where in control flow logic users can reference everything by logical names rather than having to type in expressions...we will get there eventually

@jorgenj
Copy link
Contributor Author

jorgenj commented May 10, 2021

@tsurdilo

Thanks for the context, I agree that the DSL here is pretty neat, I'm just trying to clear up some confusion, please bear with me for a second.

It seems like we generally have a common understanding of how interpolation should work so I'd like to then turn the question to the following example:

...
states:
- name: MyInjectState
  type: inject
  data: 
      a: .b
      b: foo
  transition: MyOperationState
- name: MyOperationState
  type: operation
  actions:
  - functionRef
     refName: myFunction
     arguments:
       ...
     actionDataFilter:
       toStateData: ${.a}
  end: true

Let's assume the function call there is returning the following payload:

"Hello world"

When that workflow completes, what would you expect the resulting stateData from MyOperationState to look like?

Given that we've established above that when a user sees dollar-curly-braces ("${.a}") that this evaluates an expression, then it stands to reason that some users would naturally be confused if the output of the above workflow is not the following:

{
  "a": ".b",
  "b": "Hello world"
}

Edited to add:

ie; the expression ${.a} evaluates to ".b", so it stands to reason that the stateDataFilter might assign the results to "b" in the stateData rather than "a".

@tsurdilo
Copy link
Contributor

tsurdilo commented May 11, 2021

@jorgenj

since .a already exists and has a value of type string, you can see "toStateData" as jq:

.a += <results of function invocation>

so in this case the state data should be:

{
  "a": ".bHello World",
  "b": "foo"
}

if the results of the function invocation is not a string, jq should raise you error (string and object cannot be added or something similar iirc)

@jorgenj
Copy link
Contributor Author

jorgenj commented May 11, 2021

@tsurdilo

Hmm, it seems like you're ignoring the fact that "${.a}" looks like a workflow expression. Are you suggesting that the toStateData field does not act like all other workflow expressions in the DSL?

@tsurdilo
Copy link
Contributor

tsurdilo commented May 11, 2021 via email

@tsurdilo
Copy link
Contributor

tsurdilo commented May 11, 2021

Hmm, it seems like you're ignoring the fact that "${.a}" looks like a workflow expression.

is it not ?

toStateData applies the action results and adds/merges it with with the state data value selected by the expression in this case it would be the top-level "a" element of state data.

Are you suggesting that the toStateData field does not act like all other workflow expressions in the DSL?

i dont understand this question
toStateData value is a selection expression , meaning that it selects an element of the state data (if it exists / or creates a new one if it doesnt)...so yes it has its own logic. it tells you which element of the state data to add/merge the action results with.

so idk if i am writing a runtime here or not but you could impl it with a jq expressions:

  1. if the selected state data element does not exist:

. += {"a": "value"}

  1. if the selected state data element does exist:

.a += "value"

This is a tiny bit more complicated when the selection expression in toStateData is not a top-level selection but perfectly doable imo

@jorgenj
Copy link
Contributor Author

jorgenj commented May 11, 2021

Hmm, it seems like you're ignoring the fact that "${.a}" looks like a workflow expression.

is it not ?

If you were to put the same expression string in stateDataFilter.output it would produce the string ".b" in my example but in this context it somehow instead tells the runtime to put the results at ".a"?

@jorgenj
Copy link
Contributor Author

jorgenj commented May 12, 2021

i dont really get this at all, but others maybe do so please chime in. for me its much less confusing just to say "if a string is supposed to be evaluated as an expression by the runtime put it inside ${ ... }".

This is back to your original response, the point I'm trying to make is that some of these fields are clearly not being treated as a jq expression to be evaluated against some data, for example with actionDataFilter.toStateData it's not evaluating an expression to produce data but rather it's just giving a location where some other data should be merged.

Expressions (denoted by a string with ${...}) typically extract or transform some values. On the other hand, a property like actionDataFilter.toStateData is expecting the user to indicate a 'path' or location in the state data where the results should be inserted/merged. It doesn't make sense to call this an 'expression', or to have the user style it like an expression. The proposal is to fix the descriptions, and remove the surrounding ${} from all of the examples.

In this case, by "all of the examples" I meant that things like toStateData should be fixed to not be styled as an expression, ie; change "${.foo}" to just ".foo".

Perhaps it'd help illustrate my point if I just push a PR with some of the examples being updated and we can review/discuss with some more concrete examples.

@cdavernas
Copy link
Member

cdavernas commented May 12, 2021

@jorgenj As far as I know, toStateData is a jq expression, too, and not simply a path. You actually can execute complex jq expressions at that point. Isn't that correct @tsurdilo ?

@tsurdilo
Copy link
Contributor

tsurdilo commented May 12, 2021

@cdavernas @jorgenj

As far as I know, toStateData is a jq expression, too, and not simply a path. You actually can execute complex jq expressions at that point. Isn't that correct @tsurdilo ?

It has to be an expression...thats the only way its possible to do anything productive with it. having just path is imo useful for custom, simple scenarios only.

a simples example i can come up with is lets say we have a switch state
and depending on data we either call operation state A which calls function X or operation state B which calls function Y, both being 3rd party services...
function X results put our state data as:

{
    "shoppers": [
        {
            "id": 111,
            "age": 15
        },
        {
            "id": 222,
            "age": 21
        }
]
}

and function Y results (since we dont control them really...3rd party remember) gives us a slightly different result

{
    "buyers": [
        {
            "id": 111,
            "age": 15
        },
        {
            "id": 222,
            "age": 21
        }
]
}

if later on in our workflow we want to add some event results or function results ..lets say add another person ..using just paths we have to hard-code "shoppers" or "buyers"...which can produce completely wrong results and even issues.

with using expressions we can deal with third party services different results (that we dont control), for example:

"toStateData": "${ (try .shoppers catch null) as $selected | if ($selected) then $selected else .buyers end }"

this basically says to select ".shoppers"..if shoppers is null (not and object) to use ".buyers"

do that with path only :)

there is a ton more examples like this that actually can solve real world scenarios.

hope this helps.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 7, 2021

It fails the requirement because it does not provide any inherent way to produce a string that is the result of evaluating an expression with no prefix or suffix, e.g. definitelyString: "${ .state.number }" would become definitelyString: 42 rather than the desired definitelyString: "42".

"numparam": "$expr .aNumber" fails this too (.aNumber can return an array/string/... ..but requires this super ugly way of writing it)
This is really something that users need to make sure they have it right during testing if possible and to make sure they do error checking and handle errors that they run into. this is same really thing with any workflow dsl imo

And it's also extremely counterintuitive to have the data type of a result change because of the presence vs. absence of characters surrounding what looks like string interpolation.

Why? its the same thing as having a method that returns Object and cast it depending on what you want (and it can throw exception if its not):

 String a = (String) getMyData(...);
 Object b = getMyData(....);

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 7, 2021

i think possibly it would be good to be able to provide schema definition for state data..that way we could enforce property types. not easy to do but can take a look

@manuelstein
Copy link
Contributor

manuelstein commented Jun 7, 2021

Unfortunately there's no type casting in JSON. It sounds okay, no? Anything that looks like "abcd$( )abcd" will be a string but "$( )" will be whatever is returned by the expression. Simple. But if I can reproduce the problem, I think you can't distinguish these two:

  • "​$()"
  • "$()"

The first one is "&#8203;$()" and the second is "$()". So, IIUC that's a problem when your workflow definition document uses UTF* and you start wondering why it renders the jq expression as a string although you don't see any additional characters. Not that I was using any other languages that require UTF - I trust that it does cause real problems and I hope that information is dependable. And I'd love to see a convincing argument that this sort of problem wouldn't appear, i.e. the problem that the user is by no means able to see why what is seen does not what it is supposed to do.

@gibson042
Copy link
Contributor

It fails the requirement because it does not provide any inherent way to produce a string that is the result of evaluating an expression with no prefix or suffix, e.g. definitelyString: "${ .state.number }" would become definitelyString: 42 rather than the desired definitelyString: "42".

"numparam": "$expr .aNumber" fails this too (.aNumber can return an array/string/... ..but requires this super ugly way of writing it)

No, the $expr … syntax would be an explicit means of opting in to type replacement that isn't functionality of "${ .aNumber }".

This is really something that users need to make sure they have it right during testing if possible and to make sure they do error checking and handle errors that they run into. this is same really thing with any workflow dsl imo

You're still missing the point. Your proposal does not provide any inherent way to produce a string that is the result of evaluating an expression with no prefix or suffix, and would therefore be a problem even if everyone liked the syntax.

And it's also extremely counterintuitive to have the data type of a result change because of the presence vs. absence of characters surrounding what looks like string interpolation.

Why? its the same thing as having a method that returns Object and cast it depending on what you want (and it can throw exception if its not):

 String a = (String) getMyData(...);
 Object b = getMyData(....);

It's not the same because you aren't providing any way to do the string cast!

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 7, 2021

No, the $expr … syntax would be an explicit means of opting in to type replacement that isn't functionality of "${ .aNumber }".

I don't see how it's just a way of expressing it
"$exp ....."
"${ ....}"
"@@.....@@@"

whats the point? Only one is that our syntax uses the ${ expression } expression holder syntax that is widely adopted and "$exp expression" is from i can tell something you came up with. Json Schema uses this yes but i think its meant for parameter names not as a regex inside values?

If we wanted to adopt something from Json Schema then I'd rather use the
#
syntax as for example used to link to definition elements in param values.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 7, 2021

It's not the same because you aren't providing any way to do the string cast!

what string cast ..the example i show above does a string cast and even has the capability to check errors better
To show again:

   "abc": "Hello ${.greeting}"

has a return type of string because it contains characters/spaces in addition to the expression. this imo makes things even simpler for runtimes as as soon as you eval an expression that is not string return type you can raise an error
you dont have to evaluate 100 expressions before u know the final type

@gibson042
Copy link
Contributor

gibson042 commented Jun 7, 2021

The same syntax cannot be used for both string interpolation and type replacement—something like "${ .aNumber }" can only do one or the other, resulting in either a string or a number. Assuming it does the former because that is the case for similar-looking "prefix:${ .aNumber }:suffix", a distinct syntax is necessary for type replacement—and I presented two general approaches above, "object with privileged property" (e.g., { "$expr": ".aNumber" }) and "string with privileged prefix" (e.g., "$expr .aNumber"), both of which can be subject to bikeshedding about the particular name of the property or specific characters of the prefix.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 8, 2021

@gibson042 @manuelstein @cdavernas
Been thinking about this for a while and I think the outcome is that using expressions inside DSL is "evil" and that we should actually move towards offloading expressions to the runtimes as much as possible.

So what I would propose is these simple rules:

  1. If a value of property contains any text, spaces idk what..its considered a string and is NOT considered an expression
    So things like

          "Hello ${.greeting}"
    

is a string ..nothing else.

  1. If a property value has a special syntax, meaning it starts with "${ and end with "}", or whatever other special chars we chose to do, it is considered a reference to an expression function.
    And yes if @gibson042 is hard-stuck on "$expr" type syntax we could do:

          "$expr Is Something Valid" 
    

even (as a replacement to example below)

A concrete example can look like this:

a) Top-level Function definitions:

        "functions": [{
                  "name": "Is Something Valid",
                  "operation": "isSomethingValid",
                  "type": "expression"
        }]

b) Then inside your states control flow logic:

        "myParam": "${ Is Something Valid }"

Since this string has that "special" expression format it itself cannot define an expression but can only be a reference to a single expression function.

This way runtimes can map
"operation": "isSomethingValid"
to whatever the heck they feel like. It could be a Class which contains static mapping to this name and executes whatever language they want.....or it can be a goroutine or whatever the runtimes dream up.

Runtimes can then also provide their own libraries where they can provide users with a list of pre-defined expressiosn that can be used if they want.

I think this is the best idea as if if its not this thing that we are talking about regarding expressions it'll be something else in the future.

Also this way we do no longer require even the use of "expressions" and i would change the function type "expression" to something else even like "customeval" or something.

That said tho, for users that DO want to use expression inside expression type functions, they can still do it.
But then still this eliminates the problem of this issue I believe as then you always use the return type of that expression definition in the "operation" property.

WDYT?

@jorgenj
Copy link
Contributor Author

jorgenj commented Jun 8, 2021

Are you suggesting that stateDataFilter/actionDataFilter/eventDataFilter no longer support expressions as well? Just trying to understand if this applies to any field anywhere in the spec where currently expressions are allowed, or only certain places?

A clarifying question, is your item 1 there explicitly saying that support for interpolation is being removed from the DSL all together?

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 8, 2021

Are you suggesting that stateDataFilter/actionDataFilter/eventDataFilter no longer support expressions as well? Just trying to understand if this applies to any field anywhere in the spec where currently expressions are allowed, or only certain places?

No, no parameters in spec would change. filters work as currently. Runtimes just have the choice again to use expressions or use programming language. So benefit there too.

A clarifying question, is your item 1 there explicitly saying that support for interpolation is being removed from the DSL all together?

Yes it is removed with this approach from the DSL. Runtimes can interpolate all they want tho..up to you.

@manuelstein
Copy link
Contributor

Runtimes can interpolate all they want tho..up to you.

Hmm, I would want a workflow to be portable and not have some unknown magic kick in on another runtime, so I don't think leaving it to the runtime is good for portability.

Regarding the question:

Are you suggesting that stateDataFilter/actionDataFilter/eventDataFilter no longer support expressions as well?

I had a similar question in mind. Can I transform the data like this:

"stateDataFilter": { "input": "${ .customers = (.customers|map(select(.active))) }" }

or does it always have to be a function? Why should the content prefixed by $expr or wrapped between ${ } not be an expression? Do you want to allow multiple expression languages maybe, because I think that was not part of this issue, but it would be convenient.

Regarding

If a property value has a special syntax, meaning it starts with "${ [...]
About that thing that we allow expressions in any property values (like the example that determines the version at runtime). I don't know if I we can stick with that degree of freedom. Do we allow a state like this:

{
    "name": "${ \"MyWorkflowNameWith-\\(.nameSuffix)\" }",
    "type": "operation",
    "actionMode": "${ .actionMode }",
    "actions": "${ [range(4)] | map({ \"functionRef\": \"func\\(.)\" }) }",
    "end": true
}

or even allow to create the entire list of states from a complex expression?
We need to put some limits to where expressions can appear, otherwise we might get some workflows that mutate along their execution and it causes more confusion about when and how often the expressions should be evaluated, e.g. timeouts: if my action timeout depends on workflow data and that workflow data changes between actions, e.g. is increased by action results, do I have to evaluate the timeout expression only when entering the state or before every action? It's all possible, in theory, but it's asking a lot from implementations to support such kind of sorcery and just putting it out there leaves a lot to interpretation.

Let's start with removing the confusion between interpolation and expressions.
We have proposals:

  • A) Remove interpolation and only allow expressions
    • A.1) Expressions are string values that start with $( and end on )
    • A.2) Expressions are string values prefixed by a token $expr
    • A.3) Expressions can only be used in workflow functions but can be referenced
  • B) Separate interpolation from expressions
    • B.1) Use $() for expressions and %() for interpolation only
    • B.2) Prefix the expressions "$expr .[] and bracket the interpolations "Hello %( .name ), how are you?"
  • C) Have a field type expression (like url has a special format or uuid) and use it on fields where expressions are used, namely
    • stateDataFilter.(input|output)
    • actionDataFilter.(fromStateData|results)
    • eventDataFilter.data
    • functionDef.operation (if type=expression)
      Also, we should distinguish them from the path type that points to a unique location in the state data (or that is used as the left hand side of the expression path = result):
    • actionDataFilter.toStateData
    • eventDataFilter.toStateData

In my opinion, only actions should be able to reference functions. Expressions are a different thing.
We could use expressions as action functions, but I don't think we should use any action function for expressions.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021 via email

@manuelstein
Copy link
Contributor

Ok, better to go one step at a time.
I tried to summarize the different proposals that were laid out.
I do understand that you want to do A. remove interpolation.
But I do not understand if you want A.1, A.2 or A.3 ? Is it A.1/2) allow expressions anywhere in the document with some markup or A.3) only allow expression language in the functions and only allow to reference functions in the document. Which one is it you're proposing?

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021

But I do not understand if you want A.1, A.2 or A.3 ? Is it A.1/2) allow expressions anywhere in the document with some markup or A.3) only allow expression language in the functions and only allow to reference functions in the document. Which one is it you're proposing?

Ah OK, good question, so what I am saying is let's say you have two expression function definitions:

"functions": [{
                  "name": "Customer Id",
                  "operation": "getCustomerId",
                  "type": "expression"
        },
        {
                  "name": "Customer Name",
                  "operation": ".customer.name",
                  "type": "expression"
        }]

string type properties then can either 1) be a string 2) reference an expression function with "special syntax", meaning it can be either "${ Customer Id }" or "$expr: Customer Name" or whatever is chosen (I prefer to keep it as it is now as its widely used syntax).

From runtime perspective they can treat the "operation" value of the expression function as they feel like. If they want to treat it as an expression given the provided "expressionLang" setting thats fine..if they want to treat it as a string that maps to some internal code in any programming language thats fair as well...either way its return type is the return type of the function and thus becomes the type that is plugged back into the property.
No interpolation, no more expressions inside property values.
It does put a strain on the users but its much more clear why that is than introducing some weird rules that noone can understand, using two or three different syntax in text values ..just confuses things,

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021

{
    "name": "${ \"MyWorkflowNameWith-\\(.nameSuffix)\" }",
    "type": "operation",
    "actionMode": "${ .actionMode }",
    "actions": "${ [range(4)] | map({ \"functionRef\": \"func\\(.)\" }) }",
    "end": true
}

or even allow to create the entire list of states from a complex expression?

@manuelstein expressions can only manipulate state data not the workflow definition.
yes you can "blow up" your state data with expressions but so can you with any programming language as well. Am I missing the point here?

@manuelstein
Copy link
Contributor

manuelstein commented Jun 9, 2021

expressions can only manipulate state data not the workflow definition.

There's an example in the section workflow expressions that modifies the workflow version attribute:

For example let's say that we have a workflow data input of:

{
   "inputVersion" : "1.0.0"
}

we can use this expression in the workflow "version" parameter:

{
   "id": "MySampleWorkflow",
   "name": "Sample Workflow",
   "version": "${ .inputVersion }",
   "specVersion": "0.7"
}

And of course, we want to use them to define the arguments of a functionRef, i.e. input to calling services.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021

@manuelstein good point but thats setting the values of the workflow definition, my understanding is you said you could arbitrarily add workflow states, the structure of the workflow def cannot be changed.

this uses in-line expressions in string type prop values. so we would change that to:

"version": "${ Get Input Version }" where you would have to define an expression function that deals with that.

yes this is a "pain" but after all this conversation it is a much cleaner solution than the stuff that was proposed imo. it enhances portability and reusability (you can also reference this "Get Input Version" function in other places) so +100 ;)

The only thing I would also add in the future is the ability for expression type functions to define which "data" they are to be evaluated against. For example it can either be "state data", or "Workflow data inputs" for properties that are not inside data filters, for those its pretty clear what data input they are to be evaluated against.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021

And just to add kinda like last point really...I do not think that what we currently have is really a problem.
Yeah I get it, it has some issues with this "interpolation" but i do also see some big players in this workflow DSL market not addressing it at all...at least we are :)

There are other types of possible solutions too. Since its not possible via JsonSchema to "tag" a property as an "expression type" property we could use a similar approach to what AsyncApi for example does with extension properties (by defining the "x-" prefix on them, we could do like "expr-abc" to tag them as such for example, for properties that we know must be expressions. this could completely eliminate the "special syntax"

So I understand in the end that there is changes requested and I will work on setting up examples to show before-after and let's take a look.

@manuelstein
Copy link
Contributor

Yes, exactly, having a property type (solution C) would eliminate the syntax.
To use "expr-numparam": ".aNumber" requires the schema to define xor on numparam and expr-numparam.
Richard's first suggestion to allow for a special object is just slightly different: "numparam": { "$expr": ".aNumber" }.

Yeah I get it, it has some issues with this "interpolation" but I do also see some big players in this workflow DSL market not addressing it at all...at least we are :)

Wow, that's really good input. So far, I think we have enough arguments that the simplification causes real-life problems, but you make a very good point here. @tsurdilo Could you point to one or two big players' cases to put it in the balance?

As I mentioned last week, I'm afraid I only know of cases that

  • deliberately break the syntax, e.g. Helm templating is non-conformant YAML with its {{ }}
  • so they apply only to arbitrary text documents, e.g. ERB-style templating <% %>

JQ has string interpolation with \( ), so leaving interpolation to the expression language sounds like a good option.

It only requires us to decide how to do expressions.
IIUC we can define JSONschema types that can take either the type or an expression object returning the type, So, we'd be good also casting-wise, because we know what is expected, e.g.

  "definitions": {
    "expression": {
      "type": "object",
      "properties": {
        "$expr": { "type": "string" }
      },
      "additionalProperties": false
    },
    "numberExpression": {
      "oneOf": [
        { "$ref": "#/definitions/expression" },
        { "type": "number" }
      ]
    }
  }

It aligns with how JSONschema uses the same approach to define the reference object type { "$ref": "..." }.

@gibson042
Copy link
Contributor

Yeah I get it, it has some issues with this "interpolation" but I do also see some big players in this workflow DSL market not addressing it at all...at least we are :)

Wow, that's really good input. So far, I think we have enough arguments that the simplification causes real-life problems, but you make a very good point here. @tsurdilo Could you point to one or two big players' cases to put it in the balance?

I'll second this, and specifically request an example of such a big player in which a value is a string in configuration but not necessarily a string at runtime, and a description of how that works. With respect to this issue, I suspect they are "not addressing it at all" because they don't have it—it doesn't exist if strings remain strings or are only used in contexts where the required data type is known and therefore conversion is well-defined.

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 9, 2021

Wow, that's really good input. So far, I think we have enough arguments that the simplification causes real-life problems, but you make a very good point here. @tsurdilo Could you point to one or two big players' cases to put it in the balance?

I'll second this, and specifically request an example of such a big player in which a value is a string in configuration but not necessarily a string at runtime, and a description of how that works. With respect to this issue, I suspect they are "not addressing it at all" because they don't have it—it doesn't exist if strings remain strings or are only used in contexts where the required data type is known and therefore conversion is well-defined.

I don't know what you guys want from me really :) just look at ASL expressions syntax, for example

       "TimestampPath": "$.expirydate",

This is really not how you request changes to any project. You address the issue with examples of whats broken and a PR on how to fix it in your opinion and then people can look at address concrete examples.
This has been so damn backwards and then ontop of that you want me to provide examples and references for you too for other DSLs? Do you want me to wipe your @@# too ? Do you own research (respectfully said :))

Where the heck is any sort of respect for this project? Go to Kubernetes or any other project and start these types of conversations without a PR and concrete tests and then see if anyone in their right mind will respond to your there. We went above and beyond trying to figure this out. Maybe you need to start thinking "Am I not explaining things in a way that its fully understandable to others" before anything else.

On the other hand with what I proposed I will do a PR and update all examples so you can nicely sit back and take a look at it and then put in all your comments on how you just don't like things w/o having a concrete proposal that actually works outside of your "world". Thanks ;)

@gibson042
Copy link
Contributor

gibson042 commented Jun 9, 2021

This is really not how you request changes to any project. You address the issue with examples of whats broken and a PR on how to fix it in your opinion and then people can look at address concrete examples.

I'm not even clear on what the current behavior is, https://github.com/serverlessworkflow/specification/blob/d36bcc5bd84e05c6ad904dd8918f6479653e1fdf/specification.md contains many examples that use "${ … }" strings but I don't see any having a prefix or a suffix outside the ${…} construct, nor an explanation of the precise corresponding semantics with respect to data type.

This has been so damn backwards and then ontop of that you want me to provide examples and references for you too? Do you own research (respectfully said :))

Respectfully, it's on you to back up your claim ("I do also see some big players in this workflow DSL market not addressing it at all"). But I checked anyway, and it turns out that ASL does address this, because the places in their structure which are dynamically interpreted are strings that are checked for prefixes $$ or $ (i.e., just like the "$expr …" pattern above!), and even further than that the name must end with .$ to indicate dynamic nature: https://states-language.net/spec.html#payload-template

@manuelstein manuelstein reopened this Jun 10, 2021
@manuelstein
Copy link
Contributor

To clarify the ASL example:

just look at ASL expressions syntax, for example

      "TimestampPath": "$.expirydate",

I understand it looks like interpolation, but it's not. The parameter TimestampPath only accepts Paths. If you want to set the timeout in wait_until with an ISO-8601 extended offset date-time format string, you'd have to use the parameter Timestamp.
ASL does not support string interpolation; it requires the expression language to interpolate strings. Please take a look at the String.format example in (Appendix B: List of Intrinsic Functions)[https://states-language.net/#appendix-b]. You'll find that, like @gibson042 explained, they're using a parameter name pattern, so any param that ends with .$ is evaluated as an expression. Also, ASL clearly distinguished between a path and an expression.

Personally, I think this has been text book so far: an issue has been raised, some convincing was done and it was triaged, so I'm going to assume for now that this has been verified as a real issue that needs solving.

Luckily, through the extended discussion, we've collected many options:

  1. markup/prefix ($( ), %( ), $expr, etc)
  2. JSONSchema style (using an object { "$expr": "" } as with $ref)
  3. parameter typing (having dedicated parameter names that accept expressions only, like ASL)

1. Markup/prefix

  • supports both interpolation and pure expression, but requires separate markups/prefixes for each
  • difficult to see which parameters support expressions and when/how to evaluate them (e.g. action timeout, top-level parameters outside states)
  • unclear if it should allow recursion, i.e. if the result of an expression returns an expression-formatted string
  • easiest to use

2. JSONSchema style

  • introduces extraneous object structure for expressions ({ "$expr": ... })
  • interpolation would require a different object ({ "$format": "Hi ${ .name }" })
  • is heavy on typing
  • JSONschema can express if it may be used on a parameter

3. parameter typing (like ASL)

  • adds a suffix or prefix on parameter names for expressions
  • interpolation would require a different suffix
  • not as extensible as 2.
  • JSONschema can express if it may be used on a parameter

@tsurdilo
Copy link
Contributor

Thanks all for your updates and great conversations!!!
As it is clear that this needs more time, given our project priorities for things we want to do and the amount of time we have already spent on this issue, we suggest that you do a PR and we can look at it when time allows.

Many thanks for the great and very insightful information that you have provided. It will definitely improve our project in the long term!!

@gibson042
Copy link
Contributor

How do you expect the project to improve by closing issues that are still unresolved?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: spec Changes in the Specification
Projects
None yet
Development

No branches or pull requests

5 participants