experimental.config.utils.schema: schema-aware hierarchical data processing #4279
Closed
2 of 9 tasks
Tracked by
#4505
Comments
Please, transfer the ticket to the doc team after the draft finalization.
Totktonada
added a commit
to Totktonada/tarantool
that referenced
this issue
Jun 26, 2024
The module is renamed from `internal.config.utils.schema` to `experimental.config.utils.schema` without changes.
It is useful for validation of configuration data in roles and applications. Also, it provides a couple of methods that aim to simplify usual tasks around processing of hierarchical configuration data. For example,
* get/set a nested value
* apply defaults from the schema
* filter data based on annotations from the schema
* transform a hierarchical data using a function
* merge two hierarchical values
* parse environment variable according to its type in the schema
See tarantool/doc#4279 for an in-depth description.
Fixes tarantool#10117
NO_DOC=tarantool/doc#4279
Totktonada
added a commit
to tarantool/tarantool
that referenced
this issue
Jul 3, 2024
The module is renamed from `internal.config.utils.schema` to `experimental.config.utils.schema` without changes.
It is useful for validation of configuration data in roles and applications. Also, it provides a couple of methods that aim to simplify usual tasks around processing of hierarchical configuration data. For example,
* get/set a nested value
* apply defaults from the schema
* filter data based on annotations from the schema
* transform a hierarchical data using a function
* merge two hierarchical values
* parse environment variable according to its type in the schema
See tarantool/doc#4279 for an in-depth description.
Fixes #10117
NO_DOC=tarantool/doc#4279
Totktonada
added a commit
to Totktonada/tarantool
that referenced
this issue
Jul 5, 2024
This commit implements the `<schema object>:set()` algorithm in a more accurate way and it solves several drawbacks of the previous implementation.
* It was impossible to set a field that is nested to a record or a map that has the box.NULL value (tarantool#10190).
* It was impossible to set a field to the box.NULL value (tarantool#10193).
* It was impossible to delete a field, now `nil` RHS value means the deletion (tarantool#10194).
Fixes tarantool#10190
Fixes tarantool#10193
Fixes tarantool#10194
NO_DOC=Included into tarantool/doc#4279
Totktonada
added a commit
to Totktonada/tarantool
that referenced
this issue
Jul 5, 2024
`<schema object>:get()` now can access a field inside an `any` type if it is a `table` or `nil`/`box.NULL`. `config:get()` now can access fields inside `app.cfg.<key>` and `roles_cfg.<key>`. Fixes tarantool#10205 NO_DOC=The `<schema object>:get()` update is included into tarantool/doc#4279. The `config:get()` reference on the website doesn't mention the constraint, so it doesn't need an update.
@sergos I've removed the @tarantool/doc The documentation request is ready to work on.
Totktonada
added a commit
to tarantool/tarantool
that referenced
this issue
Jul 22, 2024
`<schema object>:get()` now can access a field inside the `any` type if it is a `table` or `nil`/`box.NULL`. `config:get()` now can access fields inside `app.cfg.<key>` and `roles_cfg.<key>`. Fixes #10205 NO_DOC=The `<schema object>:get()` update is included into tarantool/doc#4279. The `config:get()` reference on the website doesn't mention the constraint, so it doesn't need an update.
Totktonada
added a commit
to tarantool/tarantool
that referenced
this issue
Jul 22, 2024
This commit implements the `<schema object>:set()` algorithm in a more accurate way and it solves several drawbacks of the previous implementation.
* It was impossible to set a field that is nested to a record or a map that has the box.NULL value (#10190).
* It was impossible to set a field to the box.NULL value (#10193).
* It was impossible to delete a field, now `nil` RHS value means the deletion (#10194).
Fixes #10190
Fixes #10193
Fixes #10194
NO_DOC=Included into tarantool/doc#4279
p7nov
added a commit
that referenced
this issue
Oct 22, 2024
Resolves #4279. Co-authored-by: Pavel Semyonov <[email protected]> Co-authored-by: Elena Shebunyaeva <[email protected]>
Main related dev. issue:
Other related dev. issues/commits:
`--help-env-list` CLI option tarantool#8933)
Product: Tarantool
Since: 3.2
Audience/target: module and application developers
Root document:
SME: @Totktonada
Details
Introduction
Tarantool has offered declarative configuration since 3.0.0. The configuration has a strictly defined schema, but some sections accept arbitrary values: `app.cfg.*` and `roles_cfg.*`. The author of an application or a role defines how these values are validated and processed.
Tarantool now offers a tool for validating and processing data using a declarative schema: the `experimental.config.utils.schema` module.
As the module name says, it has experimental status. The API may change in a backward-incompatible way. However, Tarantool developers are conservative regarding such changes in experimental modules.
Application and role developers are encouraged to use the schema module and provide feedback.
Brief API description
(This API description mostly repeats the one from tarantool/tarantool#8725 with some updates and enhancements. It is copied here primarily to ease reading and to have all the related documentation in one place.)
Schema node constructors:
* `schema.scalar()` -- define a scalar.
* `schema.record()` -- define a record (an object with certain field names and field value types).
* `schema.map()` -- define a map (an object with arbitrary key names, but strict about key and value types).
* `schema.array()` -- define an array.
Two supplementary schema node constructors are defined: `schema.enum()` and `schema.set()`.
The schema object constructor `schema.new()` wraps a schema node and adds a schema name and user-provided methods.
The schema object has the following methods:
* `:pairs()` -- traverse the schema.
* `:validate()` -- validate data against the schema.
* `:filter()` -- filter data based on the schema annotations.
* `:map()` -- map data based on the schema annotations.
* `:apply_default()` -- apply default values.
* `:get()`, `:set()` -- get/set a nested value.
* `:merge()` -- merge two values.
The following annotations are interpreted by the module itself:
* `allowed_values`
* `validate`
* `default`
* `apply_default_if`
Other annotations are ignored by the module and may be used for arbitrary purposes.
The module supports parsing of data from environment variables with conversion to appropriate data types (for example, converting `MAYVAR=3301` to a Lua number), parsing of comma-separated array items and key-value pairs (for example, `TT_REPLICATION_PEERS=localhost:3301,localhost:3302,localhost:3303`), and decoding of JSON values.
Detailed API description
Type system
There are scalar and composite types.
A scalar type accepts a primitive value (except the special scalar type `any`). The scalar types are listed in the next section.
A composite type accepts a collection of values. There are the following composite types.
`record` describes an object with certain fields of certain types.
It is an analogue of `struct` in C, `record` in Avro Schema and `message` in Protocol Buffers.
A record has the following constraints:
* The data is a `table`.
All the fields are optional if there are no additional constraints.
`map` describes an object with arbitrarily named keys (all of the same type) and values of the same type.
It is an analogue of `unordered_map` in C++, `map` in Avro Schema, `map` in Protocol Buffers.
A map has the following constraints:
* The data is a `table`.
`array` describes an ordered collection of elements of the same type.
It is an analogue of an array in C, `array` in Avro Schema, a `repeated` field in Protocol Buffers.
An array has the following constraints:
* The data is a `table`.
Scalars
| Scalar type | Accepted Lua type | Constraint |
| --- | --- | --- |
| `string` | `string` | |
| `number` | `number` | |
| `integer` | `number` | `x - math.floor(x) == 0` |
| `boolean` | `boolean` | |
| `string, number` | `string` or `number` | |
| `number, string` | `string` or `number` | |
| `any` | arbitrary | |

`any` accepts an arbitrary Lua type, including `table`. A scalar of the `any` type may be used to declare an arbitrary value that doesn't need any validation.
Schema node constructors: scalar, record, map, array
The module provides the following functions to create schema node objects.
A general structure of a schema node is the following.
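The original snippet is missing from this copy. A minimal sketch of the four constructors, assuming Tarantool 3.2+ and the call shapes `schema.scalar({type = ...})`, `schema.record(fields, annotations)`, `schema.map({key = ..., value = ...})` and `schema.array({items = ...})`; the `my_annotation` name is hypothetical:

```lua
-- Requires Tarantool 3.2+ (the module is experimental).
local schema = require('experimental.config.utils.schema')

-- A scalar node: the type plus arbitrary annotations.
local port = schema.scalar({type = 'integer', default = 3301})

-- A record node: named fields, annotations in the second argument.
local listen = schema.record({
    host = schema.scalar({type = 'string'}),
    port = port,
}, {
    my_annotation = 'hypothetical', -- not interpreted by the module
})

-- A map node and an array node.
local labels = schema.map({
    key = schema.scalar({type = 'string'}),
    value = schema.scalar({type = 'string'}),
})
local peers = schema.array({items = schema.scalar({type = 'string'})})

-- Each node is a plain table: `type`, structural fields, annotations.
print(port.type, listen.type, labels.type, peers.type)
```

The structural fields (`fields`, `key`, `value`, `items`) sit next to the annotations in the same table, which is why the module reserves those names.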
Derived schema node type constructors: enum, set
`schema.enum` accepts a string from the given set of allowed string values. It uses the `allowed_values` annotation, see `<schema object>:validate()` for details about this annotation.
`schema.set` accepts an array of unique string values, where each of the strings is from the given set of allowed string values.
It uses the `allowed_values` and `validate` annotations, see `<schema object>:validate()` for details about these annotations.
Schema object constructor: new
A schema node can be transformed into a schema object. Unlike a schema node, the schema object has a name, has the methods described below, and may have user-provided methods.
The given schema node is recursively copied, and computed fields are added to each schema node in the `computed` field. See the 'Computed annotations' section for details.
Indexing a schema object is performed as follows:
An example of a user-provided method:
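The original example is missing from this copy. A hedged sketch, assuming the third argument of `schema.new()` is an options table with a `methods` field; the `format_uri` method and the `listen` schema are hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local node = schema.record({
    host = schema.scalar({type = 'string'}),
    port = schema.scalar({type = 'integer'}),
})

-- `format_uri` is a hypothetical user-provided method. It receives
-- the schema object as `self`, so built-in methods are available.
local s = schema.new('listen', node, {
    methods = {
        format_uri = function(self, data)
            self:validate(data)
            return ('%s:%d'):format(data.host, data.port)
        end,
    },
})

local uri = s:format_uri({host = 'localhost', port = 3301})
print(s.name, uri)
```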
<schema object>:validate()
Validate the given data against the given schema.
The method performs recursive type checking. See the 'Type system' and 'Scalars' sections for details on how exactly it is performed for each of the given types.
Aside from the type checking, the method performs validation based on user-provided annotations. It is described below in this section.
Nuances:
* `schema.new('<...>', schema.scalar(<...>))` doesn't accept `nil` and `box.NULL`. However, fields of a record are optional, so they accept `nil` and `box.NULL`.
* `mt.__serialize` marks in the data are not involved anyhow.
* An array must not have holes (`nil` values in a middle).
Annotations taken into account:
* `allowed_values` (table) -- whitelist of values
* `validate` (function) -- schema node specific validator
The `validate` annotation is a user-provided function that accepts the following arguments.
* `data` -- the value to validate
* `w` -- walkthrough node with the `w.schema`, `w.path` and `w.error` fields
The user-provided `validate` function is called after all the type validation is done, including nested nodes, and also after the `allowed_values` check.
The contract is that the `validate` annotation raises an error to fail the check. There is a convenient `w.error()` helper that raises an error formatted with the schema name and the path to the current schema node.
Example:
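The original example is missing from this copy. A minimal sketch of both annotations, assuming the `validate(data, w)` argument order; the `server` schema and its fields are hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('server', schema.record({
    port = schema.scalar({
        type = 'integer',
        -- Called only after the type check has passed.
        validate = function(data, w)
            if data < 1 or data > 65535 then
                w.error(('port is out of range: %d'):format(data))
            end
        end,
    }),
    mode = schema.scalar({
        type = 'string',
        allowed_values = {'ro', 'rw'},
    }),
}))

-- Valid data passes silently.
s:validate({port = 3301, mode = 'rw'})

-- Invalid data raises an error, so wrap the call in pcall().
local ok_port, err = pcall(s.validate, s, {port = 70000})
local ok_mode = pcall(s.validate, s, {mode = 'unknown'})
print(ok_port, err, ok_mode)
```

The message is formatted before it is passed to `w.error()`, so the sketch doesn't depend on whether the helper accepts format arguments.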
The user-provided `validate` function may use computed annotations. See an example in the 'Computed annotations' section below.
Beware: the user-provided `validate` function is not called on a record's field that has a `nil`/`box.NULL` value. If an error should be raised on a missing field, add the `validate` annotation on the outer level (on the record, not on the field itself).
<schema object>:get()
Get nested data that is pointed to by the given path.
Important: the data is assumed to be already validated against the given schema.
The indexing is performed in the optional chaining manner (`'foo.bar'` works like `foo?.bar` in TypeScript).
The method checks the path against the schema: it doesn't allow using a non-existing field or indexing a scalar value.
The path is either an array-like table or a string in dot notation.
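A short sketch of the lookup rules above, assuming the call shape `<schema object>:get(data, path)`; the `cfg` schema is hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    db = schema.record({
        host = schema.scalar({type = 'string'}),
        port = schema.scalar({type = 'integer'}),
    }),
}))

local data = {db = {host = 'localhost'}}

-- Both path forms point to the same field.
local by_string = s:get(data, 'db.host')
local by_table = s:get(data, {'db', 'host'})

-- Optional chaining: a missing field or a missing parent gives nil.
local missing_field = s:get(data, 'db.port')
local missing_parent = s:get({}, 'db.port')

-- A path that is absent from the schema is an error.
local ok = pcall(s.get, s, data, 'db.unknown')
print(by_string, by_table, missing_field, missing_parent, ok)
```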
Nuances
* `nil`, `''`, `{}` in the `path` argument mean the root node: in other words, the method returns the `data` argument as is.
* A scalar of the `any` type can be indexed if it is a `table` or `nil`/`box.NULL`. In this case, the tail of the path that is inside the `any` type is not checked against the schema. Indexing `nil`/`box.NULL` always returns `nil`.
<schema object>:set()
Set the given `rhs` value at the given path in the `data`.
Important: `data` is assumed to be already validated against the given schema, but `rhs` is validated by the method before the assignment.
The method checks the path against the schema: it doesn't allow using a non-existing field or indexing a scalar value.
The path is either an array-like table or a string in dot notation.
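A minimal sketch of an assignment, assuming the call shape `<schema object>:set(data, path, rhs)` and in-place modification of a `data` table; the `cfg` schema is hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    db = schema.record({
        port = schema.scalar({type = 'integer'}),
    }),
}))

local data = {}

-- The intermediate `db` table is created on the way.
s:set(data, 'db.port', 3301)

-- `rhs` is validated before the assignment, so a wrong type fails
-- and the previous value stays in place.
local ok = pcall(s.set, s, data, 'db.port', 'not a number')
print(data.db.port, ok)
```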
Nuances
* A scalar of the 'any' type can't be indexed, even when it is a table. It is OK to set the whole value of the 'any' type. (This restriction will be relaxed in 3.2.0 in the scope of "config/schema: :set() can't assign/delete a field inside 'any' type", tarantool#10204.)
* A non-nil `rhs` value creates intermediate tables over the given `path` instead of `nil` or `box.NULL` values.
Field deletion
If `rhs` is `nil`, it means deletion of the pointed field.
How it works (in examples):
Intermediate tables are not created:
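The original examples are missing from this copy. A hedged sketch of the deletion behavior, with a hypothetical `myschema`; the second half of the snippet also exercises the rule about existing tables on the path:

```lua
local schema = require('experimental.config.utils.schema')

local myschema = schema.new('myschema', schema.record({
    foo = schema.record({
        bar = schema.scalar({type = 'string'}),
    }),
}))

-- Deleting a field under a missing parent leaves the data untouched:
-- the result is {}, not {foo = {}}.
local empty = {}
myschema:set(empty, 'foo.bar', nil)

-- Deleting an existing field keeps its parent table:
-- the result is {foo = {}}, not {}.
local filled = {foo = {bar = 'x'}}
myschema:set(filled, 'foo.bar', nil)

print(empty.foo, type(filled.foo), filled.foo.bar)
```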
Existing tables on the `path` are not removed:
<schema object>:filter()
Filter data based on the schema annotations.
Important: the data is assumed to be already validated against the given schema. (A fast type check is performed on composite types, but it is not recommended to rely on it.)
The user-provided filter function `f` receives the following table as the argument:
* `w.path` -- path to the schema node
* `w.schema` -- schema node at the given path
* `w.data` -- data at the given path
The filter function returns a boolean value that is interpreted as 'accepted' or 'not accepted'.
The user-provided function `f` is called for each schema node, including ones that have a `box.NULL` value (but not `nil`). A node of a composite type (record/map/array) is not traversed down if it has a `nil` or `box.NULL` value.
The `:filter()` function returns a luafun iterator over all `w` values accepted by the `f` function.
A composite node that is not accepted is still traversed down.
Examples:
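The original examples are missing from this copy. A minimal sketch that selects values by an annotation, assuming the call shape `<schema object>:filter(data, f)`; the `sensitive` annotation and the `cfg` schema are hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    user = schema.scalar({type = 'string'}),
    -- `sensitive` is a user annotation, ignored by the module itself.
    password = schema.scalar({type = 'string', sensitive = true}),
}))

local data = {user = 'admin', password = 'secret'}

-- Accept only nodes marked as sensitive, then turn each accepted
-- `w` into a dot-notation path using luafun's map() and totable().
local found = s:filter(data, function(w)
    return w.schema.sensitive == true
end):map(function(w)
    return table.concat(w.path, '.')
end):totable()

print(#found, found[1])
```

The record itself is also passed to `f`, but it has no `sensitive` annotation, so only the `password` node is accepted.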
Nuances
* `box.NULL` is treated as an existing value, so `f` is called for it. However, it is not called for `nil` values. See details below.
* `w.path` for a map key and a map value are the same. It seems we should introduce some syntax to point to a key in a map, but it is not implemented yet.
nil/box.NULL nuances explanation
Let's assume that a record defines three scalar fields: 'foo', 'bar' and 'baz'. Let's name the schema object that wraps the record `s`.
* `s:filter(nil, f)` calls `f` only for the record itself.
* `s:filter(box.NULL, f)` works in the same way.
* `s:filter({foo = box.NULL, bar = nil}, f)` calls `f` two times: for the record and for the 'foo' field.
This behavior is needed to provide the ability to handle `box.NULL` values in the data. It reflects the `pairs()` behavior on a usual table, so it looks quite natural.
<schema object>:map()
Transform data by the given function.
Leave the shape of the data unchanged.
Important: the data is assumed to be already validated against the given schema. (A fast type check is performed on composite types, but it is not recommended to rely on it.)
The user-provided transformation function receives the following three arguments in the given order:
* `data` -- value at the given path
* `w` -- walkthrough node, described below
* `ctx` -- user-provided context for the transformation function
The walkthrough node `w` has the following fields:
* `w.schema` -- schema node at the given path
* `w.path` -- path to the schema node
* `w.error` -- function that prepends a caller-provided error message with context information; use it for nice error messages
An example of the mapping function:
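The original example is missing from this copy. A hedged sketch of a mapping function that prefixes every string scalar, assuming the call shape `<schema object>:map(data, f, ctx)`; the `cfg` schema and the `prefix` context field are hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    name = schema.scalar({type = 'string'}),
    tags = schema.array({items = schema.scalar({type = 'string'})}),
    port = schema.scalar({type = 'integer'}),
}))

-- Touch only present string scalars; return everything else as is.
local function add_prefix(data, w, ctx)
    if w.schema.type == 'string' and data ~= nil then
        return ctx.prefix .. data
    end
    return data
end

local res = s:map({name = 'a', tags = {'x', 'y'}, port = 3301},
    add_prefix, {prefix = 'p-'})

print(res.name, res.tags[1], res.tags[2], res.port)
```

The guard on `w.schema.type` keeps the function safe regardless of which node kinds it is invoked for.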
The `:map()` method is recursive with certain rules:
* All record fields are traversed unconditionally, including ones with nil/box.NULL values. Even if the record itself is nil/box.NULL, its fields are traversed down (assuming their values are nil).
  It is important when the original data should be extended using some information from the schema: say, default values.
* It is not the case for a map and an array: nil/box.NULL fields and items are preserved as is, they're not traversed down. If the map or the array itself is nil/box.NULL, it is preserved as well.
  A map has no list of fields in the schema, so it is not possible to traverse it down. Similarly, an array has no item count in the schema.
The method attempts to preserve the original shape of values of a composite type:
Nuances
* `w.path` for a map key and a map value are the same. It seems we should introduce some syntax to point to a key in a map, but it is not implemented yet.
<schema object>:apply_default()
Apply default values from the schema.
Important: the data is assumed as already validated against the given schema. (A fast type check is performed on composite types, but it is not recommended to lean on it.)
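A minimal sketch of applying defaults through the `default` and `apply_default_if` annotations covered in this section, assuming the call shape `<schema object>:apply_default(data)` and that the result is taken from the return value; the `cfg` schema is hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    host = schema.scalar({type = 'string', default = 'localhost'}),
    port = schema.scalar({type = 'integer', default = 3301}),
    tls = schema.scalar({
        type = 'boolean',
        default = true,
        -- The default is skipped when this function returns false.
        apply_default_if = function()
            return false
        end,
    }),
}))

-- `port` is present and is kept; `host` is missing and is filled in.
local res = s:apply_default({port = 3302})
print(res.host, res.port, res.tls)
```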
Annotations taken into account:
* `default` -- the value to be placed instead of a missing one
* `apply_default_if` (function) -- whether to apply the default
If there is no `apply_default_if` annotation, the default is assumed to be applied.
Nuances:
`:map()` for such scenarios.
<schema object>:merge()
Merge two hierarchical values (prefer the latter).
Important: the data is assumed to be already validated against the given schema. (A fast type check is performed on composite types, but it is not recommended to rely on it.)
`box.NULL` is preferred over `nil`, any `X` where `X ~= nil` is preferred over `nil`/`box.NULL`.
Records and maps are deeply merged. Scalars and arrays are all-or-nothing: the right-hand one is chosen if both are not `nil`/`box.NULL`.
The formal rules are below.
Let's define the merge result for `nil` and `box.NULL` values:
* `merge(nil, nil)` -> `nil`
* `merge(nil, box.NULL)` -> `box.NULL`
* `merge(box.NULL, nil)` -> `box.NULL`
* `merge(box.NULL, box.NULL)` -> `box.NULL`
Let's define `X` as a value that is not `nil` and is not `box.NULL`.
* `merge(X, nil)` -> `X`
* `merge(X, box.NULL)` -> `X`
* `merge(nil, X)` -> `X`
* `merge(box.NULL, X)` -> `X`
If the above conditions are not met, the following type-specific rules are in effect.
* `merge(<scalar A>, <scalar B>)` -> `<scalar B>`
* `merge(<array A>, <array B>)` -> `<array B>`
* `merge(<record A>, <record B>)` -> `deep-merge(A, B)`
* `merge(<map A>, <map B>)` -> `deep-merge(A, B)`
For each key `K` in `A` and each key `K` in `B`: `deep-merge(A, B)[K]` is `merge(A[K], B[K])`.
Nuances
* A scalar of the `any` type is NOT deeply merged even if it is a table, although it may be useful. We'll consider adding support for such behaviour in some backward-compatible way in the future.
* Arrays are not concatenated (the right-hand one wins), although it may be useful too. The original idea is that we don't know whether ordinals in the array are important, but practice shows that arrays in configuration data are always just sets with an order -- ordinals do not matter.
  Also, the given behavior (the right-hand one wins) allows discarding items from the left-hand side array, which may be useful too. A concatenation behavior wouldn't allow it.
  Anyway, implementing the concatenation behavior in some backward-compatible way may be considered in the future.
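The rules above can be sketched on a small example, assuming the call shape `<schema object>:merge(a, b)`; the `cfg` schema and the values are hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    log = schema.record({
        level = schema.scalar({type = 'string'}),
        file = schema.scalar({type = 'string'}),
    }),
    peers = schema.array({items = schema.scalar({type = 'string'})}),
}))

local a = {log = {level = 'info', file = 'a.log'}, peers = {'h1', 'h2'}}
local b = {log = {level = 'debug'}, peers = {'h3'}}

local res = s:merge(a, b)

-- Records are deeply merged: `level` comes from b, `file` from a.
print(res.log.level, res.log.file)
-- Arrays are all-or-nothing: the right-hand one wins.
print(#res.peers, res.peers[1])
```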
<schema object>:pairs()
Walk over the schema and return scalar, array and map schema nodes (all nodes except records).
Usage example:
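The original example is missing from this copy. A minimal sketch, assuming the iterator yields walkthrough nodes with the `path` and `schema` fields; the `cfg` schema is hypothetical:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    name = schema.scalar({type = 'string'}),
    db = schema.record({
        port = schema.scalar({type = 'integer'}),
    }),
}))

-- Collect the type of every non-record node by its dot-notation path.
local seen = {}
for _, w in s:pairs() do
    seen[table.concat(w.path, '.')] = w.schema.type
end

print(seen['name'], seen['db.port'], seen['db'])
```

The `db` record is skipped, so only its nested scalar shows up in the result.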
Parse an environment variable
Parse data from an environment variable as a value of the given type.
Important: the result is not necessarily valid against the given schema node. It should be validated using the `<schema object>:validate()` method before further processing.
* `env_var_name` is used for error messages.
* `raw_value` is to be received using `os.getenv()` or `os.environ()`.
* `schema` is a schema node, not a schema object.
How the raw value is parsed depends on the schema node type.
Scalars
* `string` -- return the raw value as is
* `number` -- attempt to parse a number using `tonumber()`, fail if unsuccessful
* `string, number` (and its alias `number, string`) -- attempt to parse a number using `tonumber()`; if it is unsuccessful, return the raw value as is
* `integer` -- attempt to parse an integral number using `tonumber64()`, fail if unsuccessful
* `boolean` -- accept `0`/`1` and `true`/`false` (case insensitively), fail on other values
* `any` -- parse as JSON, fail if the decoding fails
Record
Not supported.
It is technically possible to implement parsing of records similarly to how it is done for maps, but it is not implemented yet.
Map
Accepts two formats:
* JSON format (if the raw value starts with `{`).
* Simple `foo=bar,baz=fiz` object format (otherwise).
The simple format is applicable for a map with string keys and scalar values of all types except `any`.
In the simple format, the field values are parsed according to their type in the schema (see the rules for scalars above).
Array
Accepts two formats:
* JSON format (if the raw value starts with `[`).
* Simple `foo,bar,baz` array format (otherwise).
The simple format is applicable for an array with scalar item values of all types except `any`.
In the simple format, the item values are parsed according to their type in the schema (see the rules for scalars above).
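The parsing rules above can be sketched as follows, assuming the call shape `schema.fromenv(env_var_name, raw_value, schema_node)`; the variable names and raw values are hypothetical and are passed directly instead of being read with `os.getenv()`:

```lua
local schema = require('experimental.config.utils.schema')

local str = schema.scalar({type = 'string'})

-- Scalars.
local num = schema.fromenv('MYVAR', '3301', schema.scalar({type = 'number'}))
local flag = schema.fromenv('MYFLAG', 'TRUE', schema.scalar({type = 'boolean'}))

-- Simple array and map formats.
local arr = schema.fromenv('MYLIST', 'foo,bar,baz',
    schema.array({items = str}))
local map = schema.fromenv('MYMAP', 'foo=bar,baz=fiz',
    schema.map({key = str, value = str}))

-- A value that can't be parsed as a number raises an error.
local ok = pcall(schema.fromenv, 'MYVAR', 'abc',
    schema.scalar({type = 'number'}))

print(num, flag, arr[2], map.baz, ok)
```

The third argument is a schema node, so no `schema.new()` call is involved here.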
Computed annotations
The idea is to have information from the ancestor nodes accessible from the given schema node.
Example:
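The original example is missing from this copy. A hedged reconstruction of the idea: the `kind` annotation, the `cfg` schema and the port rule are hypothetical, and the sketch assumes the inherited annotations are reachable as `w.schema.computed.annotations`:

```lua
local schema = require('experimental.config.utils.schema')

local s = schema.new('cfg', schema.record({
    listen = schema.record({
        port = schema.scalar({
            type = 'integer',
            validate = function(data, w)
                -- `kind` is not set on this node: it is inherited
                -- from the outermost record.
                local kind = w.schema.computed.annotations.kind
                if kind == 'public' and data < 1024 then
                    w.error('a public endpoint must not use a ' ..
                        'privileged port')
                end
            end,
        }),
    }),
}, {
    kind = 'public',
}))

local ok_high = pcall(s.validate, s, {listen = {port = 3301}})
local ok_low = pcall(s.validate, s, {listen = {port = 80}})
print(ok_high, ok_low)
```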
The example demonstrates how the `kind` annotation from the outermost schema node is used in a `validate` function of a nested schema node.
The `schema.new` call prepares each schema node in such a way that the `computed.annotations` field contains all the annotations merged from the root schema node down to the given one. If the same annotation is present in an ancestor node and in a descendant node, the latter is preferred.
There are two classes of schema node table fields that are not considered as annotations to merge into the `computed.annotations` field:
* `type`, `fields`, `key`, `value`, `items`
* `allowed_values`, `validate`, `default`, `apply_default_if`
Definition of done
This is a relatively large topic, and it seems logical to split the 'document everything' goal into several subtasks.
`schema.fromenv` is documented
Planning checklist