@@ -18,75 +19,81 @@ This guide does not provide guidance on optimizing for ingest pipeline performan
18
19
19
20
When creating ingest pipelines, there are a few options for accessing fields in conditional statements and scripts. All formats can be used to reference fields, so choose the one that makes your pipeline easier to read and maintain.
20
21
21
-
| Notation | Example | Notes |
22
-
|---|---|---|
23
-
| Dot notation |`ctx.event.action`| Supported in conditionals and painless scripts. |
24
-
| Square bracket notation |`ctx['event']['action']`| Supported in conditionals and painless scripts. |
25
-
| Mixed dot and bracket notation |`ctx.event['action']`| Supported in conditionals and painless scripts. |
| Dot notation |`ctx.event.action`| Supported in conditionals and painless scripts. |
25
+
| Square bracket notation |`ctx['event']['action']`| Supported in conditionals and painless scripts. |
26
+
| Mixed dot and bracket notation |`ctx.event['action']`| Supported in conditionals and painless scripts. |
27
+
| Field API |`field('event.action', '')` or `$('event.action','')`| Supported in conditionals and painless scripts. Only available in versions 9.2 and later. |
28
+
| Field API |`field('event.action', '')` or `$('event.action','')`| In versions before 9.2, supported only in painless scripts. |
26
29
27
30
Below are some general guidelines for choosing the right option in a situation.
28
31
32
+
### Field API
33
+
34
+
Starting with version [9.2](https://github.com/elastic/elasticsearch/pull/131581), the field API can be used in conditionals (the `if` statement of a processor). You can always use the field API in the script processor itself.
35
+
36
+
:::{note}
37
+
This is the preferred way to access fields.
38
+
:::
39
+
40
+
**Benefits**
41
+
42
+
- Clean and easy to read
43
+
- Handles null values automatically
44
+
- Adds support for additional functions like `isEmpty()` to ease comparisons.
45
+
- Handles dots that are part of a field name.
46
+
- Handles dots that walk through nested objects (object notation).
47
+
- Handles special characters.
48
+
49
+
**Limitations**
50
+
51
+
- Only available starting in 9.2 for conditionals.
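
A minimal sketch of the field API in a conditional, assuming a cluster on version 9.2 or later. The `set` processor, the `event.category` target field, and the sample documents are illustrative assumptions and are not taken from this guide.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "$('event.action', '') == 'login'",
          "field": "event.category",
          "value": "authentication"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "event": { "action": "login" } } },
    { "_source": { "message": "no event object here" } }
  ]
}
```

For the second document, the fallback value `''` is used because `event.action` is missing, so the condition should evaluate to `false` and the processor should be skipped rather than fail.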
52
+
29
53
### Dot notation [dot-notation]
30
54
31
-
**Benefits:**
32
-
* Clean and easy to read.
33
-
* Supports null safety operations `?`. Read more in [Use null safe operators (`?.`)](#null-safe-operators).
55
+
**Benefits**
56
+
57
+
- Clean and easy to read.
58
+
- Supports null safety operations `?`. Read more in [Use null safe operators (`?.`)](#null-safe-operators).
34
59
35
60
**Limitations**
36
-
* Does not support field names that contain a `.` or any special characters such as `@`.
61
+
62
+
- Does not support field names that contain a `.` or any special characters such as `@`.
37
63
Use [Bracket notation](#bracket-notation) instead.
38
64
39
65
### Bracket notation [bracket-notation]
40
66
41
-
**Benefits:**
42
-
* Supports special characters such as `@` in the field name.
67
+
**Benefits**
68
+
69
+
- Supports special characters such as `@` in the field name.
43
70
For example, if there's a field name called `has@!%&chars`, you would use `ctx['has@!%&chars']`.
44
-
* Supports field names that contain `.`.
71
+
- Supports field names that contain `.`.
45
72
For example, if there's a field named `foo.bar`, `ctx.foo.bar` tries to access the field `bar` in the object `foo` in the object `ctx`, while `ctx['foo.bar']` accesses the field directly.
46
73
47
-
**Limitations:**
48
-
* Slightly more verbose than dot notation.
49
-
* No support for null safety operations `?`.
74
+
**Limitations**
75
+
76
+
- Slightly more verbose than dot notation.
77
+
- No support for null safety operations `?`.
50
78
Use [Dot notation](#dot-notation) instead.
51
79
52
80
### Mixed dot and bracket notation
53
81
54
-
**Benefits:**
55
-
* You can also mix dot notation and bracket notation to take advantage of the benefits of both formats.
82
+
**Benefits**
83
+
84
+
- You can also mix dot notation and bracket notation to take advantage of the benefits of both formats.
56
85
For example, you could use `ctx.my.nested.object['has@!%&chars']`. Then you can use the `?` operator on the fields using dot notation while still accessing a field with a name that contains special characters: `ctx.my?.nested?.object['has@!%&chars']`.
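
The following sketch shows mixed notation in a processor condition. The `set` processor, the `tagged` field, and the sample document are hypothetical. The explicit `!= null` check before the bracket access is an extra safeguard added here, on the assumption that bracket access on a missing object would otherwise fail.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "ctx.my?.nested?.object != null && ctx.my.nested.object['has@!%&chars'] == 'yes'",
          "field": "tagged",
          "value": true
        }
      }
    ]
  },
  "docs": [
    { "_source": { "my": { "nested": { "object": { "has@!%&chars": "yes" } } } } },
    { "_source": { "my": { "other": "value" } } }
  ]
}
```
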
Use conditionals (`if` statements) to ensure that an ingest pipeline processor is only applied when specific conditions are met.
65
94
66
95
% In an ingest pipeline, when working with conditionals inside processors. The topic around error processing is a bit more complex, most importantly any errors that are coming from null values, missing keys, missing values, inside the conditional, will lead to an error that is not captured by the `ignore_failure` handler and will exit the pipeline.
67
96
68
-
### Avoid excessive OR conditions
69
-
70
-
When using the [boolean OR operator](elasticsearch://reference/scripting-languages/painless/painless-operators-boolean.md#boolean-or-operator) (`||`), `if` conditions can become unnecessarily complex and difficult to maintain, especially when chaining many OR checks. Instead, consider using array-based checks like `.contains()` to simplify your logic and improve readability.
71
-
72
-
#### **Don't**: Run many ORs
This example only checks for exact matches. Do not use this approach if you need to check for partial matches.
88
-
:::
89
-
90
97
### Use null safe operators (`?.`) [null-safe-operators]
91
98
92
99
Anticipate potential problems with the data, and use the [null safe operator](elasticsearch://reference/scripting-languages/painless/painless-operators-reference.md#null-safe-operator) (`?.`) to prevent data from being processed incorrectly.
@@ -266,6 +273,29 @@ POST _ingest/pipeline/_simulate
266
273
}
267
274
}
268
275
```
276
+
277
+
:::
278
+
279
+
### Avoid excessive OR conditions
280
+
281
+
When using the [boolean OR operator](elasticsearch://reference/scripting-languages/painless/painless-operators-boolean.md#boolean-or-operator) (`||`), `if` conditions can become unnecessarily complex and difficult to maintain, especially when chaining many OR checks. Instead, consider using array-based checks like `.contains()` to simplify your logic and improve readability.
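
As a brief sketch of the array-based approach, the condition below replaces a chain such as `ctx.event?.action == 'login' || ctx.event?.action == 'logout' || ...` with a single `.contains()` call on a Painless list literal. The action values, the `set` processor, and the target field are illustrative assumptions.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "if": "['login', 'logout', 'password_change'].contains(ctx.event?.action)",
          "field": "event.category",
          "value": "authentication"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "event": { "action": "logout" } } },
    { "_source": { "event": { "action": "file_read" } } }
  ]
}
```
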
282
+
283
+
#### **Don't**: Run many ORs
This example only checks for exact matches. Do not use this approach if you need to check for partial matches.
269
299
:::
270
300
271
301
## Convert mb/gb values to bytes
@@ -345,10 +375,78 @@ The [rename processor](elasticsearch://reference/enrich-processor/rename-process
345
375
-`ignore_missing`: Useful when you are not sure that the field you want to rename exists.
346
376
-`ignore_failure`: Helps with any failures encountered. For example, the rename processor can only rename to non-existing fields. If you already have the field `abc` and you want to rename `def` to `abc`, the operation will fail.
347
377
348
-
## Use a script processor
378
+
## Script processor
349
379
350
380
If no built-in processor can achieve your goal, you may need to use a [script processor](elasticsearch://reference/enrich-processor/script-processor.md) in your ingest pipeline. Be sure to write scripts that are clear, concise, and maintainable.
351
381
382
+
### Add new fields
383
+
384
+
All of the ways to [access fields](#access-fields) discussed above also apply within the script context. [Null handling](#null-safe-operators) is still important when accessing fields.
385
+
386
+
:::{tip}
387
+
The field API is the recommended way to add new fields.
388
+
:::
389
+
390
+
**Field API**
391
+
Suppose we receive the field `cpu.usage` and want to rename it to `system.cpu.total.norm.pct`, which uses a scale from 0 to 1.0, where 1 is the equivalent of 100%.
1. The target field expects a value from 0 to 1, not 0 to 100, so we divide by 100 to get the right representation.
419
+
2. The `field` API is exposed as `field(<field name>)`, and `set(<value>)` sets the value. Inside `set`, we use `$(<field name>, fallback)` to read the value of the existing field. Lastly, we divide by `100.0`. The `.0` is important: without it, the script performs integer division and returns 0 instead of 0.9 (for example, for an input of 90).
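
The two notes above can be sketched as a single script processor. This is a minimal reconstruction from the description, not the guide's exact example: the sample document and the fallback value `0.0` are assumptions, and the sketch leaves the original `cpu.usage` field in place.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "source": "field('system.cpu.total.norm.pct').set($('cpu.usage', 0.0) / 100.0);"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "cpu": { "usage": 90 } } }
  ]
}
```

For the sample value of 90, `system.cpu.total.norm.pct` is expected to be 0.9.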
420
+
421
+
**No field API**
422
+
The same result can be achieved without the field API. However, it takes much more code, because we have to make sure that every object along the path `system.cpu.total.norm.pct` exists before we can assign to it.
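
A sketch of what this path walking might look like, assuming `cpu.usage` arrives as a nested object (`cpu` containing `usage`) rather than as a single field with a dot in its name. The sample document is hypothetical, and the triple-quoted script uses Kibana Console syntax.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "source": """
            // Only act when the source value is present.
            if (ctx.cpu?.usage != null) {
              // Create each missing object along the target path.
              if (ctx.system == null) { ctx.system = [:]; }
              if (ctx.system.cpu == null) { ctx.system.cpu = [:]; }
              if (ctx.system.cpu.total == null) { ctx.system.cpu.total = [:]; }
              if (ctx.system.cpu.total.norm == null) { ctx.system.cpu.total.norm = [:]; }
              // Divide by 100.0 (not 100) to avoid integer division.
              ctx.system.cpu.total.norm.pct = ctx.cpu.usage / 100.0;
            }
          """
        }
      }
    ]
  },
  "docs": [
    { "_source": { "cpu": { "usage": 90 } } }
  ]
}
```
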
@@ -405,6 +504,7 @@ POST _ingest/pipeline/_simulate
405
504
}
406
505
}
407
506
```
507
+
408
508
1. Ensure the `event` object exists before assigning to it.
409
509
2. Use `DateTimeFormatter` and `LocalTime` to parse the duration string.
410
510
3. Store the duration in nanoseconds, as expected by ECS.
@@ -428,14 +528,14 @@ When reconstructing or normalizing IP addresses in ingest pipelines, avoid unnec
428
528
}
429
529
}
430
530
```
531
+
431
532
1. Uses square bracket notation for field access instead of dot notation.
432
533
2. Unnecessary casting to `Integer` when parsing string segments.
433
534
3. Allocates an extra variable for the IP string instead of setting the field directly.
434
535
4. Does not check if `destination` is available as an object.
435
536
436
537
#### **Do**: Use concise, readable, and safe scripting
437
538
438
-
439
539
```json
440
540
POST _ingest/pipeline/_simulate
441
541
{
@@ -463,6 +563,7 @@ POST _ingest/pipeline/_simulate
463
563
}
464
564
}
465
565
```
566
+
466
567
1. Uses dot notation for field access.
467
568
2. Avoids unnecessary casting and extra variables.
468
569
3. Uses the null safe operator (`?.`) to check for field existence.
@@ -546,3 +647,31 @@ POST _ingest/pipeline/_simulate
546
647
```
547
648
548
649
In this example, `{{tags.0}}` retrieves the first element of the `tags` array (`"cool-host"`) and assigns it to the `host.alias` field. This approach is necessary when you want to extract a specific value from an array for use elsewhere in your document. Using the correct index ensures you get the intended value, and this pattern works for any array field in your source data.
650
+
651
+
### Transform into a JSON string
652
+
653
+
When you need to store the original `_source` in a field such as `event.original`, you can use the mustache function `{{#toJson}}<field>{{/toJson}}`.
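
A minimal sketch of this pattern with a `set` processor. It assumes that `_source` can be referenced by name inside the template, and the sample document is hypothetical.

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "field": "event.original",
          "value": "{{#toJson}}_source{{/toJson}}"
        }
      }
    ]
  },
  "docs": [
    { "_source": { "host": { "name": "cool-host" }, "message": "hello" } }
  ]
}
```

If this works as assumed, `event.original` should contain the document's source serialized as a JSON string.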