Commit 10d149a

expanding example for split string processor (#11246) (#11498)
1 parent 45355a7 commit 10d149a

1 file changed, 83 insertions(+), 16 deletions(-)

_data-prepper/pipelines/configuration/processors/split-string.md

Lines changed: 83 additions & 16 deletions
@@ -29,36 +29,103 @@ source | N/A | N/A | The key to split.
 delimiter | No | N/A | The separator character responsible for the split. Cannot be defined at the same time as `delimiter_regex`. At least `delimiter` or `delimiter_regex` must be defined.
 delimiter_regex | No | N/A | The regex string responsible for the split. Cannot be defined at the same time as `delimiter`. At least `delimiter` or `delimiter_regex` must be defined.
 
-### Usage
+### Example
 
 To get started, create the following `pipeline.yaml` file:
 
 ```yaml
-pipeline:
+split-string-all-configs-pipeline:
   source:
-    file:
-      path: "/full/path/to/logs_json.log"
-      record_type: "event"
-      format: "json"
+    http:
+      path: /logs
+      ssl: false
+
   processor:
     - split_string:
+        # 1) The top-level list of split "entries"
         entries:
-          - source: "message"
+          # 2) Use `source` + `delimiter` (comma)
+          - source: csv_line
             delimiter: ","
+
+          # 3) Another `source` + `delimiter` (pipe)
+          - source: tags
+            delimiter: "|"
+
+          # 4) `source` + `delimiter` (slash) to split a path
+          - source: path
+            delimiter: "/"
+
+          # 5) `source` + `delimiter_regex` (semicolon + optional spaces)
+          - source: semicolons
+            delimiter_regex: ";\\s*"
+
   sink:
-    - stdout:
+    - opensearch:
+        hosts: ["https://opensearch:9200"]
+        insecure: true
+        username: admin
+        password: admin_pass
+        index_type: custom
+        index: split-string-demo-%{yyyy.MM.dd}
+
 ```
 {% include copy.html %}
 
-Next, create a log file named `logs_json.log`. After that, replace the `path` in the file source of your `pipeline.yaml` file with your file path. For more detailed information, see [Configuring OpenSearch Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).
+You can test the pipeline using the following command:
 
-Before you run Data Prepper, the source appears in the following format:
-
-```json
-{"message": "hello,world"}
+```bash
+curl -sS -X POST "http://localhost:2021/logs" \
+  -H "Content-Type: application/json" \
+  -d '[
+    {
+      "csv_line": "x,y",
+      "tags": "beta|test",
+      "path": "usr/local/bin",
+      "semicolons": "alpha;beta ; gamma"
+    }
+  ]'
 ```
-After you run Data Prepper, the source is converted to the following format:
+{% include copy.html %}
+
+The document stored in OpenSearch contains the following information:
 
 ```json
-{"message":["hello","world"]}
-```
+{
+  ...
+  "hits": {
+    "total": {
+      "value": 1,
+      "relation": "eq"
+    },
+    "max_score": 1,
+    "hits": [
+      {
+        "_index": "split-string-demo-2025.10.15",
+        "_id": "YSAz6JkBrcmuDURMmTeo",
+        "_score": 1,
+        "_source": {
+          "csv_line": [
+            "x",
+            "y"
+          ],
+          "tags": [
+            "beta",
+            "test"
+          ],
+          "path": [
+            "usr",
+            "local",
+            "bin"
+          ],
+          "semicolons": [
+            "alpha",
+            "beta ",
+            "gamma"
+          ]
+        }
+      }
+    ]
+  }
+}
+```
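
The `_source` values in the new example follow from plain string splitting, which can be checked offline. The following is a minimal Python sketch, not Data Prepper's implementation: `split_entry` is a hypothetical helper written for this illustration, and it assumes that `delimiter` is matched literally and `delimiter_regex` is applied as a regular expression. Edge cases such as empty trailing fields may behave differently in the real processor.

```python
import re


def split_entry(event, source, delimiter=None, delimiter_regex=None):
    """Hypothetical stand-in for one `entries` item of the split_string processor."""
    # The option table says the two settings are mutually exclusive
    # and that at least one of them must be defined.
    if (delimiter is None) == (delimiter_regex is None):
        raise ValueError("define exactly one of delimiter or delimiter_regex")
    value = event[source]
    if delimiter is not None:
        # Assumption: a literal split on the delimiter string.
        event[source] = value.split(delimiter)
    else:
        # Assumption: a regex split on the pattern.
        event[source] = re.split(delimiter_regex, value)
    return event


# Same payload as the curl request in the example.
event = {
    "csv_line": "x,y",
    "tags": "beta|test",
    "path": "usr/local/bin",
    "semicolons": "alpha;beta ; gamma",
}

split_entry(event, "csv_line", delimiter=",")
split_entry(event, "tags", delimiter="|")
split_entry(event, "path", delimiter="/")
# The YAML double-quoted string ";\\s*" denotes the pattern ;\s*
split_entry(event, "semicolons", delimiter_regex=r";\s*")

print(event)
```

The pattern `;\s*` consumes whitespace only after each semicolon, which is why the second element keeps its trailing space (`"beta "`) in the stored document. Under the same assumption, a pattern such as `\s*;\s*` would trim both sides.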
