Description
🐛 Bug Report
When a bulk helper call combines a delete action with other actions (e.g. update) and one of the actions fails, the onDrop() callback may receive incorrect information about the failing operation/document, or a deserialization error may occur.
Detailed explanation with background
The tryBulk() function in the bulk helper makes a bulk API call with multiple operations and documents (contained in bulkBody), and then loops over the response items. When a response item contains an error, that item is matched with the corresponding request item from bulkBody to pass the problematic operation/document to the onDrop callback.
elasticsearch-js/src/helpers.ts
Line 811 in 4ebffbc
To map the erroring response item to the request item, a variable named indexSlice is derived from the loop counter over the response items. The calculation appears to be incorrect when delete operations are mixed with other operations.
elasticsearch-js/src/helpers.ts
Line 833 in 4ebffbc
For example, if we create a bulk request with a delete and an update operation, then the request body will be something like this (1 item for delete, 2 items for update):
{ delete: { _index: '...', _id: '...' } }
{ update: { _index: '...', _id: '...' } }
{ doc: { ... } }
The response to this will contain only two items, one for each operation. Let's say that the update failed, in which case the second item in the response (loop counter 1) is going to contain an error. To map it to the request item, indexSlice is derived as indexSlice = 1*2 = 2. Now, when gathering the operation and document from bulkBody to return error information through onDrop(), it is done as such:
elasticsearch-js/src/helpers.ts
Lines 849 to 853 in 4ebffbc
Which for the current example will be:
operation: serializer.deserialize(bulkBody[2]),
document: serializer.deserialize(bulkBody[3])
This is incorrect, as bulkBody[2] contains the document instead of the operation, and bulkBody[3] does not exist at all. The latter causes a deserialization error.
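To make the arithmetic concrete, here is a minimal standalone sketch of the lookup for this example. It does not call the client. The function naiveIndexSlice is a hypothetical name, and its body is my assumption about the calculation (loop counter times two for operations that carry a document), based only on the behaviour described in this report.

```typescript
// Serialized bulk body for the example: one line for delete, two for update.
const bulkBody: string[] = [
  JSON.stringify({ delete: { _index: 'example-index', _id: '1' } }),
  JSON.stringify({ update: { _index: 'example-index', _id: '2' } }),
  JSON.stringify({ doc: { name: 'this doc should not exist' } })
]

// Hypothetical stand-in for the helper's calculation (an assumption, not the
// library source): operations with a document are assumed to take two lines
// each, so the response position is simply doubled.
function naiveIndexSlice (responsePosition: number, operation: string): number {
  return operation !== 'delete' ? responsePosition * 2 : responsePosition
}

// The failed update is the second response item, i.e. position 1.
const indexSlice = naiveIndexSlice(1, 'update')

// What would be handed to the deserializer as "operation" and "document".
const operationLine: string | undefined = bulkBody[indexSlice]
const documentLine: string | undefined = bulkBody[indexSlice + 1]

console.log(indexSlice)    // 2
console.log(operationLine) // the { doc: ... } line, not the update action
console.log(documentLine)  // undefined, since bulkBody has only three lines
```

The doubling only holds when every preceding operation occupies two lines, which a preceding delete breaks.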
If there had been an additional successful action after the update, the deserialization error wouldn't have occurred, but indexSlice would still be shifted by one and incorrect information would be returned in the operation and document fields of onDrop().
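One possible way to avoid the shift is to walk bulkBody once and record where each operation line starts, advancing by one line for delete and by two lines for any other action. The sketch below is only an illustration under that assumption. operationOffsets is a hypothetical helper, not part of the library, and it parses lines with JSON.parse instead of the client's serializer.

```typescript
// Hypothetical helper: for each response item position, return the index of
// the matching operation line in the serialized bulk body.
function operationOffsets (body: string[]): number[] {
  const offsets: number[] = []
  let cursor = 0
  while (cursor < body.length) {
    offsets.push(cursor)
    const operation = Object.keys(JSON.parse(body[cursor] as string))[0]
    // A delete occupies a single line; index, create and update are each
    // followed by one more line holding the document or partial doc.
    cursor += operation === 'delete' ? 1 : 2
  }
  return offsets
}

// Example body: delete, update (+ doc), index (+ source).
const exampleBody: string[] = [
  JSON.stringify({ delete: { _index: 'example-index', _id: '1' } }),
  JSON.stringify({ update: { _index: 'example-index', _id: '2' } }),
  JSON.stringify({ doc: { name: 'this doc should not exist' } }),
  JSON.stringify({ index: { _index: 'example-index', _id: '3' } }),
  JSON.stringify({ name: 'Doc3' })
]

const offsets = operationOffsets(exampleBody)
console.log(offsets) // [ 0, 1, 3 ]

// The failed update is response item 1, so its operation line is at offsets[1]
// and its document line is the one right after it.
const failedOperation = JSON.parse(exampleBody[offsets[1] as number] as string)
const failedDocument = JSON.parse(exampleBody[(offsets[1] as number) + 1] as string)
console.log(failedOperation, failedDocument)
```

With such a lookup, response item i maps to offsets[i] regardless of how many delete operations precede it.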
To Reproduce
Steps to reproduce the behavior:
Here I provide an example of the case I described above, where the goal is to make a single bulk call with a valid delete operation and an erroneous update operation.
- Create a document
- Perform a bulk call with a delete action for the previously created document and an update action for a non-existing document
const myIndex = 'example-index'
// Create one document
await client.helpers.bulk(
{
datasource: [
[
{ update: { _index: myIndex, _id: '1' } },
{
doc: {
name: "Doc1"
},
doc_as_upsert: "true"
},
],
],
onDocument: (action) => action,
},
);
// Delete the previously created document and try to edit a non-existing document
await client.helpers.bulk(
{
datasource: [
{ delete: { _index: myIndex, _id: '1' } },
[
{ update: { _index: myIndex, _id: '2' } },
{
doc: {
name: "this doc should not exist"
},
},
],
],
onDocument: (action) => action,
onDrop: (failure) => {
console.log(failure)
},
},
);
This results in a deserialization error:
/Users/.../node_modules/@elastic/transport/lib/Serializer.js:63
throw new errors_1.DeserializationError(err.message, json);
^
DeserializationError: Unexpected token u in JSON at position 0
at Serializer.deserialize (/Users/.../node_modules/@elastic/transport/lib/Serializer.js:63:19)
at /Users/.../node_modules/@elastic/elasticsearch/lib/helpers.js:727:54
at processTicksAndRejections (node:internal/process/task_queues:94:5) {
data: undefined
}
Expected behavior
The code should run without any errors, and information about the failed update action should be logged in onDrop.
Your Environment
- node version: v15.14.0
- @elastic/elasticsearch version: 8.2.1
- os: macOS 12.3.1