Issue with the bulk helper handling errors when a delete action is combined with other actions #1751

Closed
@karlriis

🐛 Bug Report

When a bulk helper call combines a delete action with other actions (e.g. update) and one of the actions fails, the onDrop() callback can receive incorrect information about the failing operation/document, or a deserialization error can be thrown.

Detailed explanation with background

The tryBulk() function in the bulk helper makes a bulk API call with multiple operations and documents (contained in bulkBody), and then loops over the response items. When a response item contains an error, that item is matched with the corresponding request item in bulkBody so that the problematic operation/document can be passed to the onDrop callback.

function tryBulk (bulkBody: string[], callback: (err: Error | null, bulkBody: string[]) => void): void {

To map the erroring response item to the request item, a variable named indexSlice is derived from the loop counter over the response items. This calculation seems incorrect when delete operations are mixed with other operations.

const indexSlice = operation !== 'delete' ? i * 2 : i

For example, if we create a bulk request with a delete and an update operation, then the request body will be something like this (1 item for delete, 2 items for update):

{ delete: { _index: '...', _id: '...' } }
{ update: { _index: '...', _id: '...' } }
{ doc: { ... } }

The response to this will contain only two items, one for each operation. Let's say that the update failed, in which case the second item in the response (loop counter i = 1) is going to contain an error. To map it to the request item, indexSlice is derived as 1 * 2 = 2. The operation and document are then read from bulkBody and returned as error information through onDrop() like this:

operation: serializer.deserialize(bulkBody[indexSlice]),
// @ts-expect-error
document: operation !== 'delete'
? serializer.deserialize(bulkBody[indexSlice + 1])
: null,

Which for the current example will be:

operation: serializer.deserialize(bulkBody[2]),
document: serializer.deserialize(bulkBody[3])

This is incorrect as bulkBody[2] contains the document instead of the operation, and bulkBody[3] does not exist at all. The latter causes a deserialization error.
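
The arithmetic above can be checked without a cluster. The following is a minimal sketch: currentIndexSlice copies the formula quoted earlier, while tryParse is a hypothetical stand-in for serializer.deserialize, assuming it behaves like JSON.parse for these lines.

```typescript
// Serialized request lines from the example: 1 line for delete, 2 for update.
const bulkBody: string[] = [
  JSON.stringify({ delete: { _index: 'example-index', _id: '1' } }),
  JSON.stringify({ update: { _index: 'example-index', _id: '2' } }),
  JSON.stringify({ doc: { name: 'this doc should not exist' } })
]

// Formula quoted from the helper: position of the action line for response item i.
function currentIndexSlice (operation: string, i: number): number {
  return operation !== 'delete' ? i * 2 : i
}

// Hypothetical stand-in for serializer.deserialize (assumed to act like JSON.parse).
// Returns the thrown error instead of propagating it, so both lookups can be inspected.
function tryParse (line: string | undefined): unknown {
  try {
    return JSON.parse(line as string)
  } catch (err) {
    return err
  }
}

// The failing update is the second response item, so i = 1.
const indexSlice = currentIndexSlice('update', 1)

// bulkBody[2] is the { doc: ... } line, not the update action line.
const droppedOperation = tryParse(bulkBody[indexSlice])

// bulkBody[3] is undefined, so parsing it throws.
const droppedDocument = tryParse(bulkBody[indexSlice + 1])

console.log({ indexSlice, droppedOperation, droppedDocument })
```

The formula only lands on an action line when every preceding action occupies two lines, which stops being true as soon as a delete comes earlier in the body.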

If there had been an additional successful action after the update, the deserialization error would not have occurred, but indexSlice would still be shifted by one, and onDrop() would receive incorrect values in the operation and document fields.
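
One possible way to avoid the shift is to derive each response item's position by walking bulkBody and counting one line for delete and two lines for every other action. This is only a sketch of that idea, not the library's actual fix; actionOffsets is a hypothetical helper name and sampleBody is made-up illustration data.

```typescript
// Hypothetical helper, not part of @elastic/elasticsearch.
// Returns, for each response item, the position of its action line in bulkBody.
function actionOffsets (bulkBody: string[]): number[] {
  const offsets: number[] = []
  let pos = 0
  while (pos < bulkBody.length) {
    offsets.push(pos)
    // The action line has a single top-level key: index, create, update or delete.
    const action = Object.keys(JSON.parse(bulkBody[pos]))[0]
    // delete has no document line; the other actions are followed by one.
    pos += action === 'delete' ? 1 : 2
  }
  return offsets
}

// Illustration data: delete, update + doc, index + doc.
const sampleBody: string[] = [
  JSON.stringify({ delete: { _index: 'example-index', _id: '1' } }),
  JSON.stringify({ update: { _index: 'example-index', _id: '2' } }),
  JSON.stringify({ doc: { name: 'this doc should not exist' } }),
  JSON.stringify({ index: { _index: 'example-index', _id: '3' } }),
  JSON.stringify({ name: 'Doc3' })
]

const offsets = actionOffsets(sampleBody)

// Response item 1 (the update) maps to sampleBody[1], its document to sampleBody[2].
const updateAction = JSON.parse(sampleBody[offsets[1]])
const updateDocument = JSON.parse(sampleBody[offsets[1] + 1])

console.log(offsets, updateAction, updateDocument)
```

Parsing every action line again has a cost. An alternative under the same assumption would be to record the offsets while bulkBody is being built, so the response loop only needs a lookup.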

To Reproduce

Steps to reproduce the behavior:
Here I provide an example of the case I described above, where the goal is to make a single bulk call with a valid delete operation and an erroneous update operation.

  1. Create a document
  2. Perform a bulk call with a delete action for the previously created document and an update action for a non-existing document

const myIndex = 'example-index'
// Create one document
await client.helpers.bulk(
    {
        datasource: [
            [
                { update: { _index: myIndex, _id: '1' } },
                {
                    doc: {
                        name: "Doc1"
                    },
                    doc_as_upsert: true
                },
            ],

        ],
        onDocument: (action) => action,
    },
);

// Delete the previously created document and try to edit a non-existing document
await client.helpers.bulk(
    {
        datasource: [
            { delete: { _index: myIndex, _id: '1' } },
            [
                { update: { _index: myIndex, _id: '2' } },
                {
                    doc: {
                        name: "this doc should not exist"
                    },
                },
            ],

        ],
        onDocument: (action) => action,
        onDrop: (failure) => {
            console.log(failure)
        },
    },
);

This results in a deserialization error:

/Users/.../node_modules/@elastic/transport/lib/Serializer.js:63
            throw new errors_1.DeserializationError(err.message, json);
                  ^

DeserializationError: Unexpected token u in JSON at position 0
    at Serializer.deserialize (/Users/.../node_modules/@elastic/transport/lib/Serializer.js:63:19)
    at /Users/.../node_modules/@elastic/elasticsearch/lib/helpers.js:727:54
    at processTicksAndRejections (node:internal/process/task_queues:94:5) {
  data: undefined
}

Expected behavior

The code should run without errors, and onDrop should log information about the failed update action.

Your Environment

  • node version: v15.14.0
  • @elastic/elasticsearch version: 8.2.1
  • os: macOS 12.3.1
