Description
Since there is no max_retry configuration on helpers.parallel_bulk (#645), the default behavior seems to be to retry forever and never stop. This is a very surprising default and caused my batch processing script to run out of memory.
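For reference, the gap is easy to see by introspecting the two bulk helpers (just a quick check, assuming elasticsearch-py 7.x: streaming_bulk exposes retry settings, parallel_bulk does not):

import inspect

from elasticsearch import helpers

# Compare the accepted parameters of the two bulk helpers. On 7.x,
# streaming_bulk lists max_retries/initial_backoff/max_backoff while
# parallel_bulk does not (which is what #645 asks for).
print(inspect.signature(helpers.streaming_bulk))
print(inspect.signature(helpers.parallel_bulk))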
This situation is triggered when the Elasticsearch database runs out of storage space, which I reproduced easily by filling a default Elasticsearch Docker container with documents until my partition was full.
That led to the following error on a basic curl insertion:
curl -X POST "localhost:9200/test/_doc/?pretty" -H 'Content-Type: application/json' -d'
{
"key1" : 1,
"key2": "value2"
}
'
..... # Long wait here until I received any result
{
  "error" : {
    "root_cause" : [
      {
        "type" : "cluster_block_exception",
        "reason" : "blocked by: [SERVICE_UNAVAILABLE/2/no master];",
        "suppressed" : [
          {
            "type" : "master_not_discovered_exception",
            "reason" : null
          }
        ]
      }
    ],
    "type" : "cluster_block_exception",
    "reason" : "blocked by: [SERVICE_UNAVAILABLE/2/no master];",
    "suppressed" : [
      {
        "type" : "master_not_discovered_exception",
        "reason" : null
      }
    ]
  },
  "status" : 503
}
Since the parallel_bulk method retries on 503 responses, and it retries an unlimited number of times, this becomes a major issue. In my case I need to ingest a large number of small documents into ES on a periodic schedule. To do this quickly I increased the queue_size and chunk_size of the parallel_bulk call according to the Elasticsearch documentation. An optimal configuration for my workload looked like this:
for success, info in elasticsearch.helpers.parallel_bulk(
        elastic,
        documents,
        max_retries=2,
        queue_size=10,
        chunk_size=2000,
        raise_on_exception=True,
        raise_on_error=True):
    if not success:
        # Log any exception or database errors.
        self.logger.error("Failed to insert document: %s", info)
Despite having raise_on_exception and raise_on_error set to True, this call keeps consuming my iterator and filling up memory even though every single insertion attempt is stuck in infinite retries. In particular, I did not expect the iterator to continue being consumed in such a situation.
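As a stop-gap on my side, the only mitigation I can think of is to feed parallel_bulk bounded slices of the source iterator, so a stuck chunk can buffer at most one slice in memory. This is only a sketch with a made-up helper name and batch_size, and it does not fix the unlimited retries themselves:

import logging
from itertools import islice

import elasticsearch.helpers

logger = logging.getLogger(__name__)

def bulk_in_slices(elastic, documents, batch_size=20_000, **kwargs):
    # Pull at most batch_size actions from the source iterator per
    # parallel_bulk call, so memory use is capped even if a chunk gets
    # stuck retrying. kwargs (queue_size, chunk_size, ...) are forwarded.
    documents = iter(documents)
    while True:
        batch = list(islice(documents, batch_size))
        if not batch:
            break
        for success, info in elasticsearch.helpers.parallel_bulk(
                elastic, batch, **kwargs):
            if not success:
                logger.error("Failed to insert document: %s", info)

Called as bulk_in_slices(elastic, documents, queue_size=10, chunk_size=2000) it keeps the tuned settings from above.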
Environment:
Linux - Debian Stretch
Python 3.7 with pip install elasticsearch==7.1.0
Elasticsearch database running with docker run --rm -p 9200:9200 elasticsearch:7.4.2
Expected outcome
That the default configuration would retry a limited number of times, and that the iterator would stop being consumed until the insertion is either aborted or starts succeeding again. I would also very much appreciate getting #645 fixed so that we have control over the number of retries.
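In the meantime, the closest thing I have found is helpers.streaming_bulk, which does accept max_retries and, with raise_on_error / raise_on_exception set, stops instead of consuming the rest of the iterator; it is single-threaded, though, so it is slower than parallel_bulk. A minimal sketch (the index name and document generator are placeholders for my real data):

import logging

from elasticsearch import Elasticsearch, helpers

logger = logging.getLogger(__name__)
elastic = Elasticsearch(["localhost:9200"])

# Placeholder document stream; in my case this is a generator over a large
# number of small documents.
documents = ({"_index": "test", "key1": i, "key2": "value2"} for i in range(100_000))

# streaming_bulk bounds its own retries via max_retries, and a failing chunk
# raises instead of the helper silently consuming the rest of the iterator.
for success, info in helpers.streaming_bulk(
        elastic,
        documents,
        chunk_size=2000,
        max_retries=2,
        raise_on_exception=True,
        raise_on_error=True):
    if not success:
        logger.error("Failed to insert document: %s", info)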