Description
Describe the bug
We encountered an issue where the ingester (Prometheus TSDB) is unable to recover from the WAL because of an "invalid block sequence: block time ranges overlap" error. Repeatedly sending ingestion requests to this ingester results in OOM.
When an ingester starts up, it tries to recover the TSDB for every user ID. If a TSDB fails to recover, the ingester skips it and continues. When an ingestion request for the corrupted TSDB arrives, the ingester tries to create a TSDB for that user ID. Since a TSDB for this user already exists, this causes Prometheus to perform recovery once again.
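The flow described above can be sketched as a toy model. This is not Cortex code: `ToyIngester`, `TSDBOpenError`, and every method name here are hypothetical, and the "recovery" is simulated by a counter. It only illustrates why each push for a user whose TSDB failed to open triggers a fresh recovery attempt.

```python
class TSDBOpenError(Exception):
    """Raised when the simulated TSDB recovery fails (hypothetical)."""


class ToyIngester:
    """Toy model of the described behavior, not the real ingester."""

    def __init__(self, corrupted_users):
        self.corrupted_users = set(corrupted_users)
        self.open_dbs = {}       # user_id -> opened TSDB handle
        self.open_attempts = {}  # user_id -> number of recovery attempts

    def _open_tsdb(self, user_id):
        # Every call stands in for a full recovery (WAL replay etc.).
        self.open_attempts[user_id] = self.open_attempts.get(user_id, 0) + 1
        if user_id in self.corrupted_users:
            raise TSDBOpenError(f"user={user_id}: failed to open TSDB")
        return object()

    def startup(self, user_ids):
        for user_id in user_ids:
            try:
                self.open_dbs[user_id] = self._open_tsdb(user_id)
            except TSDBOpenError:
                continue  # skip the broken TSDB and carry on

    def push(self, user_id):
        # No handle was stored for the failed user, so recovery runs again.
        if user_id not in self.open_dbs:
            self.open_dbs[user_id] = self._open_tsdb(user_id)
        return "ok"
```

In this model, one startup plus N pushes for a corrupted user yields N + 1 recovery attempts, while a healthy user is opened only once.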
I noticed that, for the same request, ingesters log different errors:
level=warn ts=2020-10-01T19:36:19.800983863Z caller=grpc_logging.go:38 method=/cortex.Ingester/Push duration=28.647562844s err="user=some-user-id: failed to open TSDB: /data/tsdb/some-user-id: invalid block sequence: block time ranges overlap: [mint: 1601488821086, maxt: 1601490567894, range: 29m6s, blocks: 2]: <ulid: 01EKG4DJNVRS9KSPQZHVHKD07G, mint: 1601481600000, maxt: 1601490567894, range: 2h29m27s>, <ulid: 01EKGQHG7FP40NRB4SA0GFAV4E, mint: 1601488821086, maxt: 1601496000000, range: 1h59m38s>" msg="gRPC\n"
level=warn ts=2020-10-01T19:15:41.737350414Z caller=grpc_logging.go:38 method=/cortex.Ingester/Push duration=13m46.278392928s err="user=some-user-id: failed to open TSDB: /data/tsdb/some-user-id: mmap files, file: /data/tsdb/some-user-id/chunks_head/000036: mmap: cannot allocate memory" msg="gRPC\n
I see that the same chunks_head file is memory-mapped about 200 times. I also saw that an ingester can handle about 200 requests for this corrupted TSDB before going OOM. I suspect Prometheus memory-maps files during WAL replay, and when Prometheus fails, the mapping is not cleaned up.
/ # sysctl vm.max_map_count
vm.max_map_count = 65530
/ # pmap 1 | wc -l
65523
/ # pmap 1 | grep /data/tsdb/some-user-id/chunks_head/000036 | wc -l
236
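The same check can be scripted when pmap is not available. The sketch below assumes Linux, where /proc/<pid>/maps lists one mapping per line with the backing file path as the sixth field. The function names `count_file_mappings` and `read_maps` are made up for this example.

```python
from collections import Counter


def count_file_mappings(maps_text):
    """Count mappings per backing file in /proc/<pid>/maps content."""
    counts = Counter()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            # Keep only file-backed mappings; skip [heap], [stack], etc.
            if path.startswith("/"):
                counts[path] += 1
    return counts


def read_maps(pid):
    """Read the raw mapping table of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()
```

For example, `count_file_mappings(read_maps(1)).most_common(5)` would list the five most frequently mapped files of PID 1, which should surface a chunks_head file that is mapped hundreds of times.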
To Reproduce
Steps to reproduce the behavior:
- Start Cortex (v1.4.0-rc.1)
- Have a corrupted TSDB in ingester
- Perform write operations for the corrupted TSDB
- Ingester goes OOM
Expected behavior
Ingester should not go OOM
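One way to read this expectation is that a failed open should release whatever it mapped before returning the error. The sketch below shows that pattern with Python's mmap module. It is not the Prometheus implementation, and whether Prometheus actually leaks here is only suspected in this report. `open_with_cleanup`, `map_file`, `RecoveryError`, and the `validate` callback are hypothetical names.

```python
import mmap


class RecoveryError(Exception):
    """Stands in for errors such as overlapping block ranges (hypothetical)."""


def map_file(path):
    """Memory-map a non-empty file read-only."""
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def open_with_cleanup(chunk_paths, validate):
    """Map all chunk files, then validate; unmap everything on failure."""
    mapped = []
    try:
        for path in chunk_paths:
            mapped.append(map_file(path))
        validate(mapped)  # may raise, e.g. RecoveryError
    except Exception:
        # Release every mapping created so far so retries do not pile up.
        for m in mapped:
            m.close()
        raise
    return mapped
```

With this shape, repeated failing attempts leave the process's mapping count unchanged instead of growing toward vm.max_map_count.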
Environment:
- Infrastructure: AWS EKS
- Deployment tool: helm
Storage Engine
- Blocks
- Chunks
Additional Context