High CPU consumption on marshalling/unmarshalling ES json requests/responses #1410

@konradgaluszka

Hi,

During our performance tests we noticed that a large share of collector CPU time is spent marshalling and unmarshalling JSON sent to and received from Elasticsearch. I would like to know whether something is wrong with our setup or whether you see the same behaviour.

The output from profiling tool:
https://github.com/konradgaluszka/jaeger-test-results

Test setup:

  • 1-kilobyte spans from multiple pods, sent via a jaeger-agent sidecar using the gRPC protocol

  • The ES setup was more or less the minimum HA deployment in a hot-warm configuration (not a really large cluster)
    o 3 x hot node (12 CPU cores, 50 GB RAM)
    o 3 x warm node (12 CPU cores, 50 GB RAM) – you can disregard these, since no data was moved from hot to warm during the tests

  • Jaeger setup:
    o 2 x collector (7 CPUs, 32 GB memory)
    o 2 x query (1 CPU, 16 GB memory) – you can disregard the Query component, since it was not exercised during the tests

  • The maximum traffic handled was around 100,000 spans per second (100 MB/s), with resource utilization as follows:
    o 60 CPUs / 300 GB RAM for ES; 8 CPUs / 10 GB RAM for Collector

  • The setup above is one of several tests we ran (different span sizes, different numbers of collectors, and different ES configurations). We tuned the Collector arguments as shown below; es.bulk.size and es.bulk.actions had the most positive influence on performance.
    - '--collector.queue-size=8000000'
    - '--collector.num-workers=50'
    - '--es.tags-as-fields.all'
    - '--es.bulk.workers=48'
    - '--es.num-replicas=1'
    - '--es.num-shards=3'
    - '--es.bulk.size=50000000'
    - '--es.bulk.actions=10000'

Questions:
What do you think can be done to improve the performance? For example, would it be possible to change how the bulk response (a long list of acknowledgement objects from ES) is processed?
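To make the question about the bulk response concrete, here is a minimal sketch of one possible approach. It is not how Jaeger or its Elasticsearch client currently handles the response. It relies only on the documented shape of a bulk response (top-level `took`, `errors`, and `items`) and uses the standard library's `json.RawMessage` to defer decoding of `items` until `errors` is true. The names `bulkSummary`, `bulkItem`, and `failedItems` are hypothetical, and whether this actually reduces CPU usage would need to be confirmed with a profile, since the decoder still has to scan the raw bytes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkSummary holds only the top-level fields of a bulk response.
// Items stays as raw bytes, so no per-item objects are allocated up front.
type bulkSummary struct {
	Took   int             `json:"took"`
	Errors bool            `json:"errors"`
	Items  json.RawMessage `json:"items"`
}

// bulkItem is the subset of a per-action result needed to report a failure.
type bulkItem struct {
	Index  string          `json:"_index"`
	Status int             `json:"status"`
	Error  json.RawMessage `json:"error,omitempty"`
}

// failedItems returns the failed actions of a bulk response body.
// When the top-level "errors" flag is false, the items array is never decoded.
func failedItems(body []byte) ([]bulkItem, error) {
	var s bulkSummary
	if err := json.Unmarshal(body, &s); err != nil {
		return nil, err
	}
	if !s.Errors {
		return nil, nil // fast path: every action was acknowledged
	}
	// Slow path: each item is an object keyed by the action name ("index", "create", ...).
	var items []map[string]bulkItem
	if err := json.Unmarshal(s.Items, &items); err != nil {
		return nil, err
	}
	var failed []bulkItem
	for _, item := range items {
		for _, result := range item {
			if len(result.Error) > 0 {
				failed = append(failed, result)
			}
		}
	}
	return failed, nil
}

func main() {
	// Illustrative response body; the index name and error text are made up.
	body := []byte(`{"took":3,"errors":true,"items":[
		{"index":{"_index":"jaeger-span","status":201}},
		{"index":{"_index":"jaeger-span","status":429,"error":{"type":"es_rejected_execution_exception"}}}]}`)
	failed, err := failedItems(body)
	if err != nil {
		panic(err)
	}
	for _, f := range failed {
		fmt.Printf("failed action on %s with status %d\n", f.Index, f.Status)
	}
}
```

In the common case where every span is indexed successfully, only the three top-level fields are materialized; the full per-item decode is paid only for responses that report at least one failure.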
