[API server] handle logs request in coroutine #5366

aylei · 2025-04-25T10:01:59Z

This PR includes the minimal changes needed to move /logs handling to a coroutine:

introduce a coroutine context, which handles cancellation, log redirection and env var overrides;
run /logs in uvicorn's event loop.
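
To illustrate the log redirection part of the coroutine context, here is a minimal sketch of a contextual stdout built on `contextvars`. All names (`_current_output`, `ContextualStdout`, `tail_logs`) are hypothetical and are not taken from this PR's code; it only shows how two coroutines sharing one process can each capture their own output.

```python
import asyncio
import contextvars
import io
import sys

# Hypothetical names throughout; this is not SkyPilot's actual API.
_current_output = contextvars.ContextVar('current_output', default=None)


class ContextualStdout:
    """Proxy that writes to the per-task stream if set, else the real stdout."""

    def __init__(self, fallback):
        self._fallback = fallback

    def _target(self):
        target = _current_output.get()
        return self._fallback if target is None else target

    def write(self, data):
        return self._target().write(data)

    def flush(self):
        self._target().flush()


async def tail_logs(name, lines):
    buf = io.StringIO()
    # Each asyncio task runs in its own copy of the context, so this
    # assignment is invisible to other tasks.
    _current_output.set(buf)
    for i in range(lines):
        print(f'{name}: line {i}')
        await asyncio.sleep(0)  # yield so the two tasks interleave
    return buf.getvalue()


async def main():
    original = sys.stdout
    sys.stdout = ContextualStdout(original)
    try:
        return await asyncio.gather(tail_logs('a', 2), tail_logs('b', 2))
    finally:
        sys.stdout = original


if __name__ == '__main__':
    print(asyncio.run(main()))
```

Even though the two tasks interleave their `print` calls, each buffer contains only its own lines, because the target stream is looked up in the task's context on every write.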

Though the task is now executed directly in the uvicorn process, we still maintain a request record for each logs request to keep the behavior consistent: users can still cancel a logs request with sky api cancel and retrieve the log again with sky api logs.
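
One way cancellation by request ID could work for a coroutine-based handler is sketched below. The in-memory registry, `submit`, and `cancel` are hypothetical stand-ins for illustration; the real request record store is not described here.

```python
import asyncio

# Hypothetical in-memory registry mapping request IDs to running tasks.
_tasks = {}


async def stream_logs(request_id):
    try:
        while True:
            await asyncio.sleep(0.01)  # stand-in for reading a log chunk
    except asyncio.CancelledError:
        # A real handler would release resources here before re-raising.
        raise


def submit(request_id):
    """Start the log coroutine as a task and remember it by request ID."""
    task = asyncio.get_running_loop().create_task(stream_logs(request_id))
    _tasks[request_id] = task
    return task


def cancel(request_id):
    """Cancel the task for request_id; return False if it is unknown."""
    task = _tasks.pop(request_id, None)
    if task is None:
        return False
    task.cancel()
    return True


async def demo():
    task = submit('req-1')
    await asyncio.sleep(0.02)
    cancelled = cancel('req-1')
    try:
        await task
    except asyncio.CancelledError:
        pass
    return cancelled, task.cancelled()
```

Keeping a record per request is what lets an external command address a coroutine that otherwise has no process ID to signal.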

Follow ups:

same approach for sky jobs log
[API server] ctrl-c sky logs does not cancel the logs request on server #5165
make skypilot config contextual

Benchmark

Command: python tests/load_tests/test_load_on_server.py -n 100 --apis tail_logs -c kubernetes under low server concurrency, 1c2g machine (1 long workers + 2 short workers):

# This PR
All requests completed in 16.20 seconds

----------------------------------------------------------------------------------------------------
Kind                 Count    Total(s)   Avg(s)     Min(s)     Max(s)     P95(s)     P99(s)
----------------------------------------------------------------------------------------------------
API /tail_logs       100      1642.00    16.42      16.20      18.26      17.25      18.26

# Master
All requests completed in 229.30 seconds

Latency Statistics:
----------------------------------------------------------------------------------------------------
Kind                 Count    Total(s)   Avg(s)     Min(s)     Max(s)     P95(s)     P99(s)
----------------------------------------------------------------------------------------------------
API /tail_logs       100      11675.38   116.75     3.31       229.30     218.19     227.03

There is a 7x improvement in average latency. The bottleneck in this PR is that each log task runs in a dedicated thread and there is only one uvicorn worker process, so GIL contention prevents the 100 log threads from running fully concurrently.
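
The thread-per-task pattern can be sketched roughly as follows, assuming the blocking log tail is offloaded with `asyncio.to_thread`; this description does not confirm that is the mechanism used. `request_id` and `blocking_tail` are hypothetical names. Note that `asyncio.to_thread` uses a shared default thread pool rather than a strictly dedicated thread per task.

```python
import asyncio
import contextvars
import time

# Hypothetical per-request variable, e.g. the request a log task belongs to.
request_id = contextvars.ContextVar('request_id', default='none')


def blocking_tail(lines):
    """Stand-in for a blocking log tail; runs in a worker thread."""
    out = []
    for i in range(lines):
        time.sleep(0.01)  # simulates waiting on the remote log stream
        out.append(f'[{request_id.get()}] line {i}')
    return out


async def handle_logs(rid, lines):
    request_id.set(rid)
    # asyncio.to_thread copies the current context into the worker thread,
    # so blocking_tail sees this task's request_id.
    return await asyncio.to_thread(blocking_tail, lines)


async def main():
    return await asyncio.gather(handle_logs('req-a', 2),
                                handle_logs('req-b', 2))
```

Threads that sleep or wait on I/O release the GIL, so they overlap well. CPU work inside the threads does not, which is consistent with the contention described above.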

Command: python tests/load_tests/test_load_on_server.py -n 100 --apis tail_logs -c aws under unlimited concurrency local mode (burstable worker), 4c16g machine:

# This PR
All requests completed in 56.22 seconds

Latency Statistics:
----------------------------------------------------------------------------------------------------
Kind                 Count    Total(s)   Avg(s)     Min(s)     Max(s)     P95(s)     P99(s)
----------------------------------------------------------------------------------------------------
API /tail_logs       100      5367.31    53.67      53.45      56.06      53.76      56.04

# Master
All requests completed in 90.30 seconds

Latency Statistics:
----------------------------------------------------------------------------------------------------
Kind                 Count    Total(s)   Avg(s)     Min(s)     Max(s)     P95(s)     P99(s)
----------------------------------------------------------------------------------------------------
API /tail_logs       100      7838.55    78.39      43.78      90.20      90.00      90.10

Resources:

# This PR
PEAK USAGE:
Peak CPU: 100.0%
Peak Memory: 1.50GB (11.8%)
Memory Delta: 0.6GB
Peak Short Executor Memory: 0.16GB
Peak Short Executor Memory Average: 0.16GB
Peak Long Executor Memory: 0.00GB
Peak Long Executor Memory Average: 0.00GB

# Master
PEAK USAGE:
Peak CPU: 100.0%
Peak Memory: 7.92GB (53.7%)
Memory Delta: 6.7GB
Peak Short Executor Memory: 0.20GB
Peak Short Executor Memory Average: 0.18GB
Peak Long Executor Memory: 0.19GB
Peak Long Executor Memory Average: 0.19GB

About 10x better memory efficiency measured by memory delta (0.6GB vs 6.7GB), and about 5x measured by peak memory (1.50GB vs 7.92GB). However, the test found that logs on an AWS instance are significantly slower than logs on a Kubernetes instance (I switched the benchmark env to AWS EC2 for accurate resource usage accounting). This might be related to more RPCs/CPU cycles in the AWS code path. I leave this as a follow-up since it is not directly relevant to this PR.
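
For reference, the improvement ratios can be recomputed from the figures reported above. This is just arithmetic on the benchmark output; the dictionary layout is my own.

```python
# Figures copied from the benchmark output above.
latency_avg_s = {
    'kubernetes (1c2g)': {'pr': 16.42, 'master': 116.75},
    'aws (4c16g)': {'pr': 53.67, 'master': 78.39},
}
memory_gb = {
    'peak': {'pr': 1.50, 'master': 7.92},
    'delta': {'pr': 0.6, 'master': 6.7},
}


def ratio(entry):
    """How many times larger the master figure is than this PR's figure."""
    return entry['master'] / entry['pr']


latency_speedup = {k: round(ratio(v), 2) for k, v in latency_avg_s.items()}
memory_reduction = {k: round(ratio(v), 2) for k, v in memory_gb.items()}

if __name__ == '__main__':
    print(latency_speedup)
    print(memory_reduction)
```

This gives roughly 7.1x (Kubernetes) and 1.5x (AWS) on average latency, and roughly 5.3x on peak memory and 11x on memory delta.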

Tests

Tested (run the relevant ones):

Code formatting: install pre-commit (auto-check on commit) or bash format.sh
Any manual or new tests for this PR (please specify below)
All smoke tests: /smoke-test (CI) or pytest tests/test_smoke.py (local)
Relevant individual tests: /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
Backward compatibility: /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)

Signed-off-by: Aylei <[email protected]>

aylei · 2025-04-27T13:06:26Z

/smoke-test -k test_minimal

aylei added 2 commits April 25, 2025 22:53

[API server] handle logs request in event loop

ddb8817

Signed-off-by: Aylei <[email protected]>

Correct log streaming redirection

9d8c577

Signed-off-by: Aylei <[email protected]>

aylei force-pushed the async-log branch from bf48b7d to 9d8c577 Compare April 25, 2025 16:19

aylei added 2 commits April 27, 2025 14:33

Refactor: use contextual stdout/stderr instead

7e9f7ba

Signed-off-by: Aylei <[email protected]>

Refinments

8a2841a

Signed-off-by: Aylei <[email protected]>

aylei changed the title ~~[API server] handle logs request in event loop~~ [API server] handle logs request in coroutine Apr 27, 2025

aylei added 4 commits April 27, 2025 17:12

Lint

190cf35

Signed-off-by: Aylei <[email protected]>

Merge branch 'master' into async-log

36c0b4a

Fix RayCodeGen

8abf811

Signed-off-by: Aylei <[email protected]>

Fix unit test

6eae81f

Signed-off-by: Aylei <[email protected]>

aylei marked this pull request as ready for review April 27, 2025 12:50

aylei requested a review from Michaelvll April 27, 2025 13:06
