Merged
123 commits
6a12481
Fixing DCO issue and format checker issue
ApostaC Apr 2, 2025
34bea75
fixing pre-commit conflicts
ApostaC Apr 2, 2025
20ef2ac
[fix] fix the runtime error when no kv cache config is provided
ApostaC Apr 2, 2025
430e402
[fix] compatibility with v0 and address review comments
ApostaC Apr 6, 2025
300ddac
[fix] format checker issue and [disable] connector during profile run
ApostaC Apr 6, 2025
519dd3e
Merge remote-tracking branch 'upstream/main' into v1-disagg
robertgshaw2-redhat Apr 7, 2025
7e0695b
updated to remove torch.load
robertgshaw2-redhat Apr 7, 2025
553f416
updated
robertgshaw2-redhat Apr 7, 2025
b18bd8f
Merge pull request #3 from ApostaC/v1-disagg
robertgshaw2-redhat Apr 7, 2025
55d1b5b
updated
robertgshaw2-redhat Apr 7, 2025
c50e620
fixed typo
robertgshaw2-redhat Apr 7, 2025
b22fe38
updated
robertgshaw2-redhat Apr 7, 2025
da257aa
updated
robertgshaw2-redhat Apr 7, 2025
9751e0b
update comments
robertgshaw2-redhat Apr 8, 2025
2b77bcd
updated
robertgshaw2-redhat Apr 8, 2025
5cbd434
updated
robertgshaw2-redhat Apr 8, 2025
4c6a93e
comment
robertgshaw2-redhat Apr 8, 2025
d8ec5a6
updared
robertgshaw2-redhat Apr 8, 2025
fcd2dc9
updated
robertgshaw2-redhat Apr 8, 2025
1f9c252
format
robertgshaw2-redhat Apr 8, 2025
7350244
[fix] typo to pass format checker
ApostaC Apr 8, 2025
1586d58
updated
robertgshaw2-redhat Apr 8, 2025
e2ecc14
Merge branch 'local-dev/v1-disagg' of https://github.com/ApostaC/vllm…
robertgshaw2-redhat Apr 8, 2025
5accb53
stash
robertgshaw2-redhat Apr 8, 2025
31d807e
stash
robertgshaw2-redhat Apr 8, 2025
a73721a
updated
robertgshaw2-redhat Apr 9, 2025
00df670
updated
robertgshaw2-redhat Apr 9, 2025
4ebcc3e
updated
robertgshaw2-redhat Apr 9, 2025
da019df
updated
robertgshaw2-redhat Apr 9, 2025
90e8c53
updated
robertgshaw2-redhat Apr 9, 2025
8b3f606
updated
robertgshaw2-redhat Apr 9, 2025
0163070
Merge pull request #4 from robertgshaw2-redhat/rob-changes
robertgshaw2-redhat Apr 9, 2025
de1e487
fix nit
robertgshaw2-redhat Apr 9, 2025
48c2eb2
updated
robertgshaw2-redhat Apr 9, 2025
e72e5e4
updated
robertgshaw2-redhat Apr 9, 2025
7833645
updared
robertgshaw2-redhat Apr 9, 2025
1881aa5
updated
robertgshaw2-redhat Apr 9, 2025
eca7a49
cleaning
robertgshaw2-redhat Apr 9, 2025
b0629bd
updated
robertgshaw2-redhat Apr 9, 2025
7766ca5
updated
robertgshaw2-redhat Apr 9, 2025
7b64acb
clean up code
robertgshaw2-redhat Apr 9, 2025
b1310fd
updated
robertgshaw2-redhat Apr 9, 2025
689379e
updaed
robertgshaw2-redhat Apr 9, 2025
62e1421
updated
robertgshaw2-redhat Apr 9, 2025
5145566
updated
robertgshaw2-redhat Apr 9, 2025
20decdf
updated
robertgshaw2-redhat Apr 9, 2025
fc58dd5
updated
robertgshaw2-redhat Apr 9, 2025
25c9592
updated
robertgshaw2-redhat Apr 9, 2025
40e5d81
refactor
robertgshaw2-redhat Apr 9, 2025
e64f745
updated
robertgshaw2-redhat Apr 9, 2025
74af233
done with nits
robertgshaw2-redhat Apr 9, 2025
7c31e29
nits
robertgshaw2-redhat Apr 9, 2025
7f57f3c
update lifecycle
robertgshaw2-redhat Apr 9, 2025
05349a5
updated
robertgshaw2-redhat Apr 9, 2025
8e1eadc
updated
robertgshaw2-redhat Apr 10, 2025
54e1491
updated
robertgshaw2-redhat Apr 10, 2025
9c4159c
updated
robertgshaw2-redhat Apr 10, 2025
1d8415d
rename
robertgshaw2-redhat Apr 10, 2025
406d6bf
Add MLA support for v1 disagg connector (#6)
Flechman Apr 10, 2025
3a24897
[Fix] memory leak problem by proper clean up
ApostaC Apr 11, 2025
c6c4368
fixed test failures
robertgshaw2-redhat Apr 11, 2025
5dff6e9
merge
robertgshaw2-redhat Apr 13, 2025
4afa50e
stash
robertgshaw2-redhat Apr 13, 2025
09be260
clean up typing
robertgshaw2-redhat Apr 14, 2025
3f7844d
cleanup nits
robertgshaw2-redhat Apr 14, 2025
d44f699
updated
robertgshaw2-redhat Apr 14, 2025
329f2e7
updated
robertgshaw2-redhat Apr 14, 2025
72041ca
finish docstring
robertgshaw2-redhat Apr 14, 2025
f9f87f2
updated
robertgshaw2-redhat Apr 14, 2025
33f6e60
make pr easier to read
robertgshaw2-redhat Apr 14, 2025
db28310
updated
robertgshaw2-redhat Apr 14, 2025
3701b5d
stash
robertgshaw2-redhat Apr 14, 2025
deb1323
type checking is wrong for ReqMeta
robertgshaw2-redhat Apr 14, 2025
be789bf
add todo for the morning
robertgshaw2-redhat Apr 14, 2025
a3e5762
revery by id
robertgshaw2-redhat Apr 14, 2025
a03d707
revery by id
robertgshaw2-redhat Apr 14, 2025
f696000
revery by id
robertgshaw2-redhat Apr 14, 2025
1d85e63
readabilty
robertgshaw2-redhat Apr 14, 2025
521ed14
updared
robertgshaw2-redhat Apr 14, 2025
6709943
nits
robertgshaw2-redhat Apr 14, 2025
44ea156
cleanup
robertgshaw2-redhat Apr 14, 2025
8180101
cleaning
robertgshaw2-redhat Apr 14, 2025
c3a2cc6
fix bug
robertgshaw2-redhat Apr 14, 2025
5273e24
updated
robertgshaw2-redhat Apr 14, 2025
913325f
update name
robertgshaw2-redhat Apr 14, 2025
75c24d3
cleanup
robertgshaw2-redhat Apr 14, 2025
17a3618
updated
robertgshaw2-redhat Apr 14, 2025
b4bd117
updated
robertgshaw2-redhat Apr 14, 2025
01caf61
updated
robertgshaw2-redhat Apr 14, 2025
d8549cb
updated
robertgshaw2-redhat Apr 14, 2025
b362ef1
trying to fix mm, added tests
robertgshaw2-redhat Apr 15, 2025
485b22e
Merge remote-tracking branch 'upstream/main' into local-dev/v1-disagg
robertgshaw2-redhat Apr 15, 2025
78d523e
update comment
robertgshaw2-redhat Apr 15, 2025
4c38138
updated
robertgshaw2-redhat Apr 15, 2025
7af6ce2
commit test improvements
robertgshaw2-redhat Apr 15, 2025
1ad993b
remove disaggregated tests
robertgshaw2-redhat Apr 15, 2025
3a08dda
updated
robertgshaw2-redhat Apr 15, 2025
e49874d
update comment
robertgshaw2-redhat Apr 15, 2025
dd7969a
fix test case
robertgshaw2-redhat Apr 15, 2025
e1f130e
improve test code quality
robertgshaw2-redhat Apr 15, 2025
611b782
added better testing
robertgshaw2-redhat Apr 15, 2025
f6b8bff
update comments
robertgshaw2-redhat Apr 15, 2025
9609115
updated
robertgshaw2-redhat Apr 15, 2025
7ce3bd6
updated
robertgshaw2-redhat Apr 15, 2025
c3f38d7
cleanup
robertgshaw2-redhat Apr 15, 2025
6dfda44
updated
robertgshaw2-redhat Apr 15, 2025
81d008a
cosmetic
robertgshaw2-redhat Apr 15, 2025
6d35884
clean up
robertgshaw2-redhat Apr 15, 2025
79fe730
updated
robertgshaw2-redhat Apr 15, 2025
ad18a3b
update nits
robertgshaw2-redhat Apr 15, 2025
edefdff
Merge remote-tracking branch 'upstream/main' into local-dev/v1-disagg
robertgshaw2-redhat Apr 15, 2025
c1a1169
updated
robertgshaw2-redhat Apr 15, 2025
1b8ec0b
updated
robertgshaw2-redhat Apr 16, 2025
ff4b98f
updated
robertgshaw2-redhat Apr 16, 2025
17b61fb
updated
robertgshaw2-redhat Apr 16, 2025
ac0660d
updated
robertgshaw2-redhat Apr 16, 2025
ecfb4ea
updated
robertgshaw2-redhat Apr 16, 2025
abdddf0
cleanup
robertgshaw2-redhat Apr 16, 2025
8695d96
cleanup
robertgshaw2-redhat Apr 16, 2025
7b5ba2c
updated
robertgshaw2-redhat Apr 16, 2025
6be9cf9
fixed preemption
robertgshaw2-redhat Apr 17, 2025
5363ed0
Update vllm/distributed/kv_transfer/kv_connector/factory.py
robertgshaw2-redhat Apr 17, 2025
247195d
fix pre-commit
robertgshaw2-redhat Apr 17, 2025
@@ -0,0 +1,36 @@
# SPDX-License-Identifier: Apache-2.0

from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

# Read prompts from output.txt
prompts = []
try:
    with open("output.txt") as f:
        for line in f:
            prompts.append(line.strip())
    print(f"Loaded {len(prompts)} prompts from output.txt")
except FileNotFoundError:
    print("Error: output.txt file not found")
    exit(-1)

sampling_params = SamplingParams(temperature=0, top_p=0.95, max_tokens=10)

llm = LLM(
    model="meta-llama/Llama-3.2-1B-Instruct",
    enforce_eager=True,
    gpu_memory_utilization=0.8,
    max_num_batched_tokens=64,
    max_num_seqs=16,
    kv_transfer_config=KVTransferConfig.from_cli(
        '{"kv_connector":"SharedStorageConnector","kv_role":"kv_both",'
        '"kv_connector_extra_config": {"shared_storage_path": "local_storage"}}'
    ))  #, max_model_len=2048, max_num_batched_tokens=2048)

# Decode-side generation (prompts loaded from output.txt)
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
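
The script above reads output.txt line by line and calls line.strip() on each entry, while the companion script that writes the file appends "\n" to every prompt. The following is a minimal, self-contained sketch of that file round-trip using only the standard library. The helper names save_prompts and load_prompts, the strip_all flag, and the sample prompts are hypothetical and appear nowhere in the diff. The sketch only shows that strip() also drops a trailing space, so a reloaded prompt may differ from the saved one. Whether that difference matters for KV cache reuse is not stated in the diff and is not tested here.

```python
import os
import tempfile


def save_prompts(path, prompts):
    """Write one prompt per line, terminated by a newline."""
    with open(path, "w") as f:
        for prompt in prompts:
            f.write(prompt + "\n")


def load_prompts(path, strip_all=True):
    """Read prompts back; strip_all=True mirrors the line.strip() call."""
    with open(path) as f:
        if strip_all:
            return [line.strip() for line in f]
        # Hypothetical alternative: remove only the newline terminator.
        return [line.rstrip("\n") for line in f]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.txt")
    # Made-up sample data; the first prompt ends with a space.
    original = ["Hi Hi Hello, my name is ", "The capital of France is Paris"]
    save_prompts(path, original)
    stripped = load_prompts(path)
    preserved = load_prompts(path, strip_all=False)

print("strip() round-trip identical:", stripped == original)
print("rstrip('\\n') round-trip identical:", preserved == original)
```

Neither variant handles a prompt that itself contains a newline; a one-prompt-per-line file cannot represent that case.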
@@ -0,0 +1,43 @@
# SPDX-License-Identifier: Apache-2.0

from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

context = "Hi " * 1000
context2 = "Hey " * 500
prompts = [
    context + "Hello, my name is",
    context + "The capital of France is",
    context2 + "Your name is",
    context2 + "The capital of China is",
]

sampling_params = SamplingParams(temperature=0, top_p=0.95, max_tokens=1)

llm = LLM(model="meta-llama/Llama-3.2-1B-Instruct",
          enforce_eager=True,
          gpu_memory_utilization=0.8,
          kv_transfer_config=KVTransferConfig.from_cli(
              '{"kv_connector":"SharedStorageConnector","kv_role":"kv_both", '
              '"kv_connector_extra_config": '
              '{"shared_storage_path": "local_storage"}}')
          )  #, max_model_len=2048, max_num_batched_tokens=2048)

# 1ST generation (prefill instance)
outputs = llm.generate(
    prompts,
    sampling_params,
)

new_prompts = []
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    new_prompts.append(prompt + generated_text)
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

# Write new_prompts to output.txt
with open("output.txt", "w") as f:
    for prompt in new_prompts:
        f.write(prompt + "\n")
print(f"Saved {len(new_prompts)} prompts to output.txt")
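
The argument passed to KVTransferConfig.from_cli in the script above is written as several adjacent string literals, which Python joins into one string before the call. The sketch below parses that same string with the standard json module purely to make its structure visible. It does not call vLLM, and it makes no claim about how from_cli processes the string internally.

```python
import json

# Adjacent string literals are concatenated at compile time, so this is
# one JSON document, identical to the text given to from_cli() above.
config_json = (
    '{"kv_connector":"SharedStorageConnector","kv_role":"kv_both", '
    '"kv_connector_extra_config": '
    '{"shared_storage_path": "local_storage"}}'
)

# Parse with the stdlib only, for inspection.
config = json.loads(config_json)

for key, value in config.items():
    print(f"{key}: {value!r}")
```

The parsed document has three top-level keys: kv_connector, kv_role, and kv_connector_extra_config, the last of which is a nested object holding shared_storage_path.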
5 changes: 5 additions & 0 deletions examples/offline_inference/disaggregated-prefill-v1/run.sh
@@ -0,0 +1,5 @@
rm -rf local_storage/
rm -f output.txt

VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=0 python3 prefill_example.py
VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=0 python3 decode_example.py