
Commit f50aeba

thew3i authored and princepride committed
[TPU] Fix the test_sampler (vllm-project#17820)
Signed-off-by: 汪志鹏 <[email protected]>
1 parent 30eef1d commit f50aeba

File tree

2 files changed: +2 −2 lines


tests/v1/tpu/test_sampler.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ def test_sampler_different(model_name: str):
             enforce_eager=False,
             max_num_seqs=1,
             max_model_len=512,
-            max_num_batched_tokens=512)
+            max_num_batched_tokens=256)
     prompts = [
         "Write a short story about a robot that dreams for the first time."
     ]
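The test now caps max_num_batched_tokens at 256 instead of 512, i.e. fewer prompt tokens may be scheduled per engine step. As a loose illustration only (this is not vLLM's actual scheduler), a greedy token-budget packer shows how a smaller budget splits the same requests across more steps; `pack_batches` is a hypothetical helper, not part of vLLM:

```python
def pack_batches(request_lens, max_num_batched_tokens):
    """Greedily pack per-request token counts into per-step batches,
    never exceeding the token budget within one batch (a request larger
    than the budget still gets a batch of its own)."""
    batches, current, used = [], [], 0
    for n in request_lens:
        if used + n > max_num_batched_tokens and current:
            batches.append(current)  # budget exhausted: start a new step
            current, used = [], 0
        current.append(n)
        used += n
    if current:
        batches.append(current)
    return batches


# Three requests of 200, 200, and 100 tokens fit in one step under a
# 512-token budget, but need three steps under a 256-token budget.
print(pack_batches([200, 200, 100], 512))  # → [[200, 200, 100]]
print(pack_batches([200, 200, 100], 256))  # → [[200], [200], [100]]
```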

vllm/v1/attention/backends/pallas.py

Lines changed: 1 addition & 1 deletion
@@ -95,7 +95,7 @@ class PallasMetadata:
     block_tables: torch.Tensor
     context_lens: torch.Tensor
     query_start_loc: torch.Tensor
-    num_seqs: int
+    num_seqs: torch.Tensor


 class PallasAttentionBackendImpl(AttentionImpl):
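The pallas.py hunk changes num_seqs from a Python int to a torch.Tensor. A minimal sketch of the metadata container, reduced to only the fields visible in the diff (the real PallasMetadata in vLLM carries more state); the rationale comment is our reading of why a tensor is preferable on TPU, not something stated in the commit:

```python
from dataclasses import dataclass

import torch


@dataclass
class PallasMetadata:
    """Sketch of the attention metadata fields shown in the diff."""
    block_tables: torch.Tensor
    context_lens: torch.Tensor
    query_start_loc: torch.Tensor
    # Was `int`. Keeping the sequence count as a tensor lets it live
    # on-device, so a changing value between steps need not trigger
    # host round-trips (assumption; the commit gives no rationale).
    num_seqs: torch.Tensor
```

With the tensor type, call sites that previously used the int directly would read the value via `.item()` (or keep it on-device entirely).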

0 commit comments
