
A critical bug in attention kernel after refactoring #66

@WoosukKwon

Description


It seems there's a critical bug introduced by #53: running the single_query_cached_kv_attention kernel with certain configurations leads to CUDA illegal memory access errors. I found the bug through the unit tests.
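Because CUDA illegal memory access errors are reported asynchronously, the stack trace often points at a later, unrelated call rather than the offending launch. Below is a minimal sketch of how such a failure can be pinned to a specific configuration; the invoke_kernel helper and the swept values are placeholders, not the project's actual test code, since the kernel's call signature isn't shown in this issue.

```python
# Sketch only: sweep kernel configurations and synchronize after each launch
# so the asynchronous illegal-memory-access error is attributed to the
# configuration that actually triggered it.
import itertools

import torch


def invoke_kernel(num_heads: int, head_size: int, block_size: int) -> None:
    # Placeholder: replace with the real call to
    # single_query_cached_kv_attention from the unit tests.
    pass


def sweep_configs() -> None:
    for num_heads, head_size, block_size in itertools.product(
        (32, 40, 64),       # example head counts (assumed)
        (64, 80, 96, 128),  # example head sizes (assumed)
        (8, 16, 32),        # example block sizes (assumed)
    ):
        try:
            invoke_kernel(num_heads, head_size, block_size)
            # Force queued kernels to finish so any CUDA error surfaces here.
            torch.cuda.synchronize()
        except RuntimeError:
            print(f"Failing config: num_heads={num_heads}, "
                  f"head_size={head_size}, block_size={block_size}")
            raise


if __name__ == "__main__":
    sweep_configs()
```

Setting CUDA_LAUNCH_BLOCKING=1 achieves a similar effect by making every kernel launch synchronous, at the cost of slower execution.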

Metadata

Labels

bug (Something isn't working)
