[BUG] We may need to remove max_memory arg  #115

@Qubitium

Description

For very large models, multiple GPUs may be needed for quantization, but the max_memory arg appears to be broken. Everything should be handled by accelerate, so there should be no need for this arg. Investigate.

After deleting max_memory=max_memory, it can run.

Originally posted by @Xu-Chen in #48 (comment)
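
To make the proposal concrete, here is a minimal sketch of what "let accelerate handle it" could look like. The helper name build_load_kwargs is hypothetical and is not taken from this repository. It assumes the loader ultimately forwards keyword arguments to a transformers-style from_pretrained call, which accepts device_map="auto" and an optional max_memory dict keyed by GPU index or "cpu" with string sizes such as "20GiB". The idea is to always request automatic placement and only forward max_memory when the caller explicitly sets it.

```python
from typing import Any, Dict, Optional, Union

# Per-device memory caps in the format accelerate documents for max_memory:
# integer GPU indices or "cpu" as keys, size strings such as "20GiB" as values.
MaxMemory = Dict[Union[int, str], str]


def build_load_kwargs(max_memory: Optional[MaxMemory] = None) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for a model loader.

    Automatic device placement is always requested. max_memory is only
    forwarded when the caller supplied a non-empty mapping, so the default
    path relies entirely on accelerate's own memory estimates.
    """
    kwargs: Dict[str, Any] = {"device_map": "auto"}
    if max_memory:
        # Copy so later changes to the caller's dict do not leak in.
        kwargs["max_memory"] = dict(max_memory)
    return kwargs


if __name__ == "__main__":
    # Default path: no max_memory key at all.
    print(build_load_kwargs())
    # Explicit caps for two GPUs plus CPU offload (illustrative values).
    print(build_load_kwargs({0: "20GiB", 1: "20GiB", "cpu": "64GiB"}))
```

The resulting dict would be unpacked into the loader call, for example from_pretrained(model_id, **build_load_kwargs()). Whether dropping the argument entirely or keeping it as an opt-in override is the right fix is still open.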
