[BUG] We may need to remove the `max_memory` arg

For very large models, multiple GPUs may be needed for quantization, but the `max_memory` arg appears to be broken. Device placement should be handled entirely by `accelerate`, so there should be no need for this arg. Investigate.
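For reference while investigating, here is a minimal sketch of the shape a `max_memory` mapping conventionally takes when handed to `accelerate`: GPU index (or `"cpu"`) mapped to a size string. `build_max_memory` is a hypothetical helper for illustration only, not part of GPTQModel or `accelerate`, and the sizes are placeholders.

```python
def build_max_memory(num_gpus, per_gpu_gib, cpu_gib=None):
    """Hypothetical helper: build a max_memory-style mapping.

    Keys are integer GPU indices (plus an optional "cpu" key);
    values are size strings such as "20GiB".
    """
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    # One entry per GPU, each capped at the same budget.
    max_memory = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}
    # Optional CPU budget for offloaded weights.
    if cpu_gib is not None:
        max_memory["cpu"] = f"{cpu_gib}GiB"
    return max_memory


if __name__ == "__main__":
    # Placeholder budgets: 2 GPUs at 20 GiB each, 64 GiB of CPU RAM.
    print(build_max_memory(num_gpus=2, per_gpu_gib=20, cpu_gib=64))
```

If the arg is removed, the expectation (to be confirmed as part of this investigation) is that `accelerate` derives these limits from the available devices on its own.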

> delete max_memory=max_memory can run.

_Originally posted by @Xu-Chen in https://github.com/ModelCloud/GPTQModel/issues/48#issuecomment-2197972919_