Commit cc09cec

Fix LinearInt8 recursive quantization (#791)
Test plan:
```
% python3 torchchat.py generate llama2 --dtype float16 --quantize '{"linear:int8": {"groupsize": 0}}' --prompt "Once upon a time," --device mps
Using device=mps
Loading model...
Time to load model: 29.03 seconds
Quantizing the model with: {'linear:int8': {'groupsize': 0}}
Time to quantize model: 14.37 seconds
```

Fixes #788
1 parent 185d697 commit cc09cec

1 file changed: +2 -2 lines changed

quantize.py

Lines changed: 2 additions & 2 deletions
@@ -410,8 +410,8 @@ def quantize(self, module):
 groupsize=self.groupsize,
 ),
 )
-else:
-self.quantize(child)
+else:
+self.quantize(child)

 return module
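The removed and added lines carry identical text, so the change is to the indentation of the `else:` branch; the exact levels are not recoverable from this view. The sketch below shows why that placement matters in a recursive module walk. It assumes the `else` is meant to pair with the check for a linear child, so that every other child is recursed into. `Module`, `Linear`, `QuantizedLinear`, `set_child`, and `count` are hypothetical stand-ins written for this illustration, not torchchat or PyTorch code.

```python
# Hypothetical stand-ins for a module tree; not torchchat or PyTorch classes.
class Module:
    def __init__(self, **children):
        self._children = dict(children)

    def named_children(self):
        # Return a copy so callers may replace children while iterating.
        return list(self._children.items())

    def set_child(self, name, child):
        self._children[name] = child


class Linear(Module):
    pass


class QuantizedLinear(Module):
    pass


def quantize(module):
    """Replace every Linear in the tree with QuantizedLinear, in place."""
    for name, child in module.named_children():
        if isinstance(child, Linear):
            module.set_child(name, QuantizedLinear())
        else:
            # Paired with the isinstance check: recurse into every non-Linear child.
            quantize(child)
    return module


def count(module, cls):
    """Count instances of cls anywhere below module."""
    return sum(isinstance(c, cls) + count(c, cls) for _, c in module.named_children())


model = Module(
    block=Module(attn=Module(wq=Linear(), wk=Linear()), ffn=Linear()),
    output=Linear(),
)
quantize(model)
print(count(model, Linear), count(model, QuantizedLinear))  # prints "0 4"
```

If the `else` were attached to a different statement (for example, to the `for` loop, which Python allows), the recursive call would no longer run once per non-linear child, and nested linear layers could be left unquantized.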