
[Tracker] Outstanding Issues and WIP Features for version 0.1 #20


Closed
13 of 26 tasks
HDCharles opened this issue Dec 5, 2023 · 2 comments
Comments

Contributor

HDCharles commented Dec 5, 2023

This issue tracks the outstanding work items for the torchao 0.1 release

  • New Functionality

    • Test compatibility with PyTorch 2.2 and 2.3rc1 (@cpuhrsch)
    • Fix tests marked as flaky (@cpuhrsch)
    • int4, int8 weight only quantization support (only need one of the paths to work)
      • path 1: int4, int8 weight quantization subclass API works with TorchTune (@jerryzh168), blocked by tensor subclass save load
      • path 2: int4, int8 weight quantization module swap API works with TorchTune (@jerryzh168), WIP
    • Add GPTQuantizer workflow for 4-bit weight quantization (W4A16) for GPU that works for gpt-fast (and executorch) (@jerryzh168, @HDCharles)
    • Add workflow for 4-bit weight, 8-bit activation quantization (W4A8) with/without GPTQ for executorch (@jerryzh168)
      • without GPTQ path is working, still verifying the GPTQ path
    • NF4 Dtype that works for QLoRA in TorchTune (@cpuhrsch)
    • Fix API so it works with LoRACompatibleLinear
    • Allow apply_quant_api() to be applied to a module directly
      • it currently only looks at the module's children, and so does nothing when called on a leaf module
  • Tutorials/BE

    • Using/Writing a quantization technique using torchao (@jerryzh168)
    • Using kernels written in torchao with PyTorch
    • Replace Int8WeightOnlyQuantizedLinearWeight and Int8DynamicallyQuantizedLinearWeight with a single class
    • Reconsider using class method for Int8DynamicallyQuantizedLinearWeight.from_float
    • Remove / guard catch all forward args, kwargs for module swap API
    • Land tutorial "Adding tutorial for gpu quantization using torchao" (tutorials#2730)
  • If time permits (or v0.2)

    • Enable test_8da4w_quantize for 2.4 @jerryzh168
    • 4-bit quantization CPU perf numbers
    • Feature parity between module swap api and subclass api
    • Align smoothquant api with others
      • Add high level auto quant API for int8 dynamic and weight-only quantization with benchmarks (@HDCharles)
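For background on the int8 weight-only items in the list above, here is a minimal pure-Python sketch of the underlying arithmetic (symmetric quantization with one scale per weight row). The function names are illustrative only, not the torchao API; the tracked subclass and module-swap APIs apply this kind of transform to nn.Linear weights as torch tensors.

```python
# Illustrative sketch: symmetric per-row int8 weight-only quantization.
# Each row gets a scale so its largest value maps to the int8 extreme 127.

def quantize_row(row):
    """Quantize one row of float weights to int8 values plus a scale."""
    scale = max(abs(v) for v in row) / 127 or 1.0  # avoid div-by-zero for all-zero rows
    qrow = [max(-128, min(127, round(v / scale))) for v in row]
    return qrow, scale

def dequantize_row(qrow, scale):
    """Recover approximate float weights from int8 values and a scale."""
    return [q * scale for q in qrow]

weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -2.0]]
quantized = [quantize_row(row) for row in weights]
recovered = [dequantize_row(qrow, scale) for qrow, scale in quantized]
```

The per-row round-trip error is bounded by half a quantization step (scale / 2), which is why weight-only int8 usually preserves accuracy well; the subclass vs. module-swap paths in the tracker differ mainly in where the dequantize happens, not in this arithmetic.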
@HDCharles HDCharles changed the title [Tracker] [Tracker] Outstanding Issues and WIP Features Dec 5, 2023
@HDCharles HDCharles changed the title [Tracker] Outstanding Issues and WIP Features [Tracker] Outstanding Issues and WIP Features for version 0.1 Dec 5, 2023
@cpuhrsch
Contributor

We still require users to set a private API to get good performance: `torch._inductor.config.force_fuse_int_mm_with_mul = True`. When do we think we can solve that by? cc @eellison
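For reference, the workaround mentioned in this comment looks like the following. This is a private, unstable inductor knob (hence the complaint), and it must be set before torch.compile traces the model:

```python
import torch

# Private inductor flag; needed today for good int8 matmul performance.
# Set it before calling torch.compile so the fusion applies during tracing.
torch._inductor.config.force_fuse_int_mm_with_mul = True
```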

@supriyar
Contributor

tracking spillover and new features for 0.2 in #132

jerryzh168 pushed a commit that referenced this issue Sep 18, 2024
* Bring `torch.compile` to `quant_block_v2_`. (#18)

Signed-off-by: yiliu30 <[email protected]>

* Add `AO_USE_DETERMINISTIC_ALGORITHMS` for reproducing results (#19)

Signed-off-by: yiliu30 <[email protected]>

* Add `gradient_accumulate_steps` and update results (#20)

Signed-off-by: yiliu30 <[email protected]>

* update the readme

Signed-off-by: yiliu30 <[email protected]>

* udpate

Signed-off-by: yiliu30 <[email protected]>

* update the desc

Signed-off-by: yiliu30 <[email protected]>

* rename `train_bs` to `batch_size`

Signed-off-by: yiliu30 <[email protected]>

* update the eval

Signed-off-by: yiliu30 <[email protected]>

* update

Signed-off-by: yiliu30 <[email protected]>

---------

Signed-off-by: yiliu30 <[email protected]>
jainapurva pushed a commit that referenced this issue Sep 22, 2024
jainapurva pushed a commit that referenced this issue Sep 23, 2024
3 participants