1 file changed: +43 -2 lines changed

@@ -4,11 +4,52 @@ The torchao package contains apis and workflows used to apply AO techniques like

## Installation
- tbd
+ clone repository and install package:
+
+ ```
+ git clone https://github.com/pytorch-labs/ao
+ cd ao
+ python setup.py install
+ ```
+
+ verify installation:
+
+ ```
+ pip list | grep torchao
+ ```
+
+ should show
+ ```
+ torchao 0.0.1 <install dir>
+ ```

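The `pip list | grep torchao` check can also be done from Python, which is handy in a setup script or CI step. The sketch below is not part of this change; it uses only the standard library's `importlib.metadata`, and the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "torchao" is the distribution name shown by `pip list` in the README text.
    version = installed_version("torchao")
    if version is None:
        print("torchao is not installed in this environment")
    else:
        print(f"torchao {version}")
```

Returning `None` instead of raising keeps the check usable as a simple guard before importing `torchao.quantization`.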
## Usage
- tbd
+ Relevant APIs can be found in torchao.quantization.quant_api
+
+ Example
+
+ ```
+ import torch
+ from torchao.quantization import quant_api
+
+ # some user model
+ model = torch.nn.Sequential(torch.nn.Linear(32, 64)).cuda().to(torch.bfloat16)
+ # some example input
+ input = torch.randn(32, 32, dtype=torch.bfloat16, device='cuda')
+
+ # convert linear modules to quantized linear modules
+ # insert quantization method/api of choice
+ quant_api.apply_weight_only_int8_quant(model)
+ # quant_api.apply_dynamic_quant(model)
+ # quant_api.change_linear_weights_to_dqtensors(model)
+
+ # compile the model to improve performance
+ # (torch.compile returns the compiled module, so keep the result)
+ model = torch.compile(model, mode='max-autotune')
+ model(input)
+ ```
+
+ ### A16W8 WeightOnly Quantization
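The new heading has no body yet. Reading "A16W8" as 16-bit activations with 8-bit weights, the general idea of weight-only int8 quantization can be sketched in plain Python. This is an illustration under that assumption, not torchao's implementation: the function names are hypothetical, and the choice of symmetric quantization with one scale per output row is assumed rather than taken from the library.

```python
def quantize_per_channel(weight):
    """Symmetric int8 quantization with one scale per output row (assumed scheme)."""
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(w) for w in row)
        # Map the largest magnitude in the row to 127; guard against all-zero rows.
        scale = max_abs / 127 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(w / scale))) for w in row])
        scales.append(scale)
    return q_rows, scales


def weight_only_linear(x, q_rows, scales):
    """Compute y = x @ W^T, rescaling the int8 weights per output row on the fly."""
    return [
        [scale * sum(xi * qi for xi, qi in zip(x_row, q_row))
         for q_row, scale in zip(q_rows, scales)]
        for x_row in x
    ]


# Toy weight of shape (out_features=2, in_features=3) and one input row.
weight = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
x = [[1.0, 2.0, 3.0]]

q_rows, scales = quantize_per_channel(weight)
y_quant = weight_only_linear(x, q_rows, scales)

# Full-precision reference for comparison.
y_ref = [[sum(xi * wi for xi, wi in zip(x_row, w_row)) for w_row in weight] for x_row in x]
print("quantized:", y_quant)
print("reference:", y_ref)
```

Only the weights are stored as integers here; the input stays in floating point, which is what distinguishes weight-only quantization from the dynamic variant that also quantizes activations at run time.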
## License