Commit e16898d

Adding installation and usage to readme
ghstack-source-id: f63aa75
Pull Request resolved: #7
1 parent f2e2f7e commit e16898d

File tree

1 file changed: +43, -2 lines changed
README.md

Lines changed: 43 additions & 2 deletions
@@ -4,11 +4,52 @@ The torchao package contains apis and workflows used to apply AO techniques like
 
 ## Installation
 
-tbd
+Clone the repository and install the package:
+
+```
+git clone https://github.com/pytorch-labs/ao
+cd ao
+python setup.py install
+```
+
+Verify the installation:
+
+```
+pip list | grep torchao
+```
+
+
+which should show
+```
+torchao 0.0.1 <install dir>
+```
 
 ## Usage
 
-tbd
+The relevant APIs can be found in torchao.quantization.quant_api.
+
+Example:
+
+```
+import torch
+from torchao.quantization import quant_api
+
+# some user model
+model = torch.nn.Sequential(torch.nn.Linear(32, 64)).cuda().to(torch.bfloat16)
+# some example input
+input = torch.randn(32, 32, dtype=torch.bfloat16, device='cuda')
+
+# convert linear modules to quantized linear modules,
+# inserting the quantization method/api of choice
+quant_api.apply_weight_only_int8_quant(model)
+# quant_api.apply_dynamic_quant(model)
+# quant_api.change_linear_weights_to_dqtensors(model)
+
+# compile the model to improve performance
+# (torch.compile returns the compiled model; it does not modify in place)
+model = torch.compile(model, mode='max-autotune')
+model(input)
+```
+
+### A16W8 WeightOnly Quantization
 
 ## License
 
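The A16W8 weight-only scheme named in the diff above keeps activations in 16-bit while storing weights in int8. As a rough sketch of the underlying idea only, not the torchao implementation: `quantize_weight_per_channel` and `dequantize` below are hypothetical helpers, and symmetric per-output-channel scaling is an assumed design choice.

```python
import torch

def quantize_weight_per_channel(w: torch.Tensor):
    """Symmetric per-output-channel int8 quantization of a Linear weight.

    Hypothetical helper for illustration; not part of the torchao API.
    """
    # one scale per output row, mapping the row's max |w| to 127
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    w_int8 = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return w_int8, scale

def dequantize(w_int8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # recover an approximation of the original weight
    return w_int8.to(torch.float32) * scale

# weight shaped like the Linear(32, 64) layer in the README example
w = torch.randn(64, 32)
w_int8, scale = quantize_weight_per_channel(w)
w_hat = dequantize(w_int8, scale)
# round-to-nearest bounds the per-element error by half a quantization step
print((w - w_hat).abs().max().item())
```

At inference time a weight-only kernel would dequantize (or fuse the scale into) `w_int8` while the activations stay in bfloat16, which is where the memory savings come from.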
