
Commit efcb293

Update for API 3.0 online doc (#1940)
Co-authored-by: ZhangJianyu <[email protected]>
Parent: b787940


45 files changed: +219 -160 lines

README.md

Lines changed: 26 additions & 26 deletions
@@ -39,21 +39,21 @@ pip install neural-compressor[pt]
 # Install 2.X API + Framework extension API + TensorFlow dependency
 pip install neural-compressor[tf]
 ```
-> **Note**:
+> **Note**:
 > Further installation methods can be found under [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
 
 ## Getting Started
 
-Setting up the environment:
+Setting up the environment:
 ```bash
 pip install "neural-compressor>=2.3" "transformers>=4.34.0" torch torchvision
 ```
 After successfully installing these packages, try your first quantization program.
 
 ### Weight-Only Quantization (LLMs)
-Following example code demonstrates Weight-Only Quantization on LLMs, it supports Intel CPU, Intel Gaudi2 AI Accelerator, Nvidia GPU, best device will be selected automatically.
+The following example code demonstrates Weight-Only Quantization on LLMs. It supports Intel CPU, Intel Gaudi2 AI Accelerator, and Nvidia GPU; the best device is selected automatically.
 
-To try on Intel Gaudi2, docker image with Gaudi Software Stack is recommended, please refer to following script for environment setup. More details can be found in [Gaudi Guide](https://docs.habana.ai/en/latest/Installation_Guide/Bare_Metal_Fresh_OS.html#launch-docker-image-that-was-built).
+To try it on Intel Gaudi2, a docker image with the Gaudi Software Stack is recommended; please refer to the following script for environment setup. More details can be found in the [Gaudi Guide](https://docs.habana.ai/en/latest/Installation_Guide/Bare_Metal_Fresh_OS.html#launch-docker-image-that-was-built).
 ```bash
 # Run a container with an interactive shell
 docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.14.0/ubuntu22.04/habanalabs/pytorch-installer-2.1.1:latest
@@ -91,9 +91,9 @@ woq_conf = PostTrainingQuantConfig(
 )
 quantized_model = fit(model=float_model, conf=woq_conf, calib_dataloader=dataloader)
 ```
-**Note:**
+**Note:**
 
-To try INT4 model inference, please directly use [Intel Extension for Transformers](https://github.com/intel/intel-extension-for-transformers), which leverages Intel Neural Compressor for model quantization.
+To try INT4 model inference, please directly use [Intel Extension for Transformers](https://github.com/intel/intel-extension-for-transformers), which leverages Intel Neural Compressor for model quantization.
 
 ### Static Quantization (Non-LLMs)
 
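For orientation, the README example this hunk excerpts uses the 2.X `PostTrainingQuantConfig`/`fit` flow. A minimal sketch of that flow follows; the checkpoint name and calibration data are hypothetical placeholders, not part of this commit:

```python
# Minimal sketch (not from this commit) of the weight-only quantization flow
# shown in the README excerpt above. Checkpoint and calibration data are
# hypothetical placeholders.
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

model_name = "facebook/opt-125m"  # placeholder; the README targets larger LLMs
float_model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A toy calibration set: a few tokenized prompts batched by a torch DataLoader.
prompts = ["Intel Neural Compressor supports weight-only quantization."] * 4
samples = [tokenizer(p, return_tensors="pt")["input_ids"].squeeze(0) for p in prompts]
dataloader = DataLoader(samples, batch_size=1)

# approach="weight_only" selects WOQ in the 2.X PostTrainingQuantConfig API.
woq_conf = PostTrainingQuantConfig(approach="weight_only")
quantized_model = fit(model=float_model, conf=woq_conf, calib_dataloader=dataloader)
```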
@@ -121,10 +121,10 @@ quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloade
 </thead>
 <tbody>
 <tr>
-<td colspan="2" align="center"><a href="./docs/3x/design.md#architecture">Architecture</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/design.md#workflow">Workflow</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/design.md#architecture">Architecture</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/design.md#workflow">Workflow</a></td>
 <td colspan="2" align="center"><a href="https://intel.github.io/neural-compressor/latest/docs/source/api-doc/apis.html">APIs</a></td>
-<td colspan="1" align="center"><a href="./docs/3x/llm_recipes.md">LLMs Recipes</a></td>
+<td colspan="1" align="center"><a href="./docs/source/3x/llm_recipes.md">LLMs Recipes</a></td>
 <td colspan="1" align="center">Examples</td>
 </tr>
 </tbody>
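The hunk headers in this file carry the README's static-quantization example as context (`quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloader=...)`). A minimal sketch of that flow, assuming a placeholder torchvision model and dummy calibration tensors (neither comes from this commit):

```python
# Minimal sketch (not from this commit) of the static post-training
# quantization flow referenced by the hunk headers. The model and
# calibration data are placeholders.
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit

float_model = resnet18(weights=None)  # any non-LLM torch model works here

# Dummy ImageNet-shaped calibration batches; real use needs representative data.
calib = TensorDataset(torch.randn(8, 3, 224, 224), torch.zeros(8, dtype=torch.long))
dataloader = DataLoader(calib, batch_size=4)

# PostTrainingQuantConfig defaults to static quantization with calibration.
static_quant_conf = PostTrainingQuantConfig()
quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloader=dataloader)
```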
@@ -135,15 +135,15 @@ quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloade
 </thead>
 <tbody>
 <tr>
-<td colspan="2" align="center"><a href="./docs/3x/PyTorch.md">Overview</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/PT_StaticQuant.md">Static Quantization</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/PT_DynamicQuant.md">Dynamic Quantization</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/PT_SmoothQuant.md">Smooth Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PyTorch.md">Overview</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PT_StaticQuant.md">Static Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PT_DynamicQuant.md">Dynamic Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PT_SmoothQuant.md">Smooth Quantization</a></td>
 </tr>
 <tr>
-<td colspan="4" align="center"><a href="./docs/3x/PT_WeightOnlyQuant.md">Weight-Only Quantization</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/PT_MXQuant.md">MX Quantization</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/PT_MixedPrecision.md">Mixed Precision</a></td>
+<td colspan="4" align="center"><a href="./docs/source/3x/PT_WeightOnlyQuant.md">Weight-Only Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PT_MXQuant.md">MX Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/PT_MixedPrecision.md">Mixed Precision</a></td>
 </tr>
 </tbody>
 <thead>
@@ -153,9 +153,9 @@ quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloade
 </thead>
 <tbody>
 <tr>
-<td colspan="3" align="center"><a href="./docs/3x/TensorFlow.md">Overview</a></td>
-<td colspan="3" align="center"><a href="./docs/3x/TF_Quant.md">Static Quantization</a></td>
-<td colspan="2" align="center"><a href="./docs/3x/TF_SQ.md">Smooth Quantization</a></td>
+<td colspan="3" align="center"><a href="./docs/source/3x/TensorFlow.md">Overview</a></td>
+<td colspan="3" align="center"><a href="./docs/source/3x/TF_Quant.md">Static Quantization</a></td>
+<td colspan="2" align="center"><a href="./docs/source/3x/TF_SQ.md">Smooth Quantization</a></td>
 </tr>
 </tbody>
 <thead>
@@ -165,24 +165,24 @@ quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloade
 </thead>
 <tbody>
 <tr>
-<td colspan="4" align="center"><a href="./docs/3x/autotune.md">Auto Tune</a></td>
-<td colspan="4" align="center"><a href="./docs/3x/benchmark.md">Benchmark</a></td>
+<td colspan="4" align="center"><a href="./docs/source/3x/autotune.md">Auto Tune</a></td>
+<td colspan="4" align="center"><a href="./docs/source/3x/benchmark.md">Benchmark</a></td>
 </tr>
 </tbody>
 </table>
 
-> **Note**:
+> **Note**:
 > From the 3.0 release, we recommend using the 3.X API. Training-time compression techniques such as QAT, Pruning, and Distillation are currently only available in the [2.X API](https://github.com/intel/neural-compressor/blob/master/docs/source/2x_user_guide.md).
 
 ## Selected Publications/Events
-* Blog by Intel: [Neural Compressor: Boosting AI Model Efficiency](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Neural-Compressor-Boosting-AI-Model-Efficiency/post/1604740) (June 2024)
+* Blog by Intel: [Neural Compressor: Boosting AI Model Efficiency](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Neural-Compressor-Boosting-AI-Model-Efficiency/post/1604740) (June 2024)
 * Blog by Intel: [Optimization of Intel AI Solutions for Alibaba Cloud’s Qwen2 Large Language Models](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-ai-solutions-accelerate-alibaba-qwen2-llms.html) (June 2024)
 * Blog by Intel: [Accelerate Meta* Llama 3 with Intel AI Solutions](https://www.intel.com/content/www/us/en/developer/articles/technical/accelerate-meta-llama3-with-intel-ai-solutions.html) (Apr 2024)
 * EMNLP'2023 (Under Review): [TEQ: Trainable Equivalent Transformation for Quantization of LLMs](https://openreview.net/forum?id=iaI8xEINAf&referrer=%5BAuthor%20Console%5D) (Sep 2023)
 * arXiv: [Efficient Post-training Quantization with FP8 Formats](https://arxiv.org/abs/2309.14592) (Sep 2023)
 * arXiv: [Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs](https://arxiv.org/abs/2309.05516) (Sep 2023)
 
-> **Note**:
+> **Note**:
 > View [Full Publication List](https://github.com/intel/neural-compressor/blob/master/docs/source/publication_list.md).
 
 ## Additional Content
@@ -192,8 +192,8 @@ quantized_model = fit(model=float_model, conf=static_quant_conf, calib_dataloade
 * [Legal Information](./docs/source/legal_information.md)
 * [Security Policy](SECURITY.md)
 
-## Communication
+## Communication
 - [GitHub Issues](https://github.com/intel/neural-compressor/issues): mainly for bug reports, new feature requests, and questions.
-- [Email](mailto:[email protected]): welcome to raise any interesting research ideas on model compression techniques by email for collaborations.
+- [Email](mailto:[email protected]): feel free to raise research ideas on model compression techniques by email for collaboration.
 - [Discord Channel](https://discord.com/invite/Wxk3J3ZJkU): join the Discord channel for more flexible technical discussion.
 - [WeChat group](/docs/source/imgs/wechat_group.jpg): scan the QR code to join the technical discussion.

docs/3x/get_started.md

Lines changed: 0 additions & 88 deletions
This file was deleted.

docs/build_docs/build.sh

Lines changed: 9 additions & 5 deletions
@@ -84,17 +84,18 @@ cp -rf ../docs/ ./source
 cp -f "../README.md" "./source/docs/source/Welcome.md"
 cp -f "../SECURITY.md" "./source/docs/source/SECURITY.md"
 
+
 all_md_files=`find ./source/docs -name "*.md"`
 for md_file in ${all_md_files}
 do
 sed -i 's/.md/.html/g' ${md_file}
 done
 
 
-sed -i 's/.\/docs\/source\/_static/./g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
-sed -i 's/.md/.html/g; s/.\/docs\/source\//.\//g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
-sed -i 's/\/examples\/README.html/https:\/\/github.com\/intel\/neural-compressor\/blob\/master\/examples\/README.md/g' ./source/docs/source/user_guide.md
-sed -i 's/https\:\/\/intel.github.io\/neural-compressor\/lates.\/api-doc\/apis.html/https\:\/\/intel.github.io\/neural-compressor\/latest\/docs\/source\/api-doc\/apis.html/g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
+# sed -i 's/.\/docs\/source\/_static/./g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
+# sed -i 's/.md/.html/g; s/.\/docs\/source\//.\//g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
+# sed -i 's/\/examples\/README.html/https:\/\/github.com\/intel\/neural-compressor\/blob\/master\/examples\/README.md/g' ./source/docs/source/user_guide.md
+# sed -i 's/https\:\/\/intel.github.io\/neural-compressor\/lates.\/api-doc\/apis.html/https\:\/\/intel.github.io\/neural-compressor\/latest\/docs\/source\/api-doc\/apis.html/g' ./source/docs/source/Welcome.md ./source/docs/source/user_guide.md
 
 sed -i 's/examples\/README.html/https:\/\/github.com\/intel\/neural-compressor\/blob\/master\/examples\/README.md/g' ./source/docs/source/Welcome.md
 
@@ -130,6 +131,8 @@ if [[ ${UPDATE_VERSION_FOLDER} -eq 1 ]]; then
 cp -r ${SRC_FOLDER}/* ${DST_FOLDER}
 python update_html.py ${DST_FOLDER} ${VERSION}
 cp -r ./source/docs/source/imgs ${DST_FOLDER}/docs/source
+cp -r ./source/docs/source/3x/imgs ${DST_FOLDER}/docs/source/3x
+
 
 cp source/_static/index.html ${DST_FOLDER}
 else
@@ -143,6 +146,7 @@ if [[ ${UPDATE_LATEST_FOLDER} -eq 1 ]]; then
 cp -r ${SRC_FOLDER}/* ${LATEST_FOLDER}
 python update_html.py ${LATEST_FOLDER} ${VERSION}
 cp -r ./source/docs/source/imgs ${LATEST_FOLDER}/docs/source
+cp -r ./source/docs/source/3x/imgs ${LATEST_FOLDER}/docs/source/3x
 cp source/_static/index.html ${LATEST_FOLDER}
 else
 echo "skip to create ${LATEST_FOLDER}"
@@ -152,7 +156,7 @@ echo "Create document is done"
 
 if [[ ${CHECKOUT_GH_PAGES} -eq 1 ]]; then
 git clone -b gh-pages --single-branch https://github.com/intel/neural-compressor.git ${RELEASE_FOLDER}
-
+
 if [[ ${UPDATE_VERSION_FOLDER} -eq 1 ]]; then
 python update_version.py ${ROOT_DST_FOLDER} ${VERSION}
 cp -rf ${DST_FOLDER} ${RELEASE_FOLDER}
7 files renamed without changes.
