Commit a032d40 (1 parent: 85001ed)

[Docs] Iterate model prebuilts docs (mlc-ai#1043)

* Iterate model prebuilts docs
* small fix

File tree

3 files changed: +59 -90 lines changed

README.md

Lines changed: 26 additions & 41 deletions
@@ -52,8 +52,26 @@ Machine Learning Compilation for Large Language Models (MLC LLM) is a high-perfo
 </tbody>
 </table>
 
-**Prebuilt model support.** MLC LLM supports a wide range of model architectures and variants. We have the following prebuilts which you can
-use off-the-shelf. Visit [Prebuilt Models](https://llm.mlc.ai/docs/prebuilt_models.html) to see the full list.
+## News
+
+* [08/25/2023] CodeLlama support is up.
+* [08/14/2023] [[Post]](https://blog.mlc.ai/2023/08/09/GPU-Accelerated-LLM-on-Orange-Pi) Mali GPU support is up on Orange Pi.
+* [08/09/2023] [[Post]](https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference) ROCm backend is mature to use.
+* [08/02/2023] [Dockerfile](https://github.com/mlc-ai/llm-perf-bench/) is released for CUDA performance benchmarking.
+* [07/19/2023] Support for Llama2-7B/13B/70B is up.
+* [05/22/2023] [[Post]](https://blog.mlc.ai/2023/05/22/bringing-open-large-language-models-to-consumer-devices) RedPajama support is up.
+* [05/08/2023] [[Post]](https://blog.mlc.ai/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices) MLC LLM is now available on Android.
+* [05/01/2023] [[Post]](https://blog.mlc.ai/2023/05/01/bringing-accelerated-llm-to-consumer-hardware) MLC LLM is released with Metal, Vulkan and CUDA backends.
+* [04/14/2023] [WebLLM](https://github.com/mlc-ai/web-llm) is released prior to MLC LLM with WebGPU and WebAssembly backend.
+
+## Getting Started
+
+Please visit our [documentation](https://llm.mlc.ai/docs/index.html#getting-started) for detailed instructions.
+
+## Prebuilt model support
+
+MLC LLM supports a wide range of model architectures and variants. We have the following prebuilts which you can
+use off-the-shelf. Visit [Prebuilt Models](https://llm.mlc.ai/docs/prebuilt_models.html) to see the full list, and [Compile Models via MLC](https://llm.mlc.ai/docs/compilation/compile_models.html) to see how to use models not on this list.
 
 <table style="width:100%">
 <thead>
@@ -64,29 +82,8 @@ use off-the-shelf. Visit [Prebuilt Models](https://llm.mlc.ai/docs/prebuilt_mode
 </thead>
 <tbody>
 <tr>
-<td rowspan=8>Llama</td>
-<td>Llama-2</td>
-</tr>
-<tr>
-<td>Code Llama</td>
-</tr>
-<tr>
-<td>Vicuna</td>
-</tr>
-<tr>
-<td>WizardLM</td>
-</tr>
-<tr>
-<td>WizardMath</td>
-</tr>
-<tr>
-<td>OpenOrca Platypus2</td>
-</tr>
-<tr>
-<td>FlagAlpha Llama-2 Chinese</td>
-</tr>
-<tr>
-<td>georgesung Llama-2 Uncensored</td>
+<td>Llama</td>
+<td>Llama-2, Code Llama, Vicuna, WizardLM, WizardMath, OpenOrca Platypus2, FlagAlpha Llama-2 Chinese, georgesung Llama-2 Uncensored</td>
 </tr>
 <tr>
 <td>GPT-NeoX</td>
@@ -112,25 +109,13 @@ use off-the-shelf. Visit [Prebuilt Models](https://llm.mlc.ai/docs/prebuilt_mode
 <td>ChatGLM</td>
 <td></td>
 </tr>
+<tr>
+<td>StableLM</td>
+<td></td>
+</tr>
 </tbody>
 </table>
 
-## News
-
-* [08/25/2023] CodeLlama support is up.
-* [08/14/2023] [[Post]](https://blog.mlc.ai/2023/08/09/GPU-Accelerated-LLM-on-Orange-Pi) Mali GPU support is up on Orange Pi.
-* [08/09/2023] [[Post]](https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference) ROCm backend is mature to use.
-* [08/02/2023] [Dockerfile](https://github.com/mlc-ai/llm-perf-bench/) is released for CUDA performance benchmarking.
-* [07/19/2023] Support for Llama2-7B/13B/70B is up.
-* [05/22/2023] [[Post]](https://blog.mlc.ai/2023/05/22/bringing-open-large-language-models-to-consumer-devices) RedPajama support is up.
-* [05/08/2023] [[Post]](https://blog.mlc.ai/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices) MLC LLM is now available on Android.
-* [05/01/2023] [[Post]](https://blog.mlc.ai/2023/05/01/bringing-accelerated-llm-to-consumer-hardware) MLC LLM is released with Metal, Vulkan and CUDA backends.
-* [04/14/2023] [WebLLM](https://github.com/mlc-ai/web-llm) is released prior to MLC LLM with WebGPU and WebAssembly backend.
-
-## Getting Started
-
-Please visit our [this page](https://llm.mlc.ai/docs/index.html#getting-started) for detailed instructions.
-
 ## Universal Deployment APIs
 
 MLC LLM provides multiple sets of APIs across platforms and environments. These include

docs/compilation/distribute_compiled_models.rst

Lines changed: 1 addition & 1 deletion
@@ -161,7 +161,7 @@ Download the Distributed Models and Run in iOS App
 
 For iOS app, model libraries are statically packed into the app at the time of app building.
 Therefore, the iOS app supports running any models whose model libraries are integrated into the app.
-You can check the :ref:`list of supported model libraries <prebuilt-models-ios>`.
+You can check the :ref:`list of supported model libraries <using-prebuilt-models-ios>`.
 
 To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_1`` model library.
 Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of `mlc-chat-config.json` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
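The last step above, pointing the ``model_lib`` field of `mlc-chat-config.json` at the library integrated in the app, can be scripted. A minimal Python sketch: the config path and the stand-in file it creates are hypothetical placeholders for illustration; only the ``model_lib`` key and the ``RedPajama-INCITE-Chat-3B-v1-q4f16_1`` library name come from the docs above.

```python
import json
from pathlib import Path

# Hypothetical location: in practice this file ships with your distributed
# weights. We create a stand-in here so the sketch is self-contained.
config_path = Path("mlc-chat-config.json")
config_path.write_text(json.dumps({"model_lib": "placeholder"}))

# Point model_lib at the model library statically packed into the iOS app.
config = json.loads(config_path.read_text())
config["model_lib"] = "RedPajama-INCITE-Chat-3B-v1-q4f16_1"
config_path.write_text(json.dumps(config, indent=2))

print(config["model_lib"])  # prints RedPajama-INCITE-Chat-3B-v1-q4f16_1
```

Editing the JSON in place like this preserves any other fields the config carries, which is why the sketch reads, updates, and rewrites rather than overwriting the whole file.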

docs/prebuilt_models.rst

Lines changed: 32 additions & 48 deletions
@@ -256,16 +256,13 @@ MLC-LLM supports the following model architectures:
    :widths: 10 10 15 15
    :header-rows: 1
 
-   * - Code
-     - Architecture
-     - Variants w/ MLC prebuilts
-     - Variants w/o MLC prebuilts
-   * - ``llama``
-     - LLaMa
-
-       * :ref:`Prebuilt library table <llama_library_table>`
-       * `Official link <https://github.com/facebookresearch/llama>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/llama.py>`__
+   * - Model Architecture
+     - Support
+     - Available MLC Prebuilts
+     - Unavailable in MLC Prebuilts
+   * - `LLaMA <https://github.com/facebookresearch/llama>`__
+     - * :ref:`Prebuilt Model Library <llama_library_table>`
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/llama.py>`__
     - * :ref:`Llama-2 <llama2_variant_table>`
       * :ref:`Code Llama <code_llama_variant_table>`
       * :ref:`Vicuna <vicuna_variant_table>`
@@ -280,64 +277,51 @@ MLC-LLM supports the following model architectures:
       * `Gorilla <https://huggingface.co/gorilla-llm/gorilla-7b-hf-delta-v0>`__
       * `YuLan-Chat <https://github.com/RUC-GSAI/YuLan-Chat>`__
       * `WizardCoder (new) <https://github.com/nlpxucan/WizardLM/tree/main/WizardCoder>`__
-   * - ``gpt-neox``
-     - GPT-NeoX
-
-       * :ref:`Prebuilt library table <gpt_neox_library_table>`
-       * `Official link <https://github.com/EleutherAI/gpt-neox>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_neox.py>`__
+   * - `GPT-NeoX <https://github.com/EleutherAI/gpt-neox>`__
+     - * :ref:`Prebuilt Model Library <gpt_neox_library_table>`
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_neox.py>`__
     - * :ref:`RedPajama <red_pajama_variant_table>`
     - * `Dolly <https://github.com/databrickslabs/dolly>`__
       * `Pythia <https://huggingface.co/EleutherAI/pythia-1.4b>`__
       * `StableCode <https://huggingface.co/stabilityai/stablecode-instruct-alpha-3b>`__
-   * - ``gptj``
-     - GPT-J
-
-       * Prebuilt not compiled yet
-       * `Official link <https://huggingface.co/EleutherAI/gpt-j-6b>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gptj.py>`__
+   * - `GPT-J <https://huggingface.co/EleutherAI/gpt-j-6b>`__
+     - * Prebuilt not compiled yet
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gptj.py>`__
     -
     - * `MOSS <https://github.com/OpenLMLab/MOSS>`__
-   * - ``rwkv``
-     - RWKV
-
-       * :ref:`Prebuilt library table <rwkv_library_table>`
-       * `Official link <https://github.com/BlinkDL/RWKV-LM>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/rwkv.py>`__
+   * - `RWKV <https://github.com/BlinkDL/RWKV-LM>`__
+     - * :ref:`Prebuilt Model Library <rwkv_library_table>`
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/rwkv.py>`__
     - * :ref:`RWKV-raven <rwkv_raven_variant_table>`
     -
-   * - ``minigpt``
-     - MiniGPT
-
-       * Prebuilt not compiled yet
-       * `Official link <https://huggingface.co/Vision-CAIR/MiniGPT-4>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/minigpt.py>`__
+   * - `MiniGPT <https://huggingface.co/Vision-CAIR/MiniGPT-4>`__
+     - * Prebuilt not compiled yet
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/minigpt.py>`__
     -
     - * `MiniGPT-4 <https://huggingface.co/Vision-CAIR/MiniGPT-4>`__
-   * - ``gpt_bigcode``
-     - GPTBigCode
-
-       * :ref:`Prebuilt library table <gpt_big_code_library_table>`
-       * `Official link <https://huggingface.co/docs/transformers/model_doc/gpt_bigcode>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_bigcode.py>`__
+   * - `GPTBigCode <https://huggingface.co/docs/transformers/model_doc/gpt_bigcode>`__
+     - * :ref:`Prebuilt Model Library <gpt_big_code_library_table>`
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/gpt_bigcode.py>`__
     - * :ref:`WizardCoder (old) <wizard_coder_variant_table>`
     - * `StarCoder <https://huggingface.co/bigcode/starcoder>`__
       * `SantaCoder <https://huggingface.co/bigcode/gpt_bigcode-santacoder>`__
-   * - ``chatglm``
-     - ChatGLM
-
-       * Prebuilt not compiled yet
-       * `Official link <https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md>`__
-       * `Relax Code <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/chatglm.py>`__
+   * - `ChatGLM <https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md>`__
+     - * Prebuilt not compiled yet
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/chatglm.py>`__
     -
     - * `ChatGLM2 <https://huggingface.co/THUDM/chatglm2-6b>`__
       * `CodeGeeX2 <https://huggingface.co/THUDM/codegeex2-6b>`__
+   * - `StableLM <https://huggingface.co/stabilityai>`__
+     - * Prebuilt not compiled yet
+       * `MLC Implementation <https://github.com/mlc-ai/mlc-llm/blob/main/mlc_llm/relax_model/stablelm_3b.py>`__
+     -
+     - * `StableLM <https://huggingface.co/collections/stabilityai/stable-lm-650852cfd55dd4e15cdcb30a>`__
 
-If the model variant you are interested in is in one of these model architectures we support (but we have not provided the prebuilt weights yet), you can check the :doc:`model compilation page </compilation/compile_models>` on how to compile your own models. Note that you only need to compile the weights for your model variant and reuse the library file found in Level 2 tables.
+If the model variant you are interested in uses one of these model architectures we support (but we have not provided the prebuilt weights yet), you can check out :doc:`/compilation/compile_models` on how to compile your own models. Afterwards, you may follow :doc:`/compilation/distribute_compiled_models` to upload your prebuilt weights to Hugging Face, and submit a PR that adds an entry to this page, contributing to the community.
 
 For models structured in an architecture we have not supported yet, you could:
 
-- Either `create a new issue <https://github.com/mlc-ai/mlc-llm/issues/new/choose>`_ to request a new model architecture.
+- Either `create a [Model Request] issue <https://github.com/mlc-ai/mlc-llm/issues/new?assignees=&labels=new-models&projects=&template=model-request.md&title=%5BModel+Request%5D+>`__ which automatically shows up on our `Model Request Tracking Board <https://github.com/orgs/mlc-ai/projects/2>`__.
 
 - Or follow our tutorial :doc:`Define New Models </tutorials/customize/define_new_models>`, which introduces how to bring a new model architecture to MLC-LLM.
