**README.md** (26 additions, 41 deletions)
</tbody>
</table>
## News

* [08/25/2023] CodeLlama support is up.
* [08/14/2023] [[Post]](https://blog.mlc.ai/2023/08/09/GPU-Accelerated-LLM-on-Orange-Pi) Mali GPU support is up on Orange Pi.
* [08/09/2023] [[Post]](https://blog.mlc.ai/2023/08/09/Making-AMD-GPUs-competitive-for-LLM-inference) The ROCm backend is now mature and ready to use.
* [08/02/2023] A [Dockerfile](https://github.com/mlc-ai/llm-perf-bench/) is released for CUDA performance benchmarking.
* [07/19/2023] Support for Llama2-7B/13B/70B is up.
* [05/22/2023] [[Post]](https://blog.mlc.ai/2023/05/22/bringing-open-large-language-models-to-consumer-devices) RedPajama support is up.
* [05/08/2023] [[Post]](https://blog.mlc.ai/2023/05/08/bringing-hardware-accelerated-language-models-to-android-devices) MLC LLM is now available on Android.
* [05/01/2023] [[Post]](https://blog.mlc.ai/2023/05/01/bringing-accelerated-llm-to-consumer-hardware) MLC LLM is released with Metal, Vulkan, and CUDA backends.
* [04/14/2023] [WebLLM](https://github.com/mlc-ai/web-llm) is released prior to MLC LLM, with WebGPU and WebAssembly backends.

## Getting Started

Please visit our [documentation](https://llm.mlc.ai/docs/index.html#getting-started) for detailed instructions.

## Prebuilt model support

MLC LLM supports a wide range of model architectures and variants. We provide the following prebuilts, which you can use off the shelf. Visit [Prebuilt Models](https://llm.mlc.ai/docs/prebuilt_models.html) to see the full list, and [Compile Models via MLC](https://llm.mlc.ai/docs/compilation/compile_models.html) to see how to use models not on this list.
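As a quick illustration of the naming convention used by the prebuilts, an ID such as `RedPajama-INCITE-Chat-3B-v1-q4f16_1` ends in a quantization code, where `q4f16_1` roughly reads as 4-bit weights, float16 compute, scheme variant 1. Below is a minimal sketch of splitting such an ID under that assumed suffix shape; `parse_model_id` is a hypothetical helper for illustration, not part of MLC LLM:

```python
import re

def parse_model_id(model_id: str):
    """Split a prebuilt model ID into its base name and quantization code.

    Hypothetical helper, for illustration only; assumes suffixes of the
    form qNfM or qNfM_K (e.g. q4f16_1) as seen in the prebuilt names.
    """
    m = re.match(r"^(?P<name>.+)-(?P<quant>q\d+f\d+(?:_\d+)?)$", model_id)
    if m is None:
        raise ValueError(f"no quantization suffix in: {model_id!r}")
    return m.group("name"), m.group("quant")

print(parse_model_id("RedPajama-INCITE-Chat-3B-v1-q4f16_1"))
# → ('RedPajama-INCITE-Chat-3B-v1', 'q4f16_1')
```

The greedy `.+` in the pattern backtracks just far enough to leave the final `qNfM_K` group as the quantization code, so hyphens inside the base name are handled correctly.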
<table style="width:100%">
<thead>
<td>ChatGLM</td>
<td></td>
</tr>
<tr>
<td>StableLM</td>
<td></td>
</tr>
</tbody>
</table>
## Universal Deployment APIs
MLC LLM provides multiple sets of APIs across platforms and environments. These include
**docs/compilation/distribute_compiled_models.rst** (1 addition, 1 deletion), under "Download the Distributed Models and Run in iOS App":
For the iOS app, model libraries are statically packed into the app at build time. Therefore, the iOS app supports running any model whose model library is integrated into the app. You can check the :ref:`list of supported model libraries <using-prebuilt-models-ios>`.

To download and run the compiled RedPajama-3B instruct model on iPhone, we need to reuse the integrated ``RedPajama-INCITE-Chat-3B-v1-q4f16_1`` model library. Please revisit :ref:`distribute-model-step3-specify-model-lib` and make sure the ``model_lib`` field of ``mlc-chat-config.json`` is set to ``RedPajama-INCITE-Chat-3B-v1-q4f16_1``.
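The ``model_lib`` check can also be scripted. Below is a minimal sketch using only the Python standard library; the helper name and the throwaway config file are illustrative, not part of MLC LLM:

```python
import json
import tempfile
from pathlib import Path

def check_model_lib(config_path, expected: str) -> bool:
    """Return True if the model_lib field of an mlc-chat-config.json
    file matches the expected model library name (illustrative helper)."""
    config = json.loads(Path(config_path).read_text())
    return config.get("model_lib") == expected

# Illustrative round trip with a throwaway config file.
with tempfile.TemporaryDirectory() as d:
    cfg = Path(d) / "mlc-chat-config.json"
    cfg.write_text(json.dumps(
        {"model_lib": "RedPajama-INCITE-Chat-3B-v1-q4f16_1"}))
    assert check_model_lib(cfg, "RedPajama-INCITE-Chat-3B-v1-q4f16_1")
    assert not check_model_lib(cfg, "Llama-2-7b-chat-hf-q4f16_1")
```

Running such a check before building the app catches a mismatched ``model_lib`` early, instead of at model-load time on the device.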
If the model variant you are interested in uses one of the model architectures we support (but we have not provided the prebuilt weights yet), you can check out :doc:`/compilation/compile_models` to learn how to compile your own models. Afterwards, you may follow :doc:`/compilation/distribute_compiled_models` to upload your prebuilt weights to Hugging Face, and submit a PR that adds an entry to this page, contributing to the community.
For models built on an architecture we have not yet supported, you could:
- Either `create a [Model Request] issue <https://github.com/mlc-ai/mlc-llm/issues/new?assignees=&labels=new-models&projects=&template=model-request.md&title=%5BModel+Request%5D+>`__, which automatically shows up on our `Model Request Tracking Board <https://github.com/orgs/mlc-ai/projects/2>`__.
- Or follow our tutorial :doc:`Define New Models </tutorials/customize/define_new_models>`, which shows how to bring a new model architecture to MLC LLM.