[Feature Request] Make cuda-multiarch the default cuda target for mlc-llm.build

cc @junrushao

## 🚀 Feature

MLC-LLM can build two types of CUDA binaries:
 * slim CUDA binaries, which only work for the CUDA GPU you built on
   (currently invoked by running `mlc-llm.build <your_build_options> -target cuda`)
 * full CUDA binaries, which work for most CUDA GPU architectures
   (currently invoked by running `mlc-llm.build <your_build_options> -target cuda-multiarch`)

`cuda-multiarch` is not publicly listed in the MLC-LLM docs, and I only found out about it by talking to Junru directly.

I propose three changes:
 * make `-target cuda` build the full CUDA binaries
 * make `-target auto` build the full CUDA binaries when it detects a CUDA GPU
 * require users to pass something explicit, such as `-target cuda-slim`, if they want a binary for their current CUDA GPU only
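To make the proposed behavior concrete, here is a minimal sketch of the target mapping in Python. It is not MLC-LLM's actual implementation: the function `resolve_cuda_flavor`, the `cuda_gpu_detected` argument, and the returned labels `"full"` and `"slim"` are hypothetical names used only for illustration, and `cuda-slim` is the target name suggested above, not an existing one.

```python
from typing import Optional


def resolve_cuda_flavor(requested: str, cuda_gpu_detected: bool) -> Optional[str]:
    """Return the CUDA binary flavor the proposal would build for a -target value.

    Hypothetical helper: returns "full", "slim", or None when the proposal
    does not change how the target is handled.
    """
    # Proposed: plain `cuda` builds the full (multi-architecture) binaries.
    # `cuda-multiarch` is kept here as an alias for the same behavior.
    if requested in ("cuda", "cuda-multiarch"):
        return "full"
    # Proposed: users opt in explicitly to a binary for the current GPU only.
    if requested == "cuda-slim":
        return "slim"
    # Proposed: `auto` picks the full binaries when a CUDA GPU is detected.
    if requested == "auto" and cuda_gpu_detected:
        return "full"
    # Anything else is outside the scope of this proposal.
    return None
```

Under this sketch, `resolve_cuda_flavor("cuda", False)` returns `"full"`, while today the same target produces a slim binary.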

## Motivation

Many users are interested in building the full CUDA binaries, and might be surprised to learn that a `-target cuda` build only works for the current CUDA device. Building with `-target cuda-multiarch` does not take much more time or disk space than building with `-target cuda`.

## Alternatives

Keep things the way they are today, with the `cuda` target generating a slim CUDA binary and the `cuda-multiarch` target generating a full CUDA binary, but document the difference clearly so that users know the multiarch option exists.

