Higher level API for post-processing/optimization #2261

Open
@titaiwangms

We have a growing number of graph-level passes that users can combine to suit their own needs. However, this is inconvenient for users who only want a standard combination (clear metadata, lift constants, and fold transpose): https://github.com/kunal-vaishnavi/phi4-mm/pull/1/files#r2069106112.

For reference, this is the optimize_model API currently used in onnxruntime:

https://github.com/microsoft/onnxruntime/blob/9194cadbb02cb89bd0706f654f9abc06b39ce8bf/onnxruntime/python/tools/transformers/optimizer.py#L275

def optimize_model(
    input: str | ModelProto,
    model_type: str = "bert",
    num_heads: int = 0,
    hidden_size: int = 0,
    optimization_options: FusionOptions | None = None,
    opt_level: int | None = None,
    use_gpu: bool = False,
    only_onnxruntime: bool = False,
    verbose: bool = False,
    *,
    provider: str | None = None,
) -> OnnxModel:
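
To make the request concrete, here is a minimal sketch of what a single-call entry point for the standard combination could look like. Everything in it is hypothetical: `ToyModel`, `postprocess`, `STANDARD_PASSES`, and the three placeholder pass functions are illustrative stand-ins, not existing onnxscript or onnxruntime APIs, and the passes only record that they ran instead of rewriting a graph.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class ToyModel:
    """Toy stand-in for a model (hypothetical; not an ONNX type)."""

    metadata: dict[str, str] = field(default_factory=dict)
    applied: list[str] = field(default_factory=list)


# A pass takes a model and returns the (possibly modified) model.
Pass = Callable[[ToyModel], ToyModel]


def clear_metadata(model: ToyModel) -> ToyModel:
    # Placeholder: drop all metadata entries and record the pass name.
    model.metadata.clear()
    model.applied.append("clear_metadata")
    return model


def lift_constants(model: ToyModel) -> ToyModel:
    # Placeholder: a real pass would lift constants; here we only record it.
    model.applied.append("lift_constants")
    return model


def fold_transpose(model: ToyModel) -> ToyModel:
    # Placeholder: a real pass would fold transposes; here we only record it.
    model.applied.append("fold_transpose")
    return model


# The "standard combination" named in the issue, in one place.
STANDARD_PASSES: tuple[Pass, ...] = (clear_metadata, lift_constants, fold_transpose)


def postprocess(model: ToyModel, passes: Sequence[Pass] | None = None) -> ToyModel:
    """Hypothetical one-call API: run the standard passes unless overridden."""
    for run_pass in STANDARD_PASSES if passes is None else passes:
        model = run_pass(model)
    return model
```

With a shape like this, users who want the defaults call `postprocess(model)`, while users with custom needs can still supply their own ordered list through `passes`.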

cc @xadupre @gramalingam @shubhambhokare1 @justinchuby
