[Converter] Add operations to accelerate Transformer encoder #1754

Closed
@narendasan

Description

torch.nn.TransformerEncoder

  • Function Schemas:

    • aten::chunk(Tensor(a -> *) self, int chunks, int dim=0) -> Tensor(a)[]
    • aten::_native_multi_head_attention(Tensor query, Tensor key, Tensor value, int embed_dim, int num_head, Tensor qkv_weight, Tensor qkv_bias, Tensor proj_weight, Tensor proj_bias, Tensor? mask=None, bool need_weights=True, bool average_attn_weights=True, int? mask_type=None) -> (Tensor, Tensor)
    • aten::add.t(t[] a, t[] b) -> t[]
    • aten::all.bool(bool[] self) -> bool
    • prim::requires_grad(Tensor a) -> bool
    • aten::any.bool(bool[] self) -> bool
    • aten::_transformer_encoder_layer_fwd(Tensor src, int embed_dim, int num_heads, Tensor qkv_weight, Tensor qkv_bias, Tensor proj_weight, Tensor proj_bias, bool use_gelu, bool norm_first, float eps, Tensor norm_weight_1, Tensor norm_bias_1, Tensor norm_weight_2, Tensor norm_bias_2, Tensor ffn_weight_1, Tensor ffn_bias_1, Tensor ffn_weight_2, Tensor ffn_bias_2, Tensor? mask=None, int? mask_type=None) -> Tensor
    • aten::len.str(str s) -> int
    • aten::str(t elem) -> str
    • prim::device(Tensor a) -> Device
    • aten::is_grad_enabled() -> bool
    • aten::find(str self, str substr, int start=0, int end=-1) -> int
    • prim::is_cuda(Tensor a) -> bool
  • Original PyTorch API: torch.nn.TransformerEncoder (https://pytorch.org/docs/stable/generated/torch.nn.TransformerEncoder.html)

  • Relevant TensorRT Documentation:
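
A minimal sketch (not part of the original issue) of where these ops come from: scripting and freezing an `nn.TransformerEncoder` surfaces the fast-path ops listed above in the TorchScript graph that a converter library has to lower. The module hyperparameters below are illustrative only.

```python
import torch
import torch.nn as nn

# Repro sketch: script and freeze a TransformerEncoder, then inspect the
# TorchScript graph. Hyperparameters (d_model=512, nhead=8, num_layers=2)
# are illustrative, not taken from the issue.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2).eval()

scripted = torch.jit.script(encoder)
with torch.no_grad():
    frozen = torch.jit.freeze(scripted)

# The frozen graph should contain the ops listed above, e.g.
# aten::_transformer_encoder_layer_fwd / aten::_native_multi_head_attention
# on the fast path, plus helpers such as aten::chunk, prim::device, and
# aten::is_grad_enabled used by the fast-path eligibility checks.
print(frozen.graph)
```

Printing the graph is a quick way to confirm which of the schemas above actually appear for a given PyTorch version; any that do appear and lack converters would otherwise have to fall back to Torch execution under partial compilation.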

Alternatives

Additional context
