Any plans to add ModelScope's 1.7B text2video synthesis diffusion model? #2736

@kabachuha

Model/Pipeline/Scheduler description

Hello!

There seems to be a new 1.7B-parameter diffusion-based model by ModelScope for text2video synthesis, as noted by AKHaliq: https://twitter.com/_akhaliq/status/1637321077553606657?s=20. Both the model implementation and the weights (downloaded with their pipeline) are openly accessible, and the model can already be run via HuggingFace Spaces. However, it lacks many possible optimizations, especially a low-VRAM mode, as well as accessibility options, and I believe it would benefit greatly from the help of the Diffusers community.

Example: monkey playing on drums

tmp2tkrr492.mp4

At this time the model should fit in around 16 GB of VRAM, but since it is a combination of 4 GB, 6 GB, and 5 GB models, I believe that with half precision and a sequential pipeline it will eventually be possible to run it on modern consumer hardware.
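
To make the estimate above concrete, here is a rough back-of-the-envelope sketch. It assumes (this is my assumption, not something confirmed by the model authors) that the 4/6/5 GB figures are full-precision weight sizes, that half precision halves them, and that a sequential pipeline keeps only one component on the GPU at a time. Activation memory is ignored, and the function names are purely illustrative.

```python
# Rough VRAM estimate for the three sub-models mentioned above.
# Assumptions (hypothetical, not from the model authors):
#   - the 4/6/5 GB figures are fp32 weight sizes,
#   - half precision halves the weight size,
#   - activations and framework overhead are ignored.
COMPONENT_SIZES_GB = [4.0, 6.0, 5.0]


def weights_gb(sizes, half_precision=False):
    """Per-component weight size in GB, optionally in half precision."""
    scale = 0.5 if half_precision else 1.0
    return [size * scale for size in sizes]


def peak_all_resident(sizes, half_precision=False):
    """Peak weight memory when every component stays on the GPU."""
    return sum(weights_gb(sizes, half_precision))


def peak_sequential(sizes, half_precision=False):
    """Peak weight memory when only one component is on the GPU at a time."""
    return max(weights_gb(sizes, half_precision))


if __name__ == "__main__":
    for half in (False, True):
        label = "fp16" if half else "fp32"
        resident = peak_all_resident(COMPONENT_SIZES_GB, half)
        sequential = peak_sequential(COMPONENT_SIZES_GB, half)
        print(f"{label}: all resident {resident} GB, sequential {sequential} GB")
```

Under these assumptions, the weights alone would need about 15 GB when everything is resident in full precision, and about 3 GB at peak with half precision plus sequential loading. Real usage will be higher because of activations.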

The license is Apache-2.0, so there should be no problem using the code as a reference.

Open source status

  • The model implementation is available
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

HuggingFace space:

https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis

All the parts of the model at HuggingFace:

https://huggingface.co/damo-vilab/modelscope-damo-text-to-video-synthesis/tree/main

The model PyTorch implementation:

https://github.com/modelscope/modelscope/tree/master/modelscope/models/multi_modal/video_synthesis

Google Colab from the devs:

https://colab.research.google.com/drive/1uW1ZqswkQ9Z9bp5Nbo5z59cAn7I0hE6R?usp=sharing

License: Apache-2.0
