Skip to content

Support for Hugginface multimodal models #4330

Open
@vincentclaes

Description

@vincentclaes

Describe the feature you'd like

Being able to deploy huggingface multimodal models to a sagemaker endpoint.
Currently only language models are supported that require a prompt as input.
Multimodal models like Llava / CLIP / ... require a prompt and an image as input and this is currently not supported.

How would this feature be used? Please describe.

This is how the feature will be used by the end user:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

huggingface_model = HuggingFaceModel(
   model_data="some-repo/some-multimodal-model",   # <<----- specify the model
   role=sagemaker.get_execution_role()
   transformers_version="4.28.1", 
   pytorch_version="2.0.0",      
   py_version='py310',        
   model_server_workers=1
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    container_startup_health_check_timeout=900, # increase timeout for large models
    model_data_download_timeout=900, # increase timeout for large models
)
----------------!
### Call Llava
import base64
import requests

# request
data = {
    "image" : "some base64 encoded image", # <---- specify the image
    "question" : "Describe the image and color details.",
}
output = predictor.predict(data)
print(output)

Describe alternatives you've considered

You can package the model yourself and provide an inference.py script, but you have to download the model and tar.gz which takes a lot of time.

Additional context

I came up with this idea when I created a tar.gz for llava with an inference.py and made it available to the world. See my LinkedIn post here: https://www.linkedin.com/posts/vincent-claes-0b346337_aws-sagemaker-huggingface-activity-7141776348963885056-Uv0g?utm_source=share&utm_medium=member_desktop

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions