
Create Awaitable predict capability #3973

Open
@hooman-bayer

Description


Describe the feature you'd like
Like many other inference libraries in python (e.g. OpenAI), create a real awaitable version of Predict for realtime sagemaker inference endpoints. This will help python applications that use FastAPI and asyncio to deliver realtime responses while not blocking the main event loop. Please note that this feature is different that the one currently available here where the predictions are written to a S3 bucket. This feature would work exactly like https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.Predictor.predict but with an await in real asyncio style.

SageMaker is an amazing library, and having this feature would make it much better suited to production environments built on FastAPI.

How would this feature be used? Please describe.
Currently, the sync version looks like this:

response = predictor.predict(input_data)

The async version might look like:

response = await predictor.apredict(input_data)

Describe alternatives you've considered
I considered subclassing the predictor and adding the async version myself, as sketched below.
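
As a rough illustration only (not an existing SageMaker API), such a subclass could offload the blocking predict call to a worker thread so it becomes awaitable; the name apredict below is just the hypothetical name used in this request:

import asyncio

from sagemaker.predictor import Predictor


class AwaitablePredictor(Predictor):
    # Hypothetical subclass: adds an awaitable predict by running the
    # blocking call in a worker thread so the event loop stays free.
    async def apredict(self, data, *args, **kwargs):
        # asyncio.to_thread (Python 3.9+) wraps the synchronous predict()
        # in an awaitable without changing its behavior.
        return await asyncio.to_thread(self.predict, data, *args, **kwargs)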

Additional context
For modern Python applications built on top of FastAPI and asyncio, it is crucial to use async calls to avoid blocking the server's main event loop (especially in scalable applications). A real awaitable predict would therefore keep applications that leverage SageMaker from blocking their event loop, as illustrated below.
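
A minimal sketch of how this looks today inside a FastAPI route, with asyncio.to_thread standing in for the requested awaitable predict (the endpoint name "my-endpoint" is just a placeholder for illustration):

import asyncio

from fastapi import FastAPI
from sagemaker.predictor import Predictor

app = FastAPI()
# "my-endpoint" is a placeholder name for an existing real-time endpoint.
predictor = Predictor(endpoint_name="my-endpoint")


@app.post("/invoke")
async def invoke(payload: dict):
    # Today the blocking predict() has to be pushed to a worker thread by
    # hand; a native awaitable predict would replace this line with
    # something like "await predictor.apredict(payload)".
    result = await asyncio.to_thread(predictor.predict, payload)
    return {"result": result}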

Thanks a lot
