
[inference provider] Add wavespeed.ai as an inference provider #1424


Open
wants to merge 8 commits into main

Conversation


@arabot777 arabot777 commented May 5, 2025

What’s in this PR
WaveSpeedAI is a high-performance AI image and video generation platform that offers industry-leading generation speeds. We would now like to be listed as an Inference Provider on the Hugging Face Hub.

The JS client integration was implemented following the inference-providers documentation and passes the tests. I am submitting the PR now and look forward to further discussion with you.
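For reference, a minimal usage sketch (assuming the provider slug "wavespeed-ai" and one of the models exercised in the tests below; this snippet is not part of the PR's diff):

import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

// Text-to-image routed through the WaveSpeed AI provider (provider slug assumed).
const image = await client.textToImage({
	provider: "wavespeed-ai",
	model: "wavespeed-ai/flux-schnell",
	inputs: "An astronaut riding a horse",
});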

Test

pnpm --filter @huggingface/inference test "test/InferenceClient.spec.ts" -t "^Wavespeed AI"

> @huggingface/[email protected] test /Users/shanliu/work/huggingface.js/packages/inference
> vitest run --config vitest.config.mts "test/InferenceClient.spec.ts"


 RUN  v0.34.6 /Users/shanliu/work/huggingface.js/packages/inference

 ✓ test/InferenceClient.spec.ts (104) 198160ms
   ✓ InferenceClient (104) 198160ms
     ✓ backward compatibility (1)
       ✓ works with old HfInference name
     ↓ HF Inference (49) [skipped]
       ↓ throws error if model does not exist [skipped]
       ↓ fillMask [skipped]
       ↓ works without model [skipped]
       ↓ summarization [skipped]
       ↓ questionAnswering [skipped]
       ↓ tableQuestionAnswering [skipped]
       ↓ documentQuestionAnswering [skipped]
       ↓ documentQuestionAnswering with non-array output [skipped]
       ↓ visualQuestionAnswering [skipped]
       ↓ textClassification [skipped]
       ↓ textGeneration - gpt2 [skipped]
       ↓ textGeneration - openai-community/gpt2 [skipped]
       ↓ textGenerationStream - meta-llama/Llama-3.2-3B [skipped]
       ↓ textGenerationStream - catch error [skipped]
       ↓ textGenerationStream - Abort [skipped]
       ↓ tokenClassification [skipped]
       ↓ translation [skipped]
       ↓ zeroShotClassification [skipped]
       ↓ sentenceSimilarity [skipped]
       ↓ FeatureExtraction [skipped]
       ↓ FeatureExtraction - auto-compatibility sentence similarity [skipped]
       ↓ FeatureExtraction - facebook/bart-base [skipped]
       ↓ FeatureExtraction - facebook/bart-base, list input [skipped]
       ↓ automaticSpeechRecognition [skipped]
       ↓ audioClassification [skipped]
       ↓ audioToAudio [skipped]
       ↓ textToSpeech [skipped]
       ↓ imageClassification [skipped]
       ↓ zeroShotImageClassification [skipped]
       ↓ objectDetection [skipped]
       ↓ imageSegmentation [skipped]
       ↓ imageToImage [skipped]
       ↓ imageToImage blob data [skipped]
       ↓ textToImage [skipped]
       ↓ textToImage with parameters [skipped]
       ↓ imageToText [skipped]
       ↓ request - openai-community/gpt2 [skipped]
       ↓ tabularRegression [skipped]
       ↓ tabularClassification [skipped]
       ↓ endpoint - makes request to specified endpoint [skipped]
       ↓ endpoint - makes request to specified endpoint - alternative syntax [skipped]
       ↓ chatCompletion modelId - OpenAI Specs [skipped]
       ↓ chatCompletionStream modelId - OpenAI Specs [skipped]
       ↓ chatCompletionStream modelId Fail - OpenAI Specs [skipped]
       ↓ chatCompletion - OpenAI Specs [skipped]
       ↓ chatCompletionStream - OpenAI Specs [skipped]
       ↓ custom mistral - OpenAI Specs [skipped]
       ↓ custom openai - OpenAI Specs [skipped]
       ↓ OpenAI client side routing - model should have provider as prefix [skipped]
     ↓ Fal AI (4) [skipped]
       ↓ textToImage - black-forest-labs/FLUX.1-schnell [skipped]
       ↓ textToImage - SD LoRAs [skipped]
       ↓ textToImage - Flux LoRAs [skipped]
       ↓ automaticSpeechRecognition - openai/whisper-large-v3 [skipped]
     ↓ Featherless (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textGeneration [skipped]
     ↓ Replicate (10) [skipped]
       ↓ textToImage canonical - black-forest-labs/FLUX.1-schnell [skipped]
       ↓ textToImage canonical - black-forest-labs/FLUX.1-dev [skipped]
       ↓ textToImage canonical - stabilityai/stable-diffusion-3.5-large-turbo [skipped]
       ↓ textToImage versioned - ByteDance/SDXL-Lightning [skipped]
       ↓ textToImage versioned - ByteDance/Hyper-SD [skipped]
       ↓ textToImage versioned - playgroundai/playground-v2.5-1024px-aesthetic [skipped]
       ↓ textToImage versioned - stabilityai/stable-diffusion-xl-base-1.0 [skipped]
       ↓ textToSpeech versioned [skipped]
       ↓ textToSpeech OuteTTS -  usually Cold [skipped]
       ↓ textToSpeech Kokoro [skipped]
     ↓ SambaNova (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ featureExtraction [skipped]
     ↓ Together (4) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
       ↓ textGeneration [skipped]
     ↓ Nebius (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
     ↓ 3rd party providers (1) [skipped]
       ↓ chatCompletion - fails with unsupported model [skipped]
     ↓ Fireworks (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Hyperbolic (4) [skipped]
       ↓ chatCompletion - hyperbolic [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
       ↓ textGeneration [skipped]
     ↓ Novita (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Black Forest Labs (2) [skipped]
       ↓ textToImage [skipped]
       ↓ textToImage URL [skipped]
     ↓ Cohere (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Cerebras (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ Nscale (3) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textToImage [skipped]
     ↓ Groq (2) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
     ↓ OVHcloud (4) [skipped]
       ↓ chatCompletion [skipped]
       ↓ chatCompletion stream [skipped]
       ↓ textGeneration [skipped]
       ↓ textGeneration stream [skipped]
     ✓ Wavespeed AI (5) 89033ms
       ✓ textToImage - wavespeed-ai/flux-schnell 89032ms
       ✓ textToImage - wavespeed-ai/flux-dev-lora 12369ms
       ✓ textToImage - wavespeed-ai/flux-dev-lora-ultra-fast 17936ms
       ✓ textToVideo - wavespeed-ai/wan-2.1/t2v-480p 79507ms
       ✓ imageToImage - wavespeed-ai/hidream-e1-full 74481ms

 Test Files  1 passed (1)
      Tests  5 passed | 103 skipped (108)
   Start at  14:33:17
   Duration  89.62s (transform 315ms, setup 14ms, collect 368ms, tests 89.03s, environment 0ms, prepare 74ms)

Contributor

@SBrandeis SBrandeis left a comment

Hello, thank you for your contribution!
The code is of great quality overall - I left a few comments regarding our code style.
Please make sure the client can be used to query your API for all supported tasks, and that the payloads match your API.
Thanks again!

Comment on lines +101 to 102
- [HF Inference API (serverless)](https://huggingface.co/models?inference=warm&sort=trending)

Contributor

Suggested change
- [HF Inference API (serverless)](https://huggingface.co/models?inference=warm&sort=trending)

hfModelId: "wavespeed-ai/wan-2.1/i2v-480p",
providerId: "wavespeed-ai/wan-2.1/i2v-480p",
status: "live",
task: "image-to-video",
Contributor

This task is not supported in the client code - let's remove it for now.

Comment on lines +1 to +13
import { InferenceOutputError } from "../lib/InferenceOutputError";
import { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import {
	TaskProviderHelper,
	TextToImageTaskHelper,
	TextToVideoTaskHelper,
	ImageToImageTaskHelper,
} from "./providerHelper";

Contributor

We use import type when the import is only used as a type

Suggested change
import { InferenceOutputError } from "../lib/InferenceOutputError";
import { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import {
	TaskProviderHelper,
	TextToImageTaskHelper,
	TextToVideoTaskHelper,
	ImageToImageTaskHelper,
} from "./providerHelper";
import { InferenceOutputError } from "../lib/InferenceOutputError";
import type { ImageToImageArgs } from "../tasks";
import type { BodyParams, HeaderParams, RequestArgs, UrlParams } from "../types";
import { delay } from "../utils/delay";
import { omit } from "../utils/omit";
import { base64FromBytes } from "../utils/base64FromBytes";
import type {
	TaskProviderHelper,
	TextToImageTaskHelper,
	TextToVideoTaskHelper,
	ImageToImageTaskHelper,
} from "./providerHelper";

};
}

type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;
Contributor

I'm not sure this type alias is needed, can we remove it?

Suggested change
type WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> = WaveSpeedAICommonResponse<T>;

WaveSpeedAICommonResponse can be renamed to WaveSpeedAIResponse
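A minimal illustration of the rename (hypothetical; the field below is a placeholder, the real fields are the ones currently on WaveSpeedAICommonResponse):

// Rename the generic wrapper directly instead of keeping a separate alias.
interface WaveSpeedAIResponse<T = WaveSpeedAITaskResponse> {
	data: T; // placeholder field
}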

Comment on lines +124 to +133
case "completed": {
// Get the video data from the first output URL
if (!taskResult.outputs?.[0]) {
throw new InferenceOutputError("No video URL in completed response");
}
const videoResponse = await fetch(taskResult.outputs[0]);
if (!videoResponse.ok) {
throw new InferenceOutputError("Failed to fetch video data");
}
return await videoResponse.blob();
Contributor

From what I understand, the payload can be something other than a video (e.g. an image).
Let's update the error message to reflect that.
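A hedged sketch of the adjusted branch (only the error wording changes; the control flow stays as in the PR):

case "completed": {
	// The first output URL can point to an image as well as a video,
	// so keep the error messages task-agnostic.
	if (!taskResult.outputs?.[0]) {
		throw new InferenceOutputError("No output URL in completed response");
	}
	const resultResponse = await fetch(taskResult.outputs[0]);
	if (!resultResponse.ok) {
		throw new InferenceOutputError("Failed to fetch output data");
	}
	return await resultResponse.blob();
}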

Comment on lines +170 to +192
	if (!args.parameters) {
		return {
			...args,
			model: args.model,
			data: args.inputs,
		};
	} else {
		return {
			...args,
			inputs: base64FromBytes(
				new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
			),
		};
	}
}

override preparePayload(params: BodyParams): Record<string, unknown> {
	return {
		...omit(params.args, ["inputs", "parameters"]),
		...(params.args.parameters as Record<string, unknown>),
		image: params.args.inputs,
	};
}
Contributor

I think only one of the two (preparePayload or preparePayloadAsync) should be responsible for building the payload; I'd rather move the rename of inputs to image into preparePayloadAsync and have preparePayload be as dumb as possible.

cc @hanouticelina - would love your opinion on that specific point
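A rough sketch of that split (hypothetical; it reuses the PR's existing helpers and types):

// preparePayloadAsync owns all payload shaping: encode the binary input and
// expose it under `image`, so preparePayload stays a simple pass-through.
override async preparePayloadAsync(args: ImageToImageArgs): Promise<RequestArgs> {
	return {
		...omit(args, ["inputs"]),
		image: base64FromBytes(
			new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
		),
	} as RequestArgs;
}

override preparePayload(params: BodyParams): Record<string, unknown> {
	// "Dumb" payload builder: flatten parameters, pass everything else through.
	return {
		...omit(params.args, ["parameters"]),
		...(params.args.parameters as Record<string, unknown>),
	};
}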

Comment on lines +179 to +181
inputs: base64FromBytes(
new Uint8Array(args.inputs instanceof ArrayBuffer ? args.inputs : await (args.inputs as Blob).arrayBuffer())
),
Contributor

Does the wavespeed API support base64-encoded images as inputs?

@hanouticelina hanouticelina added the inference-providers label (integration of a new or existing Inference Provider) on May 20, 2025
Labels
inference-providers (integration of a new or existing Inference Provider)
Projects
None yet
Development


3 participants