Available models
The following models are currently available:Model | Description | Endpoint URL | Type | Price |
---|---|---|---|---|
Qwen3 32B AWQ | The latest LLM in the Qwen series, offering advancements in reasoning, instruction-following, agent capabilities, and multilingual support. | https://api.runpod.ai/v2/qwen3-32b-awq/ | Text | $0.01 per 1000 tokens |
Flux Dev | Offers exceptional prompt adherence, high visual fidelity, and rich image detail. | https://api.runpod.ai/v2/black-forest-labs-flux-1-dev/ | Image | $.02 per megapixel |
Flux Schnell | Fastest and most lightweight FLUX model, ideal for local development, prototyping, and personal use. | https://api.runpod.ai/v2/black-forest-labs-flux-1-schnell/ | Image | $.0024 per megapixel |
Flux Kontext Dev | A 12 billion parameter rectified flow transformer capable of editing images based on text instructions. | https://api.runpod.ai/v2/black-forest-labs-flux-1-kontext-dev/ | Image | $0.03 per megapixel |
Qwen Image | Image generation foundation model with advanced text rendering. | https://api.runpod.ai/v2/qwen-image-t2i/ | Image | $0.02 per megapixel |
Qwen Image LoRA | Image generation with LoRA support and advanced text rendering. | https://api.runpod.ai/v2/qwen-image-t2i-lora/ | Image | $0.02 per megapixel |
Qwen Image Edit | Image editing with unique text rendering capabilities. | https://api.runpod.ai/v2/qwen-image-edit/ | Image | $0.02 per megapixel |
Seedream 4.0 T2I | New-generation image creation with unified generation and editing architecture. | https://api.runpod.ai/v2/seedream-v4-t2i/ | Image | $0.027 per megapixel |
Seedream 4.0 Edit | New-generation image editing with unified generation and editing architecture. | https://api.runpod.ai/v2/seedream-v4-edit/ | Image | $0.027 per megapixel |
Seedream 3.0 | Native high-resolution bilingual image generation (Chinese-English). | https://api.runpod.ai/v2/seedream-3-0-t2i/ | Image | $0.03 per megapixel |
Nano Banana Edit | Google’s state-of-the-art image editing model. | https://api.runpod.ai/v2/nano-banana-edit/ | Image | $0.027 per megapixel |
Seedance 1.0 Pro | High-performance video generation with multi-shot storytelling. | https://api.runpod.ai/v2/seedance-1-0-pro/ | Video | $0.62 per 5 seconds of video |
WAN 2.5 | Image-to-video generation model. | https://api.runpod.ai/v2/wan-2-5/ | Video | $0.50 per 5 seconds of video |
WAN 2.2 I2V 720p LoRA | Open-source video generation with LoRA support. | https://api.runpod.ai/v2/wan-2-2-t2v-720-lora/ | Video | $0.35 per 5 seconds of video |
WAN 2.2 I2V 720p | Open-source AI video generation model that uses a diffusion transformer architecture for image-to-video generation. | https://api.runpod.ai/v2/wan-2-2-i2v-720/ | Video | $0.30 per 5 seconds of video |
WAN 2.2 T2V 720p | Open-source AI video generation model that uses a diffusion transformer architecture for text-to-video generation. | https://api.runpod.ai/v2/wan-2-2-t2v-720/ | Video | $0.30 per 5 seconds of video |
WAN 2.1 I2V 720p | Open-source AI video generation model that uses a diffusion transformer architecture for image-to-video generation. | https://api.runpod.ai/v2/wan-2-1-i2v-720/ | Video | $0.30 per 5 seconds of video |
WAN 2.1 T2V 720p | Open-source AI video generation model that uses a diffusion transformer architecture for text-to-video generation. | https://api.runpod.ai/v2/wan-2-1-t2v-720/ | Video | $0.30 per 5 seconds of video |
Kling v2.1 I2V Pro | Professional-grade image-to-video with enhanced visual fidelity. | https://api.runpod.ai/v2/kling-v2-1-i2v-pro/ | Video | $0.36 per 5 seconds of video |
Whisper V3 Large | State-of-the-art automatic speech recognition. | https://api.runpod.ai/v2/whisper-v3-large/ | Audio | $0.05 per 1000 characters of audio transcribed |
Minimax Speech 02 HD | High-definition text-to-speech model. | https://api.runpod.ai/v2/minimax-speech-02-hd/ | Audio | $0.05 per 1000 characters of audio generated |
Model-specific parameters
Each Public Endpoint accepts a different set of parameters to control the generation process.Flux Dev
Flux Dev is optimized for high-quality, detailed image generation. The model accepts several parameters to control the generation process:Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired image. |
negative_prompt | string | No | - | - | Elements to exclude from the image. |
width | integer | No | 1024 | 256-1536 | Image width in pixels. Must be divisible by 64. |
height | integer | No | 1024 | 256-1536 | Image height in pixels. Must be divisible by 64. |
num_inference_steps | integer | No | 28 | 1-50 | Number of denoising steps. |
guidance | float | No | 7.5 | 0.0-10.0 | How closely to follow the prompt. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
image_format | string | No | ”jpeg" | "png” or “jpeg” | Output format. |
Flux Schnell
Flux Schnell is optimized for speed and real-time applications:Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired image. |
negative_prompt | string | No | - | - | Elements to exclude from the image. |
width | integer | No | 1024 | 256-1536 | Image width in pixels. Must be divisible by 64. |
height | integer | No | 1024 | 256-1536 | Image height in pixels. Must be divisible by 64. |
num_inference_steps | integer | No | 4 | 1-8 | Number of denoising steps. |
guidance | float | No | 7.5 | 0.0-10.0 | How closely to follow the prompt. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
image_format | string | No | ”jpeg" | "png” or “jpeg” | Output format. |
Flux Schnell is optimized for speed and works best with lower step counts. Using higher values may not improve quality significantly.
Qwen3 32B AWQ
Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Prompt for text generation. |
max_tokens | integer | No | 512 | - | Maximum number of tokens to output. |
temperature | float | No | 0.7 | 0.0 - 1.0 | Randomness of the output. Lower temperature makes the output more predictable and deterministic. |
top_p | integer | No | - | Samples from the smallest set of words whose cumulative probability exceeds a given threshold (P). | |
top_k | integer | No | - | 1-8 | Restricts sampling to the top K most probable words. |
stop | string | No | - | - | Stops generation if the given string is encountered. |
OpenAI API request example
OpenAI API request example
OpenAI API streaming example
OpenAI API streaming example
You can stream responses from the OpenAI API using the
stream
and stream_options
parameters:stream_options={"include_usage": True}
is required for streaming to work with vLLM Public Endpoints.Response format
Response format
Qwen Image
Qwen Image is an image generation foundation model with advanced text rendering capabilities.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired image. |
negative_prompt | string | No | - | Elements to exclude from the image. |
size | string | No | ”1024*1024” | Image dimensions. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Qwen Image LoRA
Qwen Image with LoRA support allows you to customize generation with fine-tuned LoRA models.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired image. |
loras | array | No | [] | Array of LoRA configurations to apply. |
loras[].path | string | Yes | - | URL or path to the LoRA model file. |
loras[].scale | number | Yes | - | Scale factor for the LoRA influence (typically 0-1). |
size | string | No | ”1024*1024” | Image dimensions. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Seedream 3.0
Seedream 3.0 is a native high-resolution bilingual image generation model supporting both Chinese and English prompts.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired image. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
guidance | number | No | 2 | Guidance scale for generation control. |
size | string | No | ”1024x1024” | Image dimensions. |
Seedream 4.0 T2I
Seedream 4.0 is a new-generation image creation model that integrates both generation and editing capabilities.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired image. |
negative_prompt | string | No | - | Elements to exclude from the image. |
size | string | No | ”1024*1024” | Image dimensions. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Nano Banana Edit
Google’s Nano Banana Edit is a state-of-the-art image editing model that combines multiple source images.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Editing instructions describing the desired transformation. |
images | array | Yes | - | Array of image URLs to edit or combine. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Qwen Image Edit
Qwen Image Edit extends the text rendering capabilities to image editing tasks, enabling precise text editing.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Editing instructions describing the desired changes. |
negative_prompt | string | No | - | Elements to exclude from the edited image. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
image | string | Yes | - | URL of the image to edit. |
output_format | string | No | ”jpeg” | Output format (“png” or “jpeg”). |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Seedream 4.0 Edit
Seedream 4.0 Edit provides advanced image editing capabilities with the same unified architecture as Seedream 4.0 T2I.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Editing instructions describing the desired transformation. |
images | array | Yes | - | Array of image URLs to edit or combine. |
size | string | No | ”1024*1024” | Output image dimensions. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Kling v2.1 I2V Pro
Kling 2.1 Pro generates videos from static images with additional control parameters.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired video. |
image | string | Yes | - | URL of the source image to animate. |
negative_prompt | string | No | - | Elements to exclude from the video. |
guidance_scale | float | No | 0.5 | How closely to follow the prompt. |
duration | integer | No | 5 | Video duration in seconds. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Seedance 1.0 Pro
Seedance 1.0 Pro is a high-performance video generation model with multi-shot storytelling capabilities.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired video scene. |
duration | integer | No | 5 | Video duration in seconds. |
fps | integer | No | 24 | Frames per second for the output video. |
size | string | No | ”1920x1080” | Video dimensions. |
image | string | No | "" | Optional source image URL for image-to-video generation. |
Whisper V3 Large
Whisper V3 Large is a state-of-the-art automatic speech recognition model that transcribes audio to text.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | No | "" | Optional context or prompt to guide transcription. |
audio | string | Yes | - | URL of the audio file to transcribe. |
Minimax Speech 02 HD
Minimax Speech 02 HD is a high-definition text-to-speech model with emotional control and voice customization.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text to convert to speech. |
voice_id | string | No | ”Wise_Woman” | Voice identifier for the desired voice. |
speed | number | No | 1 | Speech speed multiplier. |
volume | number | No | 1 | Volume level. |
pitch | number | No | 0 | Pitch adjustment. |
emotion | string | No | ”neutral” | Emotion to convey (e.g., “happy”, “sad”). |
english_normalization | boolean | No | false | Enable English text normalization. |
default_audio_url | string | No | "" | Fallback audio URL. |
Flux Kontext Dev
A 12 billion parameter model for editing images based on text instructions.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text instructions describing the desired edits to the image. |
negative_prompt | string | No | "" | - | Elements to exclude from the edited image. |
image | string | Yes | - | - | URL of the input image to edit. |
size | string | No | ”1024*1024” | - | Output image size in format “width*height”. |
num_inference_steps | integer | No | 28 | 1-50 | Number of denoising steps. |
guidance | float | No | 2 | 0.0-10.0 | How closely to follow the prompt. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
output_format | string | No | ”png" | "png” or “jpeg” | Output image format. |
enable_safety_checker | boolean | No | true | - | Whether to run safety checks on the output. |
WAN 2.5
WAN 2.5 generates videos from static images.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired video. |
image | string | Yes | - | URL of the source image to animate. |
negative_prompt | string | No | - | Elements to exclude from the video. |
size | string | No | ”1280*720” | Video dimensions. |
duration | integer | No | 5 | Video duration in seconds. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
enable_prompt_expansion | boolean | No | false | Automatically expand and enhance the prompt. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Wan 2.2 I2V 720p LoRA
Wan 2.2 is an open-source video generation model with LoRA support for customized camera movements and effects.Parameter | Type | Required | Default | Description |
---|---|---|---|---|
prompt | string | Yes | - | Text description of the desired video motion. |
image | string | Yes | - | URL of the source image to animate. |
high_noise_loras | array | No | [] | LoRA configurations for high-noise stages. |
high_noise_loras[].path | string | Yes | - | URL or path to the LoRA model file. |
high_noise_loras[].scale | number | Yes | - | Scale factor for the LoRA influence. |
low_noise_loras | array | No | [] | LoRA configurations for low-noise stages. |
low_noise_loras[].path | string | Yes | - | URL or path to the LoRA model file. |
low_noise_loras[].scale | number | Yes | - | Scale factor for the LoRA influence. |
duration | integer | No | 5 | Video duration in seconds. |
seed | integer | No | -1 | Seed for reproducible results. -1 generates a random seed. |
enable_safety_checker | boolean | No | true | Enable content safety checking. |
Wan 2.2 I2V 720p
An open-source image-to-video generation model that creates 720p video content from static images.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired video motion and content. |
image | string | Yes | - | - | URL of the input image to animate. |
negative_prompt | string | No | "" | - | Elements to exclude from the generated video. |
size | string | No | ”1280*720” | - | Video resolution in format “width*height”. |
num_inference_steps | integer | No | 30 | 1-50 | Number of denoising steps. |
guidance | float | No | 5 | 0.0-10.0 | How closely to follow the prompt. |
duration | integer | No | 5 | - | Video duration in seconds. |
flow_shift | integer | No | 5 | - | Controls the motion flow in the generated video. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
enable_prompt_optimization | boolean | No | false | - | Whether to automatically optimize the prompt. |
enable_safety_checker | boolean | No | true | - | Whether to run safety checks on the output. |
Wan 2.2 T2V 720p
Open-source model for generating 720p videos from text prompts.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired video content. |
negative_prompt | string | No | "" | - | Elements to exclude from the generated video. |
size | string | No | ”1280*720” | - | Video resolution in format “width*height”. |
num_inference_steps | integer | No | 30 | 1-50 | Number of denoising steps. |
guidance | float | No | 5 | 0.0-10.0 | How closely to follow the prompt. |
duration | integer | No | 5 | - | Video duration in seconds. |
flow_shift | integer | No | 5 | - | Controls the motion flow in the generated video. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
enable_prompt_optimization | boolean | No | false | - | Whether to automatically optimize the prompt. |
enable_safety_checker | boolean | No | true | - | Whether to run safety checks on the output. |
Wan 2.1 I2V 720p
Open-source image-to-video generation model that converts static images into 720p videos.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired video motion and content. |
image | string | Yes | - | - | URL of the input image to animate. |
negative_prompt | string | No | "" | - | Elements to exclude from the generated video. |
size | string | No | ”1280*720” | - | Video resolution in format “width*height”. |
num_inference_steps | integer | No | 30 | 1-50 | Number of denoising steps. |
guidance | float | No | 5 | 0.0-10.0 | How closely to follow the prompt. |
duration | integer | No | 5 | - | Video duration in seconds. |
flow_shift | integer | No | 5 | - | Controls the motion flow in the generated video. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
enable_prompt_optimization | boolean | No | false | - | Whether to automatically optimize the prompt. |
enable_safety_checker | boolean | No | true | - | Whether to run safety checks on the output. |
Wan 2.1 T2V 720p
An open-source video generation model for creating 720p videos from text prompts.Parameter | Type | Required | Default | Range | Description |
---|---|---|---|---|---|
prompt | string | Yes | - | - | Text description of the desired video content. |
negative_prompt | string | No | "" | - | Elements to exclude from the generated video. |
size | string | No | ”1280*720” | - | Video resolution in format “width*height”. |
num_inference_steps | integer | No | 30 | 1-50 | Number of denoising steps. |
guidance | float | No | 5 | 0.0-10.0 | How closely to follow the prompt. |
duration | integer | No | 5 | - | Video duration in seconds. |
flow_shift | integer | No | 5 | - | Controls the motion flow in the generated video. |
seed | integer | No | -1 | - | Provide a seed for reproducible results. The default value (-1) will generate a random seed. |
enable_prompt_optimization | boolean | No | false | - | Whether to automatically optimize the prompt. |
enable_safety_checker | boolean | No | true | - | Whether to run safety checks on the output. |