Runpod Public Endpoints provide instant access to state-of-the-art AI models through simple API calls. Generate images, videos, audio, and text without deploying infrastructure or managing GPU resources.
Public Endpoints are pre-deployed models hosted by Runpod. If you want to deploy your own models or custom code, use Runpod Serverless.

Why use Public Endpoints?

  • No deployment required. Start generating immediately with a single API call. No containers, GPUs, or infrastructure to configure.
  • Production-ready models. Access optimized versions of Flux, Whisper, Qwen, and other popular models, tuned for performance.
  • Pay per use. Pay only for what you generate, with transparent per-megapixel, per-second, or per-token pricing.
  • Simple integration. Standard REST API with OpenAI-compatible endpoints for LLMs. Works with any HTTP client or SDK.
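Because the API is plain REST, "works with any HTTP client" really does mean building a request by hand if you want to. The sketch below assembles an OpenAI-style chat-completions request; note that the base URL shape and model id here are illustrative assumptions, not documented values, so confirm the real ones in the model reference:

```python
import json

# Illustrative only: confirm the real base URL and model id in the
# Runpod model reference before sending live requests.
BASE_URL = "https://api.runpod.ai/v2/<endpoint-id>/openai/v1"  # assumed shape

def build_chat_request(api_key: str, model: str, prompt: str):
    """Return (url, headers, body) for an OpenAI-style chat completion call."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body
```

Any HTTP client (requests, fetch, curl) can then POST that body with those headers, and OpenAI-compatible SDKs can skip this entirely by pointing their base URL at the endpoint.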

When to use Public Endpoints

Public Endpoints are ideal when you want to use popular AI models without managing infrastructure. Choose Public Endpoints when:
  • You need quick access to standard models. Generate images with Flux, transcribe audio with Whisper, or chat with Qwen without setup.
  • You want predictable pricing. Pay-per-output pricing makes costs easy to estimate and budget.
  • You’re prototyping or building MVPs. Test ideas quickly before committing to custom infrastructure.
Consider Runpod Serverless instead if you need custom models, specialized preprocessing, or full control over your inference environment.

How it works

When you call a Public Endpoint, Runpod routes your request to a pre-deployed model running on optimized GPU infrastructure. The model processes your input and returns the result.
Public Endpoints support two request modes:
  • Synchronous (/runsync): Wait for the result and receive it in the response. Best for quick generations.
  • Asynchronous (/run): Receive a job ID immediately and poll for results. Best for longer generations or batch processing.
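The asynchronous flow can be sketched with nothing but the standard library. This is a hedged illustration, assuming the common Runpod route shape (`/run` and `/status/{id}` under an endpoint base URL) and typical job status names; the helper names and polling interval are my own, and a live call needs a real endpoint ID and API key:

```python
import json
import time
import urllib.request

BASE = "https://api.runpod.ai/v2/<endpoint-id>"  # assumed base URL shape

def submit_async(api_key: str, payload: dict) -> str:
    """POST the input to /run and return the job ID immediately."""
    req = urllib.request.Request(
        f"{BASE}/run",
        data=json.dumps({"input": payload}).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]

def is_terminal(status: str) -> bool:
    """Stop polling once the job has finished, one way or another."""
    return status in {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}

def wait_for_result(api_key: str, job_id: str, interval: float = 2.0) -> dict:
    """Poll /status/{job_id} until the job reaches a terminal state."""
    while True:
        req = urllib.request.Request(
            f"{BASE}/status/{job_id}",
            headers={"Authorization": f"Bearer {api_key}"},
        )
        with urllib.request.urlopen(req) as resp:
            job = json.load(resp)
        if is_terminal(job["status"]):
            return job
        time.sleep(interval)
```

The synchronous mode collapses this into one call: POST the same `{"input": ...}` body to `/runsync` and the response already contains the result.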
For JavaScript and TypeScript projects, the @runpod/ai-sdk-provider package integrates Public Endpoints with the Vercel AI SDK, providing a streamlined interface for text generation, streaming, and image generation.

Available model types

Public Endpoints offer models across four categories:
  • Image (Flux Dev, Flux Schnell, Qwen Image, Seedream): text-to-image generation, image editing
  • Video (WAN 2.5, Kling, Seedance, SORA 2): image-to-video, text-to-video generation
  • Audio (Whisper V3, Minimax Speech): speech-to-text transcription, text-to-speech
  • Text (Qwen3 32B, IBM Granite): chat, code generation, text completion
For a complete list of models with endpoint URLs and parameters, see the model reference.

Pricing

Public Endpoints use transparent, usage-based pricing:
  • Image generation with Flux Dev: $0.02 per megapixel
  • Image generation with Flux Schnell: $0.0024 per megapixel
  • Video generation with WAN 2.5: $0.50 per 5 seconds
  • Audio transcription with Whisper V3: $0.05 per 1,000 characters
  • Text generation with Qwen3 32B: $10.00 per 1M tokens
Pricing is calculated based on actual output. You will not be charged for failed generations.
Example cost calculations for image generation:
  • 512x512 image (0.26 MP) with Flux Dev: ~$0.005
  • 1024x1024 image (1.05 MP) with Flux Dev: ~$0.021
  • 1024x1024 image (1.05 MP) with Flux Schnell: ~$0.0025
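The per-megapixel arithmetic behind those estimates is easy to reproduce: output pixels divided by one million, times the rate. A quick sketch (rates copied from the table above; treat them as examples, since current pricing lives in the model reference):

```python
# Example rates in dollars per megapixel, copied from the pricing table;
# check the model reference for current values.
PRICE_PER_MEGAPIXEL = {
    "flux-dev": 0.02,
    "flux-schnell": 0.0024,
}

def image_cost(width: int, height: int, model: str) -> float:
    """Cost of one generation: output pixels as megapixels, times the rate."""
    megapixels = (width * height) / 1_000_000
    return megapixels * PRICE_PER_MEGAPIXEL[model]

print(round(image_cost(512, 512, "flux-dev"), 4))        # 0.0052
print(round(image_cost(1024, 1024, "flux-dev"), 4))      # 0.021
print(round(image_cost(1024, 1024, "flux-schnell"), 4))  # 0.0025
```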
For complete pricing information, see the model reference.