Runpod is a cloud computing platform built for AI, machine learning, and general compute needs. Whether you're training or fine-tuning AI models, or deploying cloud-based applications, Runpod provides scalable, high-performance GPU and CPU resources to power your workloads.

Access GPUs instantly

Quickstart

Create an account, deploy your first GPU Pod, and use it to execute code.

Create an API key

Create API keys to manage your access to Runpod resources.

Concepts

Learn about the key concepts and terminology for the Runpod platform.

Flash (Beta)

Run Python functions on remote GPUs directly from your local terminal.

Serverless

Pay-per-second computing with automatic scaling for production AI/ML apps.

Pods

Dedicated GPU or CPU instances for containerized AI/ML workloads.

Use our model endpoints

Runpod offers Public Endpoints for instant API access to pre-deployed AI models for image, video, audio, and text generation. No deployment or infrastructure required—just create an API key and make a request:
import requests

response = requests.post(
    "https://api.runpod.ai/v2/black-forest-labs-flux-1-schnell/runsync",
    headers={
        "Authorization": "Bearer YOUR_API_KEY", # Replace YOUR_API_KEY with your actual API key
        "Content-Type": "application/json"
    },
    json={
        "input": {
            "prompt": "A beautiful sunset over mountains", # Customize your prompt
            "width": 1024,
            "height": 1024
        }
    }
)

result = response.json()
print(result["output"]["image_url"])

For a list of available models, see the model reference.
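In a script, you may want to build and validate the request body before sending it. A minimal sketch of such a helper (the function name and the validation rules are our own illustration, not part of any Runpod SDK):

```python
def build_flux_payload(prompt, width=1024, height=1024):
    """Build the JSON request body for the flux-1-schnell endpoint.

    The bounds checked here are illustrative sanity checks,
    not official limits of the endpoint.
    """
    if not prompt:
        raise ValueError("prompt must be non-empty")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {"input": {"prompt": prompt, "width": width, "height": height}}


# Pass the result as the json= argument of requests.post, as in the example above:
payload = build_flux_payload("A beautiful sunset over mountains")
```

Keeping payload construction in one place makes it easy to reuse across endpoints and to catch malformed requests before they cost an API call.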

Guides and examples

Generate images with ComfyUI

Deploy a dedicated GPU with ComfyUI pre-installed and start generating images.

Generate images at scale

Build a ComfyUI worker and deploy it as a Serverless endpoint.

Generate images with Flash scripts

Use a hybrid local/remote script to generate images with SDXL.

Text-to-video pipeline

Create a multi-model pipeline for video generation.

Build a load balancing API

Create a REST API with automatic load balancing using Flash.

Deploy vLLM for text generation

Deploy a large language model in minutes using vLLM on Serverless.

High-performance clusters

Create a multi-node Instant Cluster for fully managed distributed GPU computing with high-speed networking between nodes.

Overview

Learn how Instant Clusters work and when to use them.

Deploy a Slurm cluster

Set up managed Slurm for HPC workloads.

Deploy a PyTorch cluster

Run distributed PyTorch training across multiple nodes.

Support

Contact

Submit a support request using our contact page.

Status page

Check the status of Runpod services and infrastructure.

Discord

Join the Runpod community on Discord.