# Product updates

> New features, fixes, and improvements for the Runpod platform.

<Update label="March 2026">
  ## Flash beta: Run Python functions on cloud GPUs

  [Flash](/flash/overview) is now in public beta. Flash is a Python SDK that lets you run functions on Runpod Serverless GPUs with a single decorator:

  ```python
  import asyncio

  from runpod_flash import Endpoint, GpuType

  @Endpoint(
      name="hello-gpu",
      gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
      dependencies=["torch"],
  )
  async def hello():  # This function runs on Runpod
      import torch
      gpu_name = torch.cuda.get_device_name(0)
      print(f"Hello from your GPU! ({gpu_name})")
      return {"gpu": gpu_name}

  asyncio.run(hello())
  print("Done!")  # This runs locally
  ```

  **Key features:**

  * **Remote execution**: Mark functions with `@Endpoint` to run on GPUs/CPUs automatically.
  * **Auto-scaling**: Workers scale from 0 to N based on demand.
  * **Dependency management**: Packages install automatically on remote workers.
  * **Two patterns**: Queue-based endpoints for batch work, load-balanced endpoints for REST APIs.
  * **Flash apps**: Build production-ready APIs with `flash init`, `flash dev`, and `flash deploy`.

  **Get started:**

  <CardGroup cols={2}>
    <Card title="Overview" href="/flash/overview" icon="book">
      Learn more about Flash.
    </Card>

    <Card title="Quickstart" href="/flash/quickstart" icon="bolt">
      Run your first GPU workload in 5 minutes.
    </Card>

    <Card title="Create endpoints" href="/flash/create-endpoints" icon="code">
      Learn queue-based and load-balanced patterns.
    </Card>

    <Card title="Flash CLI" href="/flash/cli/overview" icon="terminal">
      Development and deployment commands.
    </Card>
  </CardGroup>

  ## Flash: Multi-datacenter deployments

  Flash now supports deploying endpoints to [multiple datacenters](/flash/configuration/parameters#datacenter) simultaneously. Pass a list of datacenters to distribute your workload across regions for improved availability and reduced latency. You can also attach [network volumes per datacenter](/flash/configuration/storage#multi-datacenter-volumes) for region-specific data access.
</Update>

<Update label="February 2026">
  ## New Public Endpoints and expanded examples

  **[New Public Endpoints](/public-endpoints/reference):** Expansion of available models across all categories.

  * **Video:** [SORA 2](/public-endpoints/models/sora-2) and [SORA 2 Pro](/public-endpoints/models/sora-2-pro), [Kling v2.1](/public-endpoints/models/kling-v2-1) and [v2.6 Motion Control](/public-endpoints/models/kling-v2-6-motion-control), [WAN 2.6](/public-endpoints/models/wan-2-6-t2v).
  * **Image:** [Seedream 4.0](/public-endpoints/models/seedream-4-t2i).
  * **Text:** [Qwen3 32B](/public-endpoints/models/qwen3-32b), [IBM Granite 4.0](/public-endpoints/models/granite-4).
  * **Audio:** [Chatterbox Turbo](/public-endpoints/models/chatterbox-turbo) for text-to-speech.

  **New integrations and guides:**

  * [Vercel AI SDK integration](/public-endpoints/ai-sdk): New `@runpod/ai-sdk-provider` package for TypeScript projects with streaming, text generation, and image generation support.
  * [AI coding tools guide](/public-endpoints/ai-coding-tools): Configure OpenCode, Cursor, and Cline to use Runpod Public Endpoints as your model provider.

  **[New tutorials](/tutorials/introduction/overview):**

  * [Build a text-to-video pipeline](/tutorials/public-endpoints/text-to-video-pipeline): Chain multiple Public Endpoints to generate videos from text prompts.
  * [Deploy cached models](/tutorials/serverless/model-caching-text): Reduce cold start times with model caching.
  * [Integrate Serverless with web applications](/tutorials/serverless/generate-sdxl-turbo): Build a complete image generation app.
  * [Build a chatbot with Gemma 3](/tutorials/serverless/run-gemma-7b): Deploy vLLM with OpenAI API compatibility.
  * [Run Ollama on Pods](/tutorials/pods/run-ollama): Set up Ollama for LLM inference.
  * [Build Docker images with Bazel](/tutorials/pods/build-docker-images): Containerize your applications.
</Update>

<Update label="January 2026">
  ## GitHub release rollback GA and load balancing Serverless repos in beta

  * [GitHub release rollback](/serverless/workers/github-integration#roll-back-to-a-previous-build): Roll back your Serverless endpoint to any previous build from the console. Restore an earlier version when you encounter issues without waiting for a new GitHub release.
  * [Load balancing Serverless repos (beta)](/hub/publishing-guide): Load balancing endpoints are now available in the Hub. Publish or convert any listing to load balancer type by setting `"endpointType": "LB"` in your `hub.json` file, then deploy as a Serverless endpoint or Pod from the Hub page. Maintain a single listing for your model and let users choose their deployment method: autoscaling Serverless or dedicated Pod resources.
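
  The `endpointType` change described above can be scripted. The following is a minimal sketch, not part of the Hub tooling: the `set_endpoint_type` helper name is hypothetical, and the file location you pass in (for example `.runpod/hub.json`) is an assumption to check against your own repo. Only the `"endpointType"` key and the `"LB"` value come from this update; all other fields are left untouched.

  ```python
  import json
  from pathlib import Path


  def set_endpoint_type(hub_json_path, endpoint_type="LB"):
      """Set "endpointType" in a hub.json file and return the updated config.

      Hypothetical helper for illustration; only the key name and the "LB"
      value are taken from the release note.
      """
      path = Path(hub_json_path)
      config = json.loads(path.read_text(encoding="utf-8"))
      config["endpointType"] = endpoint_type  # existing keys are preserved
      path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
      return config
  ```

  Calling `set_endpoint_type(".runpod/hub.json")` rewrites the file in place, so commit the result before publishing a new release.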
</Update>

<Update label="December 2025">
  ## Pod migration in beta and Serverless development guides

  * [Pod migration (beta)](/references/troubleshooting/pod-migration): Migrate your Pod to a new machine when the GPU attached to your stopped Pod is occupied. Migration provisions a new Pod with the same specifications on an available machine and automatically transfers your data to it.
  * [New Serverless development guides](/serverless/overview): We've added a comprehensive new set of guides for developing, testing, and debugging Serverless endpoints.
</Update>

<Update label="September 2025">
  ## Slurm Clusters GA, cached models in beta, and new Public Endpoints available

  * [Slurm Clusters are now generally available](/instant-clusters/slurm-clusters): Deploy production-ready HPC clusters in seconds. These clusters support multi-node performance for distributed training and large-scale simulations with pay-as-you-go billing and no idle costs.
  * [Cached models are now in beta](/serverless/endpoints/model-caching): Eliminate model download times when starting workers. The system places cached models on host machines before workers start, prioritizing hosts with your model already available for instant startup.
  * [New Public Endpoints available](/public-endpoints/overview): [WAN 2.5](/public-endpoints/models/wan-2-5) combines image and audio to create lifelike videos, while [Nano Banana](/public-endpoints/models/nano-banana-edit) merges multiple images for composite creations.
</Update>

<Update label="August 2025">
  ## Hub revenue sharing launches and Pods UI gets refreshed

  * [Hub revenue share model](/hub/revenue-sharing): Publish to the Runpod Hub and earn credits when others deploy your repo. Earn up to 7% of compute revenue through monthly tiers with credits auto-deposited into your account.
  * [Pods UI updated](/pods/overview): Refreshed modern interface for interacting with Runpod Pods.
</Update>

<Update label="July 2025">
  ## Public Endpoints arrive, Slurm Clusters in beta

  * [Public Endpoints](/public-endpoints/overview): Access state-of-the-art AI models through simple API calls with an integrated playground. Available endpoints include [Qwen Image Edit](/public-endpoints/models/qwen-image-edit), [Flux Kontext](/public-endpoints/models/flux-kontext-dev), [Cogito 671B](/public-endpoints/models/cogito-671b), and [Minimax Speech](/public-endpoints/models/minimax-speech).
  * [Slurm Clusters (beta)](/instant-clusters/slurm-clusters): Create on-demand multi-node clusters instantly with full Slurm scheduling support.
</Update>

<Update label="June 2025">
  ## S3-compatible storage and updated referral program

  * [S3-compatible API for network volumes](/storage/s3-api): Upload and retrieve files from your network volumes without compute using AWS S3 CLI or Boto3. Integrate Runpod storage into any AI pipeline with zero-config ease and object-level control.
  * [Referral program revamp](/references/referrals): Updated rewards and tiers with clearer dashboards to track performance.
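
  For the S3-compatible API above, the Boto3 route might look like the sketch below. It builds a standard Boto3 S3 client pointed at a custom endpoint and uploads one file. The endpoint URL, region, and credentials are placeholders that you must take from the [S3-compatible API](/storage/s3-api) documentation, and treating the network volume ID as the bucket name is an assumption. Both helper names are hypothetical.

  ```python
  def s3_client_kwargs(endpoint_url, region, access_key, secret_key):
      """Collect the keyword arguments passed to boto3.client("s3", ...)."""
      return {
          "endpoint_url": endpoint_url,
          "region_name": region,
          "aws_access_key_id": access_key,
          "aws_secret_access_key": secret_key,
      }


  def upload_to_volume(local_path, volume_id, key, **client_kwargs):
      """Upload one local file to a network volume.

      Assumption: the network volume ID is used as the S3 bucket name.
      """
      # Third-party dependency (pip install boto3), imported here so that
      # s3_client_kwargs stays usable without it.
      import boto3

      client = boto3.client("s3", **client_kwargs)
      client.upload_file(local_path, volume_id, key)
  ```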
</Update>

<Update label="May 2025">
  ## Port labeling, price drops, Runpod Hub, and Tetra beta test

  * [Port labeling](/pods/overview): Name exposed ports in the UI and API to help team members identify services like Jupyter or TensorBoard.
  * [Price drops](/pods/pricing): Additional price reductions on popular GPU SKUs to lower training and inference costs.
  * [Runpod Hub](/hub/overview): A curated catalog of one-click endpoints and templates for deploying community projects without starting from scratch.
  * **Tetra beta test**: A Python library for running code on GPU with Runpod. Add a `@remote()` decorator to functions that need GPU power while the rest of your code runs locally.
</Update>

<Update label="April 2025">
  ## GitHub login, RTX 5090s, and global networking expansion

  * **Login with GitHub**: OAuth sign-in and linking for faster onboarding and repo-driven workflows.
  * **RTX 5090s on Runpod**: High-performance RTX 5090 availability for cost-efficient training and inference.
  * [Global networking expansion](/pods/networking): Rollout to additional data centers approaching full global coverage.
</Update>

<Update label="March 2025">
  ## Enterprise features arrive, REST API goes GA, Instant Clusters in beta, and APAC expansion

  * [CPU Pods get network storage access](/storage/network-volumes): GA support for network volumes on CPU Pods for persistent, shareable storage.
  * **SOC 2 Type I certification**: Independent attestation of security controls for enterprise readiness.
  * [REST API release](/api-reference/overview): REST API GA with broad resource coverage for full infrastructure-as-code workflows.
  * [Instant Clusters](/instant-clusters): Spin up multi-node GPU clusters in minutes with private interconnect and per-second billing.
  * **Bare metal**: Reserve dedicated GPU servers for maximum control, performance, and long-term savings.
  * **AP-JP-1**: New Fukushima region for low-latency APAC access and in-country data residency.
</Update>

<Update label="February 2025">
  ## REST API enters beta with full-time community manager

  * [REST API beta test](/api-reference/overview): RESTful endpoints for Pods, endpoints, and volumes for simpler automation than GraphQL.
  * **Full-time community manager hire**: Dedicated programs, content, and faster community response.
  * [Serverless GitHub integration release](/serverless/workers/github-integration): GA for GitHub-based Serverless deploys with production-ready stability.
</Update>

<Update label="January 2025">
  ## New silicon and LLM-focused Serverless upgrades

  * **CPU Pods v2**: Docker runtime parity with GPU Pods for faster starts, plus network volume support.
  * [H200s on Runpod](/references/gpu-types): NVIDIA H200 GPUs available for larger models and higher memory bandwidth.
  * [Serverless upgrades](/serverless/overview): Higher GPU counts per worker, new quick-deploy runtimes, and simpler model selection.
</Update>

<Update label="November 2024">
  ## Global networking expands and GitHub deploys enter beta

  * [Global networking expansion](/pods/networking): Added to CA-MTL-3, US-GA-1, US-GA-2, and US-KS-2 for expanded private mesh coverage.
  * [Serverless GitHub integration beta test](/serverless/workers/github-integration): Deploy endpoints directly from GitHub repos with automatic builds.
  * **Scoped API keys**: Least-privilege tokens with fine-grained scopes and expirations for safer automation.
  * **Passkey auth**: Passwordless WebAuthn sign-in for phishing-resistant account access.
</Update>

<Update label="August 2024">
  ## Storage expansion and private cross-data-center connectivity

  * [US-GA-2 added to network storage](/storage/network-volumes): Enable network volumes in US-GA-2.
  * [Global networking](/pods/networking): Private cross-data-center networking with internal DNS for secure service-to-service traffic.
</Update>

<Update label="July 2024">
  ## Storage coverage grows with major price cuts and revamped referrals

  * **US-TX-3 and EUR-IS-1 added to network storage**: Network volumes available in more regions for local persistence.
  * **Runpod slashes GPU prices**: Broad GPU price reductions to lower training and inference total cost of ownership.
  * [Referral program revamp](/references/referrals): Updated commissions and bonuses with an affiliate tier and improved tracking.
</Update>

<Update label="May 2024">
  ## \$20M seed round, community event, and broader Serverless options

  * **\$20M seed by Intel Capital and Dell Technologies Capital**: Funds infrastructure expansion and product acceleration.
  * **First in-person hackathon**: Community projects, workshops, and real-world feedback.
  * [Serverless CPU Pods](/references/cpu-types): Scale-to-zero CPU endpoints for services that don't need a GPU.
  * [AMD GPUs](/references/gpu-types): AMD ROCm-compatible GPU SKUs as cost and performance alternatives to NVIDIA.
</Update>

<Update label="February 2024">
  ## CPU compute and first-class automation tooling

  * **CPU Pods**: CPU-only instances with the same networking and storage primitives for cheaper non-GPU stages.
  * [runpodctl](/runpodctl/overview): Official CLI for Pods, endpoints, and volumes to enable scripting and CI/CD workflows.
</Update>

<Update label="January 2024">
  ## Console navigation overhaul and documentation refresh

  * **New navigational changes to Runpod UI**: Consolidated menus, consistent action placement, and fewer clicks for common tasks.
  * **Docs revamp**: New information architecture, improved search, and more runnable examples and quickstarts.
  * **Zhen AMA**: Roadmap Q\&A and community feedback session.
</Update>

<Update label="December 2023">
  ## New regions and investment in community support

  * **US-OR-1**: Additional US region for lower latency and more capacity in the Pacific Northwest.
  * **CA-MTL-1**: New Canadian region to improve latency and meet in-country data needs.
  * **First community manager hire**: Dedicated community programs and faster feedback loops.
  * **Building out the support team**: Expanded coverage and expertise for complex issues.
</Update>

<Update label="October 2023">
  ## Faster template starts and better multi-region hygiene

  * **Serverless quick deploy**: One-click deploy of curated model templates with sensible defaults.
  * **EU domain for Serverless**: EU-specific domain briefly offered for data residency, superseded by other region controls.
  * **Data-center filter for Serverless**: Filter and manage endpoints by region for multi-region fleets.
</Update>

<Update label="September 2023">
  ## Self-service upgrades, clearer metrics, new pricing model, and cost visibility

  * **Self-service worker upgrade**: Rebuild and roll workers from the dashboard without support tickets.
  * **Edit template from endpoint page**: Inline edit and redeploy the underlying template directly from the endpoint view.
  * **Improved Serverless metrics page**: Refinements to charts and filters for quicker root-cause analysis.
  * [Flex and active workers](/serverless/pricing): Always-on "active" workers for baseline load with on-demand "flex" workers for bursts.
  * **Billing explorer**: Inspect costs by resource, region, and time to identify optimization opportunities.
</Update>

<Update label="August 2023">
  ## Team governance, storage expansion, and better debugging

  * [Teams](/get-started/manage-accounts): Organization workspaces with role-based access control for Pods, endpoints, and billing.
  * [Savings plans](/pods/pricing): Plans surfaced prominently in console with easier purchase and management for steady usage.
  * **Network storage to US-KS-1**: Enable network volumes in US-KS-1 for local, persistent data workflows.
  * [Serverless log view](/serverless/development/logs): Stream worker stdout and stderr in the UI and API for real-time debugging.
  * **Serverless health endpoint**: Lightweight `/health` probe returning endpoint and worker status without creating a billable job.
  * **SOC 2 Type II compliant**: Security and compliance certification for enterprise customers.
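
  For the Serverless health endpoint listed above, a probe can be issued with the Python standard library. This sketch assumes the route is served at `https://api.runpod.ai/v2/<endpoint_id>/health` and authenticated with a bearer API key; confirm both against the current API reference. `build_health_request` and `fetch_health` are hypothetical helper names.

  ```python
  import json
  import urllib.request

  API_BASE = "https://api.runpod.ai/v2"  # assumed base URL


  def build_health_request(endpoint_id, api_key):
      """Build a GET request for an endpoint's /health route without sending it."""
      return urllib.request.Request(
          f"{API_BASE}/{endpoint_id}/health",
          headers={"Authorization": f"Bearer {api_key}"},
          method="GET",
      )


  def fetch_health(endpoint_id, api_key, timeout=10):
      """Send the probe and return the decoded JSON status payload."""
      request = build_health_request(endpoint_id, api_key)
      with urllib.request.urlopen(request, timeout=timeout) as response:
          return json.load(response)
  ```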
</Update>

<Update label="June 2023">
  ## Observability, top-tier GPUs, and commitment-based savings

  * **Serverless metrics page**: Time-series charts for pXX latencies, queue delay, throughput, and worker states for faster debugging and tuning.
  * [H100s on Runpod](/references/gpu-types): NVIDIA H100 instances for higher throughput and larger model footprints.
  * [Savings plans](/pods/pricing): Commitment-based discounts for predictable workloads to lower effective hourly rates.
</Update>

<Update label="May 2023">
  ## Smoother auth and multi-region Serverless with persistent storage

  * **The new and improved Runpod login experience**: Streamlined sign-in and team access for faster, more consistent auth flows.
  * [Network volumes added to Serverless](/storage/network-volumes): Attach persistent storage to Serverless workers to retain models and artifacts across restarts and speed cold starts through caching.
  * **Serverless region support**: Pin or allow specific regions for endpoints to reduce latency and meet data-residency needs.
</Update>

<Update label="April 2023">
  ## Deeper autoscaling controls, richer metrics, persistent storage, and job cancellation

  * **Serverless scaling strategies**: Scale by queue delay and/or concurrency with min/max worker bounds to balance latency and cost.
  * **Queue delay**: Expose time-in-queue as a first-class metric to drive autoscaling and SLO monitoring.
  * **Request count**: Track success and failure totals over windows for quick health checks and alerting.
  * **runsync**: Synchronous invocation path that returns results in the same HTTP call for short-running jobs.
  * **Network storage beta**: Region-scoped, attachable volumes shareable across Pods and endpoints for model caches and datasets.
  * **Job cancel API**: Programmatically terminate queued or running jobs to free capacity and enforce client timeouts.
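
  To illustrate the `runsync` and job cancel entries above, the sketch below prepares both requests with the Python standard library. It assumes the routes `POST /v2/<endpoint_id>/runsync` (with the job payload under an `"input"` key) and `POST /v2/<endpoint_id>/cancel/<job_id>` on `https://api.runpod.ai`, plus bearer authentication; verify these against the current API reference. All helper names are hypothetical.

  ```python
  import json
  import urllib.request

  API_BASE = "https://api.runpod.ai/v2"  # assumed base URL


  def _post(url, api_key, payload=None):
      """Build a JSON POST request with bearer authentication (not yet sent)."""
      body = json.dumps(payload if payload is not None else {}).encode("utf-8")
      return urllib.request.Request(
          url,
          data=body,
          headers={
              "Authorization": f"Bearer {api_key}",
              "Content-Type": "application/json",
          },
          method="POST",
      )


  def runsync_request(endpoint_id, api_key, job_input):
      """Request for a synchronous run; the result returns in the same HTTP call."""
      return _post(f"{API_BASE}/{endpoint_id}/runsync", api_key, {"input": job_input})


  def cancel_request(endpoint_id, api_key, job_id):
      """Request that cancels a queued or running job by its ID."""
      return _post(f"{API_BASE}/{endpoint_id}/cancel/{job_id}", api_key)


  def send(request, timeout=60):
      """Send a prepared request and return the decoded JSON response."""
      with urllib.request.urlopen(request, timeout=timeout) as response:
          return json.load(response)
  ```

  A client that enforces its own timeout could call `send(runsync_request(...))` for short jobs and fall back to `send(cancel_request(...))` when it gives up waiting on a longer one.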
</Update>

<Update label="April 1, 2023">
  ## Serverless platform hardens with cleaner API

  * **Serverless API v2**: Revised request and response schema with improved error semantics and new endpoints for better control over job lifecycle and observability.
</Update>

<Update label="February 1, 2023">
  ## Better control over notifications and GPU allocation

  * **Notification preferences**: Configure which platform events trigger alerts to reduce noise for teams and CI systems.
  * **GPU priorities**: Influence scheduling by marking workloads as higher priority to reduce queue time for critical jobs.
</Update>

<Update label="July 1, 2022">
  ## Encrypted volumes for persistent data

  * **Runpod now offers encrypted volumes**: Enable at-rest encryption for persistent volumes with no application changes required using platform-managed keys.
</Update>