September 2025

Slurm Clusters GA, cached models in beta, and new Public Endpoints available

  • Slurm Clusters are now generally available: Deploy production-ready HPC clusters in seconds. These clusters deliver multi-node performance for distributed training and large-scale simulations, with pay-as-you-go billing and no idle costs.
  • Cached models are now in beta: Eliminate model download times when starting workers. The system places cached models on host machines before workers start, prioritizing hosts with your model already available for instant startup.
  • New Public Endpoints available: Wan 2.5 combines image and audio to create lifelike videos, while Nano Banana merges multiple images for composite creations.
August 2025

Hub revenue sharing launches and Pods UI gets refreshed

  • Hub revenue share model: Publish to the Runpod Hub and earn credits when others deploy your repo. Earn up to 7% of compute revenue through monthly tiers with credits auto-deposited into your account.
  • Pods UI updated: Refreshed modern interface for interacting with Runpod Pods.
July 2025

Public Endpoints arrive, Slurm Clusters in beta

  • Public Endpoints: Access state-of-the-art AI models through simple API calls with an integrated playground. Available endpoints include Whisper-V3-Large, Seedance 1.0 Pro, Seedream 3.0, Qwen Image Edit, FLUX.1 Kontext, Deep Cogito v2 Llama 70B, and Minimax Speech.
  • Slurm Clusters (beta): Create on-demand multi-node clusters instantly with full Slurm scheduling support.
June 2025

S3-compatible storage and updated referral program

  • S3-compatible API for network volumes: Upload and retrieve files on your network volumes without attaching compute, using the AWS S3 CLI or Boto3 (see the sketch after this list). Integrate Runpod storage into any AI pipeline with zero-config ease and object-level control.
  • Referral program revamp: Updated rewards and tiers with clearer dashboards to track performance.
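
For context, here is a minimal Boto3 sketch against the S3-compatible API. The endpoint URL, region name, and credential placeholders below are assumptions for illustration; the real values (a per-datacenter endpoint, an S3 API key, and your network volume ID used as the bucket name) come from your Runpod account.

```python
# Minimal sketch: working with a Runpod network volume through the
# S3-compatible API using boto3. Endpoint, region, and credentials below
# are placeholders -- substitute the values from your own account.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3api-EXAMPLE-DATACENTER.runpod.io",  # placeholder endpoint
    aws_access_key_id="YOUR_S3_API_KEY_ID",
    aws_secret_access_key="YOUR_S3_API_KEY_SECRET",
    region_name="EXAMPLE-DATACENTER",
)

NETWORK_VOLUME_ID = "your-network-volume-id"  # acts as the bucket name

# Upload a local model checkpoint onto the volume without starting any compute.
s3.upload_file("model.safetensors", NETWORK_VOLUME_ID, "models/model.safetensors")

# List what is stored on the volume.
for obj in s3.list_objects_v2(Bucket=NETWORK_VOLUME_ID).get("Contents", []):
    print(obj["Key"], obj["Size"])
```
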
May 2025

Port labeling, price drops, Runpod Hub, and Tetra beta test

  • Port labeling: Name exposed ports in the UI and API to help team members identify services like Jupyter or TensorBoard.
  • Price drops: Additional price reductions on popular GPU SKUs to lower training and inference costs.
  • Runpod Hub: A curated catalog of one-click endpoints and templates for deploying community projects without starting from scratch.
  • Tetra beta test: A Python library for running code on GPU with Runpod. Add a @remote() decorator to functions that need GPU power while the rest of your code runs locally.
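
The pattern looks roughly like the sketch below. The import path and decorator arguments are illustrative assumptions, not the exact Tetra API; check the library's documentation for the real signatures and resource configuration.

```python
# Hypothetical sketch of the Tetra pattern: decorate a GPU-bound function
# with @remote() so it runs on Runpod, while the rest of the script runs
# locally. Import path and decorator arguments are assumptions.
from tetra import remote  # assumed import path

@remote(gpu="A100")  # assumed argument name; requests a GPU worker for this function
def matmul_benchmark(n: int) -> float:
    # GPU-heavy work happens remotely on a Runpod worker.
    import torch
    x = torch.randn(n, n, device="cuda")
    return float((x @ x).sum())

# Everything else runs locally; only the decorated call is shipped to a GPU.
print(matmul_benchmark(4096))
```
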
April 2025

GitHub login, RTX 5090s, and global networking expansion

  • Login with GitHub: OAuth sign-in and linking for faster onboarding and repo-driven workflows.
  • RTX 5090s on Runpod: High-performance RTX 5090 availability for cost-efficient training and inference.
  • Global networking expansion: Rollout to additional data centers, approaching full global coverage.
March 2025

Enterprise features arrive, REST API goes GA, Instant Clusters in beta, and APAC expansion

  • CPU Pods get network storage access: GA support for network volumes on CPU Pods for persistent, shareable storage.
  • SOC 2 Type I certification: Independent attestation of security controls for enterprise readiness.
  • REST API release: The REST API is now generally available, with broad resource coverage for full infrastructure-as-code workflows.
  • Instant Clusters: Spin up multi-node GPU clusters in minutes with private interconnect and per-second billing.
  • Bare metal: Reserve dedicated GPU servers for maximum control, performance, and long-term savings.
  • AP-JP-1: New Fukushima region for low-latency APAC access and in-country data residency.
February 2025

REST API enters beta with full-time community manager

  • REST API beta test: RESTful endpoints for Pods, Serverless endpoints, and volumes provide simpler automation than GraphQL (see the sketch after this list).
  • Full-time community manager hire: Dedicated programs, content, and faster community response.
  • Serverless GitHub integration release: GA for GitHub-based Serverless deploys with production-ready stability.
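
A rough sketch of what automation against the REST API can look like, using only the standard requests library. The base URL, resource path, and response shape are assumptions for illustration; confirm them against the current API reference before use.

```python
# Sketch: list Pods via the REST API. Base URL and path are assumptions.
import os
import requests

API_BASE = "https://rest.runpod.io/v1"  # assumed base URL
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

resp = requests.get(f"{API_BASE}/pods", headers=headers, timeout=30)
resp.raise_for_status()
print(resp.json())  # response shape may differ; inspect before parsing further
```
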
January 2025

New silicon and LLM-focused Serverless upgrades

  • CPU Pods v2: Docker runtime parity with GPU Pods, faster starts, and network volume support.
  • H200s on Runpod: NVIDIA H200 GPUs available for larger models and higher memory bandwidth.
  • Serverless upgrades: Higher GPU counts per worker, new quick-deploy runtimes, and simpler model selection.
November 2024

Global networking expands and GitHub deploys enter beta

  • Global networking expansion: Added to CA-MTL-3, US-GA-1, US-GA-2, and US-KS-2 for expanded private mesh coverage.
  • Serverless GitHub integration beta test: Deploy endpoints directly from GitHub repos with automatic builds.
  • Scoped API keys: Least-privilege tokens with fine-grained scopes and expirations for safer automation.
  • Passkey auth: Passwordless WebAuthn sign-in for phishing-resistant account access.
August 2024

Storage expansion and private cross-data-center connectivity

July 2024

Storage coverage grows with major price cuts and revamped referrals

  • US-TX-3 and EUR-IS-1 added to network storage: Network volumes available in more regions for local persistence.
  • Runpod slashes GPU prices: Broad GPU price reductions to lower training and inference total cost of ownership.
  • Referral program revamp: Updated commissions and bonuses with an affiliate tier and improved tracking.
May 2024

$20M seed round, community event, and broader Serverless options

  • $20M seed round led by Intel Capital and Dell Technologies Capital: Funds infrastructure expansion and product acceleration.
  • First in-person hackathon: Community projects, workshops, and real-world feedback.
  • Serverless CPU Pods: Scale-to-zero CPU endpoints for services that don’t need a GPU.
  • AMD GPUs: AMD ROCm-compatible GPU SKUs as cost and performance alternatives to NVIDIA.
February 2024

CPU compute and first-class automation tooling

  • CPU Pods: CPU-only instances with the same networking and storage primitives for cheaper non-GPU stages.
  • runpodctl: Official CLI for Pods, endpoints, and volumes to enable scripting and CI/CD workflows.
January 2024

Console navigation overhaul and documentation refresh

  • New navigational changes to Runpod UI: Consolidated menus, consistent action placement, and fewer clicks for common tasks.
  • Docs revamp: New information architecture, improved search, and more runnable examples and quickstarts.
  • Zhen AMA: Roadmap Q&A and community feedback session.
December 2023

New regions and investment in community support

  • US-OR-1: Additional US region for lower latency and more capacity in the Pacific Northwest.
  • CA-MTL-1: New Canadian region to improve latency and meet in-country data needs.
  • First community manager hire: Dedicated community programs and faster feedback loops.
  • Building out the support team: Expanded coverage and expertise for complex issues.
October 2023

Faster template starts and better multi-region hygiene

  • Serverless quick deploy: One-click deploy of curated model templates with sensible defaults.
  • EU domain for Serverless: EU-specific domain briefly offered for data residency, superseded by other region controls.
  • Data-center filter for Serverless: Filter and manage endpoints by region for multi-region fleets.
September 2023

Self-service upgrades, clearer metrics, new pricing model, and cost visibility

  • Self-service worker upgrade: Rebuild and roll workers from the dashboard without support tickets.
  • Edit template from endpoint page: Inline edit and redeploy the underlying template directly from the endpoint view.
  • Improved Serverless metrics page: Refinements to charts and filters for quicker root-cause analysis.
  • Flex and active workers: Discounted always-on “active” capacity for baseline load with on-demand “flex” workers for bursts.
  • Billing explorer: Inspect costs by resource, region, and time to identify optimization opportunities.
August 2023

Team governance, storage expansion, and better debugging

  • Teams: Organization workspaces with role-based access control for Pods, endpoints, and billing.
  • Savings plans: Plans surfaced prominently in console with easier purchase and management for steady usage.
  • Network storage to US-KS-1: Enable network volumes in US-KS-1 for local, persistent data workflows.
  • Serverless log view: Stream worker stdout and stderr in the UI and API for real-time debugging.
  • Serverless health endpoint: Lightweight /health probe returning endpoint and worker status without creating a billable job (see the sketch after this list).
  • SOC 2 Type II compliant: Security and compliance certification for enterprise customers.
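
A minimal sketch of polling the health probe, assuming the public Serverless URL convention (https://api.runpod.ai/v2/{endpoint_id}/health); confirm the exact path against the Serverless API docs for your endpoint.

```python
# Sketch: check a Serverless endpoint's health without creating a job.
import os
import requests

ENDPOINT_ID = "your-endpoint-id"
url = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/health"  # assumed URL pattern
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

resp = requests.get(url, headers=headers, timeout=10)
resp.raise_for_status()
print(resp.json())  # typically reports worker and job-queue status
```
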
June 2023

Observability, top-tier GPUs, and commitment-based savings

  • Serverless metrics page: Time-series charts for pXX latencies, queue delay, throughput, and worker states for faster debugging and tuning.
  • H100s on Runpod: NVIDIA H100 instances for higher throughput and larger model footprints.
  • Savings plans: Commitment-based discounts for predictable workloads to lower effective hourly rates.
May 2023

Smoother auth and multi-region Serverless with persistent storage

  • The new and improved Runpod login experience: Streamlined sign-in and team access for faster, more consistent auth flows.
  • Network volumes added to Serverless: Attach persistent storage to Serverless workers to retain models and artifacts across restarts and speed cold starts through caching.
  • Serverless region support: Pin or allow specific regions for endpoints to reduce latency and meet data-residency needs.
April 2023

Deeper autoscaling controls, richer metrics, persistent storage, and job cancellation

  • Serverless scaling strategies: Scale by queue delay and/or concurrency with min/max worker bounds to balance latency and cost.
  • Queue delay: Expose time-in-queue as a first-class metric to drive autoscaling and SLO monitoring.
  • Request count: Track success and failure totals over windows for quick health checks and alerting.
  • runsync: Synchronous invocation path that returns results in the same HTTP call for short-running jobs (see the sketch after this list).
  • Network storage beta: Region-scoped, attachable volumes shareable across Pods and endpoints for model caches and datasets.
  • Job cancel API: Programmatically terminate queued or running jobs to free capacity and enforce client timeouts.
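
A sketch of the synchronous runsync call and the job cancel API from this list, assuming the public Serverless URL convention (https://api.runpod.ai/v2/{endpoint_id}/...); verify the paths and payload shape against the current Serverless API reference.

```python
# Sketch: synchronous invocation and job cancellation on a Serverless endpoint.
import os
import requests

ENDPOINT_ID = "your-endpoint-id"
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"  # assumed URL pattern
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# runsync: submit a short job and get the result back in the same HTTP call.
result = requests.post(f"{BASE}/runsync", headers=headers,
                       json={"input": {"prompt": "hello"}}, timeout=120)
result.raise_for_status()
print(result.json())

# Job cancel: terminate a queued or running job by ID (assumed path).
job_id = "some-job-id"
cancel = requests.post(f"{BASE}/cancel/{job_id}", headers=headers, timeout=30)
print(cancel.status_code, cancel.text)
```
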
April 1, 2023

Serverless platform hardens with cleaner API

  • Serverless API v2: Revised request and response schema with improved error semantics and new endpoints for better control over job lifecycle and observability.
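
Complementing the synchronous runsync call shown earlier, here is an illustrative sketch of the asynchronous job lifecycle: submit with /run, then poll /status. Paths, response fields, and status values follow the public API convention and are assumptions here; check the v2 reference for exact schemas.

```python
# Sketch: asynchronous job lifecycle -- submit, then poll until terminal.
import os
import time
import requests

ENDPOINT_ID = "your-endpoint-id"
BASE = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"  # assumed URL pattern
headers = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

# Submit the job; the response is assumed to include a job ID.
job = requests.post(f"{BASE}/run", headers=headers,
                    json={"input": {"prompt": "hello"}}, timeout=30).json()
job_id = job["id"]

# Poll until the job reaches a terminal state (status names are assumptions).
while True:
    status = requests.get(f"{BASE}/status/{job_id}", headers=headers, timeout=30).json()
    if status.get("status") in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
        print(status)
        break
    time.sleep(2)
```
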
February 1, 2023

Better control over notifications and GPU allocation

  • Notification preferences: Configure which platform events trigger alerts to reduce noise for teams and CI systems.
  • GPU priorities: Influence scheduling by marking workloads as higher priority to reduce queue time for critical jobs.
July 1, 2022

Encrypted volumes for persistent data

  • Runpod now offers encrypted volumes: Enable at-rest encryption for persistent volumes using platform-managed keys, with no application changes required.