Selecting the right Pod configuration maximizes performance and cost efficiency for your workload. This guide helps you match your requirements to the right GPU, VRAM, and storage configuration.

Quick selection by workload

Start by identifying your primary workload type:
| Workload | Recommended GPU tier | Minimum VRAM | Notes |
| --- | --- | --- | --- |
| LLM inference (7B–13B params) | Mid-range (RTX 4090, L4) | 24 GB | Sufficient for most quantized models |
| LLM inference (30B–70B params) | High-end (A100, H100) | 48–80 GB | May require multi-GPU setup |
| LLM training/fine-tuning | High-end (A100, H100) | 40–80 GB | Memory bandwidth critical |
| Image generation (SDXL, Flux) | Mid-range (RTX 4090, L4) | 16–24 GB | Benefits from fast inference |
| Computer vision | Entry to mid-range | 8–16 GB | Depends on model and batch size |
| 3D rendering | Mid-range with RT cores | 16–24 GB | RT cores accelerate ray tracing |
| Data processing | CPU-focused or entry GPU | 8 GB+ | Prioritize CPU cores and RAM |
For a full list of available GPUs and their specifications, see GPU types.
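The table above can be encoded as a simple lookup. This is a hypothetical helper for scripting your own selection logic; the workload keys and tier strings are illustrative, not part of any Runpod API.

```python
# Illustrative mapping of workload type to (GPU tier, minimum VRAM in GB),
# taken from the recommendation table above. Not a Runpod API.
RECOMMENDATIONS = {
    "llm-inference-7b-13b": ("Mid-range (RTX 4090, L4)", 24),
    "llm-inference-30b-70b": ("High-end (A100, H100)", 48),
    "llm-training": ("High-end (A100, H100)", 40),
    "image-generation": ("Mid-range (RTX 4090, L4)", 16),
    "computer-vision": ("Entry to mid-range", 8),
    "3d-rendering": ("Mid-range with RT cores", 16),
    "data-processing": ("CPU-focused or entry GPU", 8),
}

def recommend(workload: str) -> str:
    """Return a human-readable recommendation for a known workload key."""
    tier, min_vram = RECOMMENDATIONS[workload]
    return f"{tier}, at least {min_vram} GB VRAM"

print(recommend("llm-inference-7b-13b"))
# Mid-range (RTX 4090, L4), at least 24 GB VRAM
```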

Estimate VRAM requirements

VRAM is the most common bottleneck. Use these guidelines:

For LLMs: Allocate approximately 2 GB of VRAM per billion parameters at 16-bit precision. For example:
  • 7B model → ~14 GB VRAM
  • 13B model → ~26 GB VRAM
  • 70B model → ~140 GB VRAM (requires multi-GPU)
Quantization reduces VRAM requirements significantly. A 4-bit quantized 70B model can run on ~35 GB VRAM.
For image models: SDXL requires ~8 GB minimum, but 16–24 GB provides headroom for larger batch sizes and LoRA training.
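The 2 GB-per-billion-parameters rule can be sketched as a small estimator. This is a back-of-the-envelope calculation for model weights only; it does not account for activation memory or KV-cache overhead, which add to the total in practice.

```python
def estimate_llm_vram_gb(params_billion: float, bits: int = 16) -> float:
    """Rough VRAM estimate for LLM weights.

    Rule of thumb: ~2 GB per billion parameters at 16-bit precision,
    scaled linearly for lower-precision quantization. Activations and
    KV cache are not included.
    """
    return params_billion * 2 * (bits / 16)

print(estimate_llm_vram_gb(7))           # ~14 GB
print(estimate_llm_vram_gb(70))          # ~140 GB (multi-GPU territory)
print(estimate_llm_vram_gb(70, bits=4))  # ~35 GB, 4-bit quantized
```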

Resource calculators

Use these tools to estimate your specific requirements:

Storage configuration

Choose storage based on your data persistence needs:
| Storage type | Persists after stop? | Persists after delete? | Best for |
| --- | --- | --- | --- |
| Container disk | No | No | OS, temporary files |
| Volume disk | Yes | No | Working files, checkpoints |
| Network volume | Yes | Yes | Datasets, model weights, long-term storage |
For data-intensive workloads, ensure sufficient volume disk or network volume capacity for your datasets, model weights, and output files.
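The persistence table above maps cleanly to a decision function. This is an illustrative sketch of the decision logic, not a Runpod API:

```python
def pick_storage(survive_stop: bool, survive_delete: bool) -> str:
    """Map persistence needs to a Pod storage type, per the table above.

    Anything that must outlive the Pod itself needs a network volume;
    anything that only needs to survive a stop fits on a volume disk.
    """
    if survive_delete:
        return "network volume"
    if survive_stop:
        return "volume disk"
    return "container disk"

print(pick_storage(survive_stop=False, survive_delete=False))  # container disk
print(pick_storage(survive_stop=True, survive_delete=False))   # volume disk
print(pick_storage(survive_stop=True, survive_delete=True))    # network volume
```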

Optimize for cost

  1. Right-size your resources: Start with the minimum viable configuration, then scale up based on actual usage. Development and testing often need less power than production.
  2. Use spot instances: For fault-tolerant workloads like batch processing or training with checkpoints, spot instances offer significant savings.
  3. Consider savings plans: For sustained workloads, Runpod’s savings plans reduce costs in exchange for committed usage.

Secure Cloud vs Community Cloud

| | Secure Cloud | Community Cloud |
| --- | --- | --- |
| Infrastructure | T3/T4 data centers | Peer-to-peer providers |
| Reliability | High redundancy | Variable |
| Best for | Production, sensitive data | Cost-sensitive workloads |
| Pricing | Standard | Competitive |
Runpod is no longer accepting new hosts for Community Cloud. Existing Community Cloud resources remain available.

Next steps

  • Deploy a Pod: Create your first Pod with your chosen configuration.
  • GPU types reference: Compare all available GPUs and specifications.
  • Storage options: Learn more about storage types and pricing.
  • Manage Pods: Learn how to create, start, stop, and delete Pods.