Flash provides access to CPU-only compute instances for workloads that don’t require GPU acceleration. This reference lists all available CPU instance types.

Using CPU instances

Specify a CPU instance using the cpu parameter. You can use either a string shorthand or the CpuInstanceType enum:
from runpod_flash import Endpoint, CpuInstanceType

# String shorthand
@Endpoint(name="data-processor", cpu="cpu5c-4-8")
async def process(data: dict) -> dict:
    ...

# Using enum
@Endpoint(name="data-processor", cpu=CpuInstanceType.CPU5C_4_8)
async def process(data: dict) -> dict:
    ...

Available CPU instance types

CPU instances are organized by generation and optimization profile.

5th generation compute-optimized

Latest generation, optimized for compute-intensive workloads:
| CpuInstanceType | ID | vCPU | RAM | Best For |
|---|---|---|---|---|
| CPU5C_1_2 | cpu5c-1-2 | 1 | 2GB | Lightweight APIs, simple tasks |
| CPU5C_2_4 | cpu5c-2-4 | 2 | 4GB | Small APIs, data validation |
| CPU5C_4_8 | cpu5c-4-8 | 4 | 8GB | General APIs, data processing |
| CPU5C_8_16 | cpu5c-8-16 | 8 | 16GB | Heavy processing, parallel tasks |

3rd generation compute-optimized

Earlier generation with a compute-optimized profile:
| CpuInstanceType | ID | vCPU | RAM | Best For |
|---|---|---|---|---|
| CPU3C_1_2 | cpu3c-1-2 | 1 | 2GB | Basic endpoints, webhooks |
| CPU3C_2_4 | cpu3c-2-4 | 2 | 4GB | Simple data processing |
| CPU3C_4_8 | cpu3c-4-8 | 4 | 8GB | Moderate workloads |
| CPU3C_8_16 | cpu3c-8-16 | 8 | 16GB | CPU-intensive tasks |

3rd generation general purpose

Balanced CPU and memory (4GB RAM per vCPU):
| CpuInstanceType | ID | vCPU | RAM | Best For |
|---|---|---|---|---|
| CPU3G_1_4 | cpu3g-1-4 | 1 | 4GB | Memory-light tasks |
| CPU3G_2_8 | cpu3g-2-8 | 2 | 8GB | General workloads |
| CPU3G_4_16 | cpu3g-4-16 | 4 | 16GB | Memory-intensive processing |
| CPU3G_8_32 | cpu3g-8-32 | 8 | 32GB | High-memory workloads |
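Across all three tables, the string shorthand follows one pattern: `cpu<generation><profile>-<vCPU>-<RAM in GB>`. As a sketch of that naming scheme, the hypothetical helper below (not part of the runpod_flash API) decodes a shorthand into its components:

```python
# Hypothetical helper illustrating the instance naming scheme:
# "cpu<generation><profile>-<vCPU>-<RAM GB>", e.g. "cpu5c-4-8".
def parse_cpu_instance(shorthand: str) -> dict:
    family, vcpu, ram = shorthand.split("-")
    return {
        "generation": int(family[3:-1]),  # e.g. 5 for "cpu5c"
        "profile": family[-1],            # "c" = compute-optimized, "g" = general purpose
        "vcpu": int(vcpu),
        "ram_gb": int(ram),
    }

print(parse_cpu_instance("cpu5c-4-8"))
# → {'generation': 5, 'profile': 'c', 'vcpu': 4, 'ram_gb': 8}
```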

Common configurations

APIs and webhooks

# Lightweight API
@Endpoint(name="webhook", cpu="cpu5c-2-4")
async def handle_webhook(data: dict) -> dict:
    ...

# Production API
@Endpoint(name="api", cpu="cpu5c-4-8", workers=(1, 10))
async def handle_request(data: dict) -> dict:
    ...

Data processing

# Light processing
@Endpoint(name="processor", cpu="cpu3g-2-8")  # More RAM per vCPU
async def process(data: dict) -> dict:
    ...

# Heavy processing
@Endpoint(name="heavy-processor", cpu="cpu5c-8-16")
async def heavy_process(data: dict) -> dict:
    ...

Memory-intensive tasks

# High memory requirement
@Endpoint(name="memory-worker", cpu="cpu3g-8-32")  # 8 vCPU, 32GB RAM
async def process_large_data(data: dict) -> dict:
    ...

Load-balanced CPU API

from runpod_flash import Endpoint

api = Endpoint(
    name="cpu-api",
    cpu="cpu5c-4-8",
    workers=(1, 10)
)

@api.post("/process")
async def process(data: dict) -> dict:
    return {"result": "processed"}

@api.get("/health")
async def health():
    return {"status": "ok"}

Container disk sizing

CPU endpoints automatically adjust container disk size based on instance limits:
  • CPU3G and CPU3C instances: vCPU count × 10GB (e.g., 2 vCPU = 20GB)
  • CPU5C instances: vCPU count × 15GB (e.g., 4 vCPU = 60GB)
If you specify a custom size via PodTemplate that exceeds the instance limit, deployment will fail with a validation error.
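The two bullet rules above can be sketched in a few lines. The function name here is hypothetical (there is no such helper in runpod_flash); it just restates the documented per-vCPU multipliers:

```python
# Sketch of the documented container-disk limits, assuming the shorthand
# format "cpu<family>-<vCPU>-<RAM GB>". Not part of the runpod_flash API.
def max_container_disk_gb(shorthand: str) -> int:
    family, vcpu, _ = shorthand.split("-")
    # CPU5C instances get 15GB per vCPU; CPU3C/CPU3G get 10GB per vCPU.
    per_vcpu = 15 if family == "cpu5c" else 10
    return int(vcpu) * per_vcpu

print(max_container_disk_gb("cpu3g-2-8"))  # → 20
print(max_container_disk_gb("cpu5c-4-8"))  # → 60
```

A PodTemplate requesting more than this limit for the chosen instance would fail validation at deployment time.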