Deployment workflow
After creating your handler function, package it into a Docker image and deploy it to an endpoint:
Create a Dockerfile
Package your handler function and all its dependencies into a Docker image.
Deploy to an endpoint
Push your image and create an endpoint using one of two methods:
- Deploy from Docker Hub: Build locally and push to a container registry.
- Deploy from GitHub: Auto-build and deploy directly from your repository.
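The workflow above starts from a handler function. A minimal sketch of one, assuming the RunPod Python SDK (`runpod` package, `runpod.serverless.start`); the echo logic is purely illustrative, not a real workload:

```python
# handler.py -- minimal serverless handler sketch.
# Assumes the RunPod Python SDK is installed in the image (pip install runpod).

def handler(job):
    """Receive a job dict and return a JSON-serializable result."""
    prompt = job["input"].get("prompt", "")
    return {"echo": prompt.upper()}

def run():
    # Called from the container entrypoint; starts the serverless worker loop.
    import runpod
    runpod.serverless.start({"handler": handler})
```

In the Dockerfile, the container's command would typically invoke this module (for example `python -u handler.py` with `run()` called under a `__main__` guard) so the worker loop starts when the container boots.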
Model deployment
To deploy workers with AI/ML models, follow this order of preference:
- Use cached models: For models on Hugging Face (public or gated), this is the recommended approach. Cached models provide the fastest cold starts and persist across worker restarts.
- Bake the model into your Docker image: For private models not on Hugging Face, embed them directly in your container image. This ensures the model is always available but increases image size.
- Use network volumes: For development workflows or very large models (500GB+), store models on a network volume. This is slower than cached or baked models but offers flexibility for iteration.
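The order of preference above can be sketched as a resolver that checks each location in turn. The concrete paths below are illustrative assumptions for this sketch, not fixed RunPod locations:

```python
import os

def resolve_model_dir(candidates):
    """Return the first existing model directory from a preference-ordered
    list of candidate paths, or None if nothing is found."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None

# Preference order from the list above: cached -> baked into image -> network volume.
# These paths are assumptions for illustration.
PREFERRED = [
    os.path.expanduser("~/.cache/huggingface/hub"),  # cached Hugging Face models
    "/app/models",                                   # baked into the Docker image
    "/runpod-volume/models",                         # network volume
]
```

At worker startup you would call `resolve_model_dir(PREFERRED)` once and load the model from whichever tier is present.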
Worker types
Workers can run in two modes depending on your latency and cost requirements:
- Active workers run continuously (24/7) and are always ready to process requests instantly. They eliminate cold starts entirely and receive a discounted rate, making them ideal for latency-sensitive or high-traffic applications.
- Flex workers scale dynamically based on demand, spinning down to zero when idle. They incur cold starts when scaling up but cost nothing when not in use, making them ideal for variable or sporadic workloads.
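One way to choose between the two modes is a break-even estimate: an active worker bills every hour at a discounted rate, while a flex worker bills only the hours it spends processing. The rates and the discount below are placeholders, not real RunPod pricing:

```python
def monthly_cost_active(hourly_rate, discount, hours=730):
    """Active worker: billed for every hour of the month at a discounted rate."""
    return hourly_rate * (1 - discount) * hours

def monthly_cost_flex(hourly_rate, busy_hours):
    """Flex worker: billed at full rate, but only for hours spent processing."""
    return hourly_rate * busy_hours

def breakeven_utilization(discount):
    """Fraction of the month a flex worker must be busy before an
    always-on active worker becomes the cheaper option."""
    return 1 - discount
```

For example, with a hypothetical 20% active-worker discount, flex stays cheaper until the endpoint is busy more than 80% of the time.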
Worker states
| State | Description | Billing |
|---|---|---|
| Initializing | Downloading image, loading code | Yes |
| Idle | Ready, waiting for requests | No |
| Running | Processing requests | Yes |
| Throttled | Ready but host constrained | No |
| Outdated | Marked for replacement after update | Yes (while processing) |
| Unhealthy | Crashed; auto-retries for up to 7 days | No |
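The billing column above boils down to a simple lookup. A sketch for estimating billable time from a worker's state history; the state names follow the table, while the `(state, seconds)` interval format is a made-up assumption for illustration:

```python
# Whether time in each worker state is billed, per the table above.
# "Outdated" is billed only while the worker is still processing requests.
BILLABLE_STATES = {
    "Initializing": True,
    "Idle": False,
    "Running": True,
    "Throttled": False,
    "Outdated": True,   # while processing
    "Unhealthy": False,
}

def billable_seconds(intervals):
    """Sum the durations of (state, seconds) pairs spent in billed states."""
    return sum(secs for state, secs in intervals if BILLABLE_STATES[state])
```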
Max worker limits
Account balance determines your maximum workers (flex + active combined):
| Balance | Max workers |
|---|---|
| Default | 5 |
| $100+ | 10 |
| $200+ | 20 |
| $300+ | 30 |
| $500+ | 40 |
| $700+ | 50 |
| $900+ | 60 |
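The tiers above map directly to a threshold lookup. A sketch with the tier values copied from the table (the function name is ours):

```python
# Balance thresholds (USD) -> combined flex + active worker cap,
# from the table above, checked from highest tier down.
TIERS = [(900, 60), (700, 50), (500, 40), (300, 30), (200, 20), (100, 10)]

def max_workers(balance):
    """Return the worker cap for a given account balance in USD."""
    for threshold, cap in TIERS:
        if balance >= threshold:
            return cap
    return 5  # default tier
```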
Best practices
| Practice | Benefit |
|---|---|
| Optimize image size | Faster downloads, reduced cold starts |
| Use model caching | Fastest cold starts |
| Test locally first | Catch issues before deployment |
| Use logs and SSH | Debug and optimize effectively |