Manage Serverless endpoints, including creating, listing, updating, and deleting endpoints.
runpodctl serverless <subcommand> [flags]

Alias

You can use sls as a shorthand for serverless:
runpodctl sls list

Subcommands

List endpoints

List all your Serverless endpoints:
runpodctl serverless list

List flags

--include-template
bool
Include template information in the output.
--include-workers
bool
Include workers information in the output.

Get endpoint details

Get detailed information about a specific endpoint:
runpodctl serverless get <endpoint-id>

Get flags

--include-template
bool
Include template information in the output.
--include-workers
bool
Include workers information in the output.

Create an endpoint

Create a new Serverless endpoint from a template or from a Hub repo:
# Create from a template
runpodctl serverless create --name "my-endpoint" --template-id "tpl_abc123"

# Create from a Hub repo
runpodctl hub search vllm   # Find the hub ID
runpodctl serverless create --hub-id cm8h09d9n000008jvh2rqdsmb --name "my-vllm"
When using --hub-id, GPU IDs and container disk size are automatically pulled from the Hub release config. You can override the GPU type with --gpu-id.
Serverless templates vs Pod templates: Serverless endpoints require a Serverless-specific template. Pod templates (like runpod-torch-v21) cannot be used because they include configuration that Serverless does not support. When creating a template with runpodctl template create, use the --serverless flag to create a Serverless template.

Each Serverless template can only be bound to one endpoint at a time. To create multiple endpoints with the same configuration, create a separate template for each endpoint.

Create flags

--name
string
Name for the endpoint.
--template-id
string
Template ID to use (required if --hub-id is not specified). Use runpodctl template search to find templates.
--hub-id
string
Hub listing ID to deploy from (alternative to --template-id). Use runpodctl hub search to find repos.
--gpu-id
string
GPU type for workers. Use runpodctl gpu list to see available GPUs.
--gpu-count
int
default:"1"
Number of GPUs per worker.
--compute-type
string
default:"GPU"
Compute type (GPU or CPU).
--workers-min
int
default:"0"
Minimum number of workers.
--workers-max
int
default:"3"
Maximum number of workers.
--data-center-ids
string
Comma-separated list of preferred datacenter IDs. Use runpodctl datacenter list to see available datacenters.
--network-volume-id
string
Network volume ID to attach. Use runpodctl network-volume list to see available network volumes.
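
Several of the create flags above can be combined in one call. The following is a minimal sketch rather than an authoritative recipe: the endpoint name, template ID, and worker counts are placeholder values, and the `run` wrapper and `DRY_RUN` variable are hypothetical helpers (not part of runpodctl) that print the command so you can review it before executing it.

```shell
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN is set to 0.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder values; find a real template ID with `runpodctl template search`.
run runpodctl serverless create \
  --name "my-endpoint" \
  --template-id "tpl_abc123" \
  --workers-min 1 \
  --workers-max 5
```

With the default `DRY_RUN=1` the script only echoes the assembled command. Setting `DRY_RUN=0` runs it against your account.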

Update an endpoint

Update endpoint configuration:
runpodctl serverless update <endpoint-id> --workers-max 5

Update flags

--name
string
New name for the endpoint.
--workers-min
int
New minimum number of workers.
--workers-max
int
New maximum number of workers.
--idle-timeout
int
New idle timeout in seconds.
--scaler-type
string
Scaler type (QUEUE_DELAY or REQUEST_COUNT).
--scaler-value
int
Scaler value.
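
Because --scaler-type accepts only two values, it can help to check the value before calling update. This is a sketch under stated assumptions: `valid_scaler_type` is a hypothetical helper, the `ENDPOINT_ID` environment variable is an assumed way of passing the endpoint ID, and the timeout and scaler values are arbitrary examples.

```shell
# Hypothetical helper: accept only the scaler types listed in the flag reference.
valid_scaler_type() {
  case "$1" in
    QUEUE_DELAY|REQUEST_COUNT) return 0 ;;
    *) return 1 ;;
  esac
}

SCALER_TYPE="QUEUE_DELAY"  # example value
SCALER_VALUE=4             # example value

if ! valid_scaler_type "$SCALER_TYPE"; then
  echo "unsupported scaler type: $SCALER_TYPE" >&2
elif [ -n "${ENDPOINT_ID:-}" ]; then
  # Only runs when ENDPOINT_ID is set in the environment.
  runpodctl serverless update "$ENDPOINT_ID" \
    --idle-timeout 10 \
    --scaler-type "$SCALER_TYPE" \
    --scaler-value "$SCALER_VALUE"
else
  echo "set ENDPOINT_ID to apply the update" >&2
fi
```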

Delete an endpoint

Delete an endpoint:
runpodctl serverless delete <endpoint-id>

Serverless URLs

Access your Serverless endpoint using these URL patterns:
Operation      URL
Async request  https://api.runpod.ai/v2/<endpoint-id>/run
Sync request   https://api.runpod.ai/v2/<endpoint-id>/runsync
Health check   https://api.runpod.ai/v2/<endpoint-id>/health
Job status     https://api.runpod.ai/v2/<endpoint-id>/status/<job-id>
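
The URL patterns above can be assembled with a small helper. This is a sketch, not an official client: `endpoint_url` is a hypothetical function, and `abc123` and `job-456` are made-up IDs. The gated curl call additionally assumes that the API accepts a bearer token in the Authorization header and a JSON body with an `input` key. This page states neither, so check the API reference before relying on it.

```shell
# Hypothetical helper: build a Serverless URL from an endpoint ID and an
# operation (run, runsync, health, or status/<job-id>).
endpoint_url() {
  printf 'https://api.runpod.ai/v2/%s/%s\n' "$1" "$2"
}

endpoint_url "abc123" "runsync"         # prints https://api.runpod.ai/v2/abc123/runsync
endpoint_url "abc123" "status/job-456"  # prints https://api.runpod.ai/v2/abc123/status/job-456

# Assumed request shape; only runs when both variables are set.
if [ -n "${RUNPOD_API_KEY:-}" ] && [ -n "${ENDPOINT_ID:-}" ]; then
  curl -s -X POST "$(endpoint_url "$ENDPOINT_ID" run)" \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"input": {"prompt": "hello"}}'
fi
```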