Find an endpoint by ID

curl --request GET \
  --url https://rest.runpod.io/v1/endpoints/{endpointId} \
  --header 'Authorization: Bearer <token>'

{
  "allowedCudaVersions": [
    "12.8"
  ],
  "computeType": "GPU",
  "createdAt": "2024-07-12T19:14:40.144Z",
  "dataCenterIds": "EU-NL-1,EU-RO-1,EU-SE-1",
  "env": {
    "ENV_VAR": "value"
  },
  "executionTimeoutMs": 600000,
  "gpuCount": 1,
  "gpuTypeIds": [
    "NVIDIA GeForce RTX 4090"
  ],
  "id": "jpnw0v75y3qoql",
  "idleTimeout": 5,
  "instanceIds": [
    "cpu3c-8-16"
  ],
  "name": "my endpoint",
  "networkVolumeId": "agv6w2qcg7",
  "scalerType": "QUEUE_DELAY",
  "scalerValue": 4,
  "template": {
    "category": "NVIDIA",
    "containerDiskInGb": 50,
    "containerRegistryAuthId": "<string>",
    "dockerEntrypoint": [],
    "dockerStartCmd": [],
    "earned": 100,
    "env": {
      "ENV_VAR": "value"
    },
    "id": "30zmvf89kd",
    "imageName": "runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04",
    "isPublic": false,
    "isRunpod": true,
    "isServerless": true,
    "name": "my template",
    "ports": [
      "8888/http",
      "22/tcp"
    ],
    "readme": "<string>",
    "runtimeInMin": 123,
    "volumeInGb": 20,
    "volumeMountPath": "/workspace"
  },
  "templateId": "30zmvf89kd",
  "userId": "user_2PyTJrLzeuwfZilRZ7JhCQDuSqo",
  "version": 0,
  "workers": [
    {
      "adjustedCostPerHr": 0.69,
      "aiApiId": null,
      "consumerUserId": "user_2PyTJrLzeuwfZilRZ7JhCQDuSqo",
      "containerDiskInGb": 50,
      "containerRegistryAuthId": "clzdaifot0001l90809257ynb",
      "costPerHr": "0.74",
      "cpuFlavorId": "cpu3c",
      "desiredStatus": "RUNNING",
      "dockerEntrypoint": [
        "<string>"
      ],
      "dockerStartCmd": [
        "<string>"
      ],
      "endpointId": null,
      "env": {
        "ENV_VAR": "value"
      },
      "gpu": {
        "id": "<string>",
        "count": 1,
        "displayName": "<string>",
        "securePrice": 123,
        "communityPrice": 123,
        "oneMonthPrice": 123,
        "threeMonthPrice": 123,
        "sixMonthPrice": 123,
        "oneWeekPrice": 123,
        "communitySpotPrice": 123,
        "secureSpotPrice": 123
      },
      "id": "xedezhzb9la3ye",
      "image": "runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04",
      "interruptible": false,
      "lastStartedAt": "2024-07-12T19:14:40.144Z",
      "lastStatusChange": "Rented by User: Fri Jul 12 2024 15:14:40 GMT-0400 (Eastern Daylight Time)",
      "locked": false,
      "machine": {
        "minPodGpuCount": 123,
        "gpuTypeId": "<string>",
        "gpuType": {
          "id": "<string>",
          "count": 1,
          "displayName": "<string>",
          "securePrice": 123,
          "communityPrice": 123,
          "oneMonthPrice": 123,
          "threeMonthPrice": 123,
          "sixMonthPrice": 123,
          "oneWeekPrice": 123,
          "communitySpotPrice": 123,
          "secureSpotPrice": 123
        },
        "cpuCount": 123,
        "cpuTypeId": "<string>",
        "cpuType": {
          "id": "<string>",
          "displayName": "<string>",
          "cores": 123,
          "threadsPerCore": 123,
          "groupId": "<string>"
        },
        "location": "<string>",
        "dataCenterId": "<string>",
        "diskThroughputMBps": 123,
        "maxDownloadSpeedMbps": 123,
        "maxUploadSpeedMbps": 123,
        "supportPublicIp": true,
        "secureCloud": true,
        "maintenanceStart": "<string>",
        "maintenanceEnd": "<string>",
        "maintenanceNote": "<string>",
        "note": "<string>",
        "costPerHr": 123,
        "currentPricePerGpu": 123,
        "gpuAvailable": 123,
        "gpuDisplayName": "<string>"
      },
      "machineId": "s194cr8pls2z",
      "memoryInGb": 62,
      "name": "<string>",
      "networkVolume": {
        "id": "agv6w2qcg7",
        "name": "my network volume",
        "size": 50,
        "dataCenterId": "EU-RO-1"
      },
      "portMappings": {
        "22": 10341
      },
      "ports": [
        "8888/http",
        "22/tcp"
      ],
      "publicIp": "100.65.0.119",
      "savingsPlans": [
        {
          "costPerHr": 0.21,
          "endTime": "2024-07-12T19:14:40.144Z",
          "gpuTypeId": "NVIDIA GeForce RTX 4090",
          "id": "clkrb4qci0000mb09c7sualzo",
          "podId": "xedezhzb9la3ye",
          "startTime": "2024-05-12T19:14:40.144Z"
        }
      ],
      "slsVersion": 0,
      "templateId": null,
      "vcpuCount": 24,
      "volumeEncrypted": false,
      "volumeInGb": 20,
      "volumeMountPath": "/workspace"
    }
  ],
  "workersMax": 3,
  "workersMin": 0
}

GET /endpoints/{endpointId}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

endpointId

string

required

ID of the endpoint to return.

Query Parameters

includeTemplate

boolean

default:false

Include information about the template used to create the endpoint.

Example:

true

includeWorkers

boolean

default:false

Include information about the workers running on the endpoint.

Example:

true
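As a rough illustration of how the authorization header, path parameter, and query parameters fit together, the sketch below builds the request without sending it. The function name `build_get_endpoint_request` is hypothetical, and serializing the booleans as lowercase `true` is an assumption based on the examples shown here.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL taken from the curl example in this reference.
BASE_URL = "https://rest.runpod.io/v1"


def build_get_endpoint_request(endpoint_id, token,
                               include_template=False, include_workers=False):
    """Build (but do not send) the GET request for a single endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    params = {}
    # Assumption: boolean query parameters are sent as lowercase "true".
    # Both default to false, so they are omitted when not requested.
    if include_template:
        params["includeTemplate"] = "true"
    if include_workers:
        params["includeWorkers"] = "true"

    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{BASE_URL}/endpoints/{quote(endpoint_id, safe='')}"
    if params:
        url += "?" + urlencode(params)

    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")
```

To actually send the request, the returned object could be passed to `urllib.request.urlopen` and the body decoded with `json.load`.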

Response

Successful operation.

allowedCudaVersions

enum<string>[]

A list of acceptable CUDA versions for the workers on a Serverless endpoint. If not set, any CUDA version is acceptable.

computeType

enum<string>

The type of compute used by workers on a Serverless endpoint.

Available options:

CPU,

GPU

Example:

"GPU"

createdAt

string

The UTC timestamp when a Serverless endpoint was created.

Example:

"2024-07-12T19:14:40.144Z"

dataCenterIds

enum<string>[]

A list of Runpod data center IDs where workers on a Serverless endpoint can be located.

Example:

"EU-NL-1,EU-RO-1,EU-SE-1"
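The declared type for this field is a list, while the example shows a single comma-separated string. A client that wants to tolerate both shapes might normalize the value as in this minimal sketch; `normalize_data_center_ids` is a hypothetical helper, not part of any Runpod library.

```python
def normalize_data_center_ids(value):
    """Return data center IDs as a list of strings.

    Accepts either a comma-separated string (as in the example above)
    or a list (as in the declared type). Illustrative helper only.
    """
    if value is None:
        return []
    if isinstance(value, str):
        # Split on commas and drop empty or whitespace-only fragments.
        return [part.strip() for part in value.split(",") if part.strip()]
    return [str(item) for item in value]
```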

env

object

Example:

{ "ENV_VAR": "value" }

executionTimeoutMs

integer

The maximum number of milliseconds an individual request can run on a Serverless endpoint before the worker is stopped and the request is marked as failed.

Example:

600000

gpuCount

integer

The number of GPUs attached to each worker on a Serverless endpoint.

Example:

1

gpuTypeIds

enum<string>[]

A list of Runpod GPU types which can be attached to a Serverless endpoint.

id

string

A unique string identifying a Serverless endpoint.

Example:

"jpnw0v75y3qoql"

idleTimeout

integer

The number of seconds a worker on a Serverless endpoint can be running without taking a job before the worker is scaled down.

Example:

5

instanceIds

string[]

For CPU Serverless endpoints, a list of instance IDs that can be attached to a Serverless endpoint.

Example:

["cpu3c-8-16"]

name

string

A user-defined name for a Serverless endpoint. The name does not need to be unique.

Example:

"my endpoint"

networkVolumeId

string

The unique string identifying the network volume to attach to the Serverless endpoint.

Example:

"agv6w2qcg7"

scalerType

enum<string>

The method used to scale up workers on a Serverless endpoint. If QUEUE_DELAY, workers are scaled based on a periodic check to see if any requests have been in queue for too long. If REQUEST_COUNT, the desired number of workers is periodically calculated based on the number of requests in the endpoint's queue. Use QUEUE_DELAY if you need to ensure requests take no longer than a maximum latency, and use REQUEST_COUNT if you need to scale based on the number of requests.

Available options:

QUEUE_DELAY,

REQUEST_COUNT

Example:

"QUEUE_DELAY"

scalerValue

integer

If the endpoint scalerType is QUEUE_DELAY, the number of seconds a request can remain in queue before a new worker is scaled up. If the endpoint scalerType is REQUEST_COUNT, the number of workers is increased as needed to meet the number of requests in the endpoint's queue divided by scalerValue.

Example:

4
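The two scaling rules can be sketched roughly as follows. This is only an illustration of the descriptions above, not Runpod's actual scaler: rounding up, clamping to `workersMin`/`workersMax`, and the function names are all assumptions.

```python
import math


def desired_workers_request_count(queued_requests, scaler_value,
                                  workers_min, workers_max):
    """REQUEST_COUNT: queued requests divided by scalerValue.

    Assumptions: the quotient is rounded up, and the result is clamped
    to the endpoint's workersMin/workersMax.
    """
    if scaler_value <= 0:
        raise ValueError("scaler_value must be positive")
    target = math.ceil(queued_requests / scaler_value)
    return max(workers_min, min(workers_max, target))


def should_scale_up_queue_delay(wait_times_s, scaler_value):
    """QUEUE_DELAY: scale up if any request has waited longer than
    scalerValue seconds (strict comparison is an assumption)."""
    return any(wait > scaler_value for wait in wait_times_s)
```

With the example values on this page (`scalerValue` 4, `workersMin` 0, `workersMax` 3), ten queued requests would give a target of three workers under these assumptions.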

template

object

templateId

string

The unique string identifying the template used to create a Serverless endpoint.

Example:

"30zmvf89kd"

userId

string

A unique string identifying the Runpod user who created a Serverless endpoint.

Example:

"user_2PyTJrLzeuwfZilRZ7JhCQDuSqo"

version

integer

The latest version of a Serverless endpoint, which is updated whenever the template or environment variables of the endpoint are changed.

Example:

0

workers

object[]

Information about current workers on a Serverless endpoint.

workersMax

integer

The maximum number of workers that can be running at the same time on a Serverless endpoint.

Example:

3

workersMin

integer

The minimum number of workers that will run at the same time on a Serverless endpoint. These workers always stay running and are billed even when no requests are being processed, though at a lower rate than running autoscaling workers.

Example:

0
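Once the response body has been decoded from JSON, a client might reduce it to the few fields it cares about. The sketch below is one way to do that, using only field names from the sample response. It assumes that `template` and `workers` may be missing when `includeTemplate` and `includeWorkers` are left at their defaults, and that `costPerHr` may arrive as a string (as in the sample) or a number. `summarize_endpoint` is a hypothetical helper.

```python
def summarize_endpoint(endpoint):
    """Summarize a decoded endpoint response (a dict).

    Illustrative only: tolerates missing "template"/"workers" keys and
    a string-valued "costPerHr".
    """
    workers = endpoint.get("workers") or []
    running = [w for w in workers if w.get("desiredStatus") == "RUNNING"]
    # costPerHr is a string in the sample response, so convert defensively.
    total_cost = sum(float(w.get("costPerHr") or 0) for w in workers)
    return {
        "id": endpoint.get("id"),
        "name": endpoint.get("name"),
        "has_template": endpoint.get("template") is not None,
        "worker_count": len(workers),
        "running_workers": len(running),
        "total_cost_per_hr": total_cost,
        "workers_min": endpoint.get("workersMin"),
        "workers_max": endpoint.get("workersMax"),
    }
```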
